Friday, December 11, 2009

UTC in HL7 Version 3

The timestamp data type is used in a variety of standards to mark the time at which an event occurred.  Most standards (including HL7 Version 3 and W3C XML Schema) rely on ISO 8601 as the base standard, which each then constrains in different ways.

Marc de Graauw asked a question about how one would represent Coordinated Universal Time (UTC) using the HL7 Version 3 standards (see How to express UTC time in TS).  I did a bit of research on this and was somewhat amused by my findings:

The HL7 V3 Datatypes schema allows [0-9]{1,4} for the pattern following the + or - so that doesn't help much.
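To see how permissive that is, here is a minimal sketch that checks candidate time zone suffixes against just the sign-plus-digits part described above. The name is_schema_valid_suffix is my own, and the regular expression covers only the suffix, not the full TS pattern from the schema.

```python
import re

# Assumption: this models only the time zone suffix described in the post
# (a sign followed by 1 to 4 digits), not the complete TS pattern.
TZ_SUFFIX = re.compile(r"[+-][0-9]{1,4}")


def is_schema_valid_suffix(suffix: str) -> bool:
    """Return True if the suffix is a sign followed by one to four digits."""
    return TZ_SUFFIX.fullmatch(suffix) is not None


if __name__ == "__main__":
    for candidate in ["+0", "-00", "+000", "-0000", "+0530", "Z", "+00:00"]:
        print(candidate, is_schema_valid_suffix(candidate))
```

Under this reading, every zero offset from +0 through -0000 passes, while Z and the colon-separated W3C form do not.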

Section 4.2.5.1 of ISO 8601 states:
When it is required to indicate the difference between local time and UTC, the representation of the difference can be expressed in hours and minutes, or hours only. It shall be expressed as positive (i.e. with the leading plus sign [+]) if the local time is ahead of or equal to UTC of day and as negative (i.e. with the leading minus sign [–]) if it is behind UTC of day. The minutes component of the difference may only be omitted if the time difference is exactly an integral number of hours.

The key phrase "ahead of or equal to UTC" indicates that +00 or +0000 are the only ways to represent UTC other than Z. I know that zero is neither positive nor negative, but those terms refer to the leading + or - sign. The statement "equal to UTC" is what makes the point: -0000 isn't valid (according to 8601).
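My reading of that rule can be sketched as a small formatter. It assumes the offset is given as whole minutes relative to UTC and emits the basic (no colon) form; the function name format_utc_offset is hypothetical and is not taken from any of the standards.

```python
def format_utc_offset(offset_minutes: int) -> str:
    """Format an offset from UTC in ISO 8601 basic form (+hh or +hhmm).

    Assumption: offset_minutes is local time minus UTC, in whole minutes.
    """
    # "Ahead of or equal to UTC" takes the plus sign, so zero is always "+".
    sign = "+" if offset_minutes >= 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    # Minutes may be omitted only for an integral number of hours.
    if minutes == 0:
        return f"{sign}{hours:02d}"
    return f"{sign}{hours:02d}{minutes:02d}"


if __name__ == "__main__":
    for offset in (0, 60, -300, 330, -210):
        print(offset, format_utc_offset(offset))
```

A formatter written this way can never produce -00 or -0000, which is the point of the "equal to UTC" wording.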

Standards using 8601 disagree: 
The W3C use of 8601 in XML Schema recognizes +00:00, -00:00 and Z as legal representations of UTC, with Z being the canonical representation. See http://www.w3.org/TR/xmlschema-2/#dateTime

Abstract Datatypes Release 1 and 2 say pretty much the same thing for the literal form of a time stamp:
In the modern Gregorian calendar (and all calendars where time of day is based on UTC), the calendar expression may contain a time zone suffix. The time zone suffix begins with a plus (+) or minus (-) followed by digits for the hour and minute cycles. UTC is designated as offset "+00" or "-00"; the ISO 8601 and ISO 8824 suffix "Z" for UTC is not permitted.

The ITS: XML Datatypes, Release 1 specification has nothing to say on the matter other than by reference to Abstract Datatypes.

Pragmatically, any user of HL7 V3 schemas should recognize any of +0, -0, +00, -00, +000, -000, +0000 and -0000 as a UTC time zone, but should only record UTC as +00 or +0000 (my own preference). These are all legal representations of time zones using the HL7 TS data type according to the (non-normative) schemas provided by the XML ITS.
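
The "accept any, write one" advice above might look like the following sketch. The helper names and the choice of +0000 as the default written form are my own assumptions; nothing here comes from an HL7 library.

```python
import re

# Any sign followed by one to four zeros is treated as UTC on input.
_ZERO_OFFSET = re.compile(r"[+-]0{1,4}")


def is_utc_suffix(suffix: str) -> bool:
    """Return True for +0, -0, +00, -00, +000, -000, +0000 and -0000."""
    return _ZERO_OFFSET.fullmatch(suffix) is not None


def normalize_utc_suffix(suffix: str, canonical: str = "+0000") -> str:
    """Rewrite any zero offset to one canonical form; leave others alone.

    Assumption: the canonical form is +0000 (or +00 if the caller prefers).
    """
    return canonical if is_utc_suffix(suffix) else suffix


if __name__ == "__main__":
    for candidate in ["+0", "-00", "+000", "-0000", "-0500", "+0100"]:
        print(candidate, "->", normalize_utc_suffix(candidate))
```

Passing canonical="+00" gives the shorter form instead; non-zero offsets such as -0500 pass through unchanged.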
So, there you have it.
 
Keith
 
P.S.  This is book fodder...

4 comments:

  1. Keith,

    I like your blog article -- succinct and correct as usual! But there are still a few holes ...


    CASE 1: NTP time (where local timezone is unknown) -- maybe we should use the "Z" suffix for that, implying a globally correct NTP timestamp referenced to UTC but where the local time zone is unknown.

    CASE 2: Time and timezone are both known, and you are in London, in which case you would use +00 or +0000.

    We need to unambiguously encode the two cases above.

    Another thing that is bugging me is why leap seconds are prohibited in W3C Schema (where you can have a 60th second in a minute, rather than the usual 0-59 seconds in a minute). W3C Schema does not allow a 60th second, whereas 8601 and NTP both support a 60th second in the minute when a leap-second is inserted. In other words, W3C Schema is incapable of representing these points in time that actually do exist in a purely isochronous timeline such as atomic clock time (e.g. TAI, GPS, etc.). [Maybe this could be fixed in W3C Schema 1.1]

    Best regards,

    Paul Schluter
    GE Healthcare

  2. AFAIK (in HL7 datatypes R1) Timezones always have to be specified using 4 digits.

  3. The XML ITS says: In short, the syntax is "YYYYMMDDHHMMSS.UUUU[+|-ZZzz]" where digits can be omitted from the right side to express less precision.

    It doesn't specify where digits can be omitted from (the time or time zone). ISO 8601 allows them to be omitted from either. Datatypes-base.xsd that ships with CDA allows for 1 to 4 digits of time zone to be specified.

    Best practice would be to use all four, but it appears to me that the standard allows for fewer.

  4. Note that RFC 2822 (the email standard) uses timezone offset -0000 to indicate local time.

    DICOM requires a 4 digit timezone offset, and forbids -0000.
