Tuesday, December 31, 2013

On Codes

Recently, a request came in to change some of the codes for an HL7 Vocabulary (specifically the < and > codes in ObservationInterpretation). The complaint was that these characters must be escaped in XML.  

This spawned the usual deluge of e-mails about the proper way to generate identifiers for codes.  According to the erudite vocabularists, the best practice for managing code systems is to use meaningless (semantically void) code values, and everyone agrees.  Well sure it is, when you have thousands of codes and nobody is ever expected to interface with the code system directly through code values.  But when you have a small code set, or one like ICD where people are expected to work with these identifiers directly, mnemonic, or at least easy-to-remember, code identifiers make sense, at least for the people who have to use them.

The only absolute I ever learned that hasn't failed me yet is that there are no absolutes.


P.S.  I've never had to escape those codes, or any other special character for that matter, when creating XML.  If you have to worry about XML syntax in your code, you are doing it wrong.  Use a tool; don't write the markup by hand.
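
To make the P.S. concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element name interpretationCode and the helper function names are illustrative assumptions rather than anything taken from an HL7 toolkit; the only point is that the library escapes the < and > code values when it writes them and restores them when it reads them back.

```python
import xml.etree.ElementTree as ET


def interpretation_element(code: str) -> ET.Element:
    """Build an element carrying the code as an attribute.

    The element name is a hypothetical stand-in for illustration.
    """
    elem = ET.Element("interpretationCode")
    elem.set("code", code)  # pass the raw value; no manual escaping
    return elem


def to_xml(elem: ET.Element) -> str:
    """Serialize the element; the library escapes special characters."""
    return ET.tostring(elem, encoding="unicode")


def round_trip(code: str) -> str:
    """Write the code out as XML, parse it back, and return the value."""
    xml_text = to_xml(interpretation_element(code))
    return ET.fromstring(xml_text).get("code")


if __name__ == "__main__":
    for code in ("<", ">", "N"):
        print(to_xml(interpretation_element(code)), "->", round_trip(code))
```

The calling code only ever sees the raw code values. The escaped forms exist only in the serialized text, which is the library's business and not yours.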


  1. Use a tool, and when enough people use that tool, the tool will improve. Something I have noticed in passing is that tooling and standards develop in parallel, and there is a time lag between SDO work, Recommended status and tooling support. That lag leads to misalignment between XML tooling and HL7-specific tooling, for instance. I'm not sure whether this will get better or worse with JSON added to the mix.

    This is not just misalignment between XML and HL7 v3 tooling; for instance XSLT 3.0, XPath 3.0 and XSD 1.1 have arrived now, but we will continue to use whatever is supported in the browser or by default in our adaptation layer or XML editors, which is short-sighted.

  2. So, what is going on here? Are some people just writing markup and strings to a stream, and complaining about the necessity of properly escaping the content of the strings? Haven't they learned to use one of the excellent general-purpose XML libraries available, let alone HL7-specific tooling?

    This reminds me of discussions a few years ago between Tim Berners-Lee and people at Google, notably Peter Norvig, concerning the difficulties in realizing the Semantic Web. The Google partisans reminded Berners-Lee that they deal on a daily basis with webmasters who don't know how, or can't be bothered, to configure a web server properly.

    And then there's Cory Doctorow's famous essay Metacrap: Putting the torch to seven straw-men of the meta-utopia. Whether or not you completely buy into his cynicism (or simple realism) about the difficulties, the psycho-socio-economic facts of the matter are impossible to ignore.

    Of course, as a societal phenomenon this is not so different from the problems that have plagued the practice of medicine and public health since they began, both internally (e.g., precautions against infection in obstetrics and surgery) and externally (e.g., inadequacies of public sanitation, avoidable health risks in the workplace, air and water pollution).