
Thursday, November 3, 2011

When the XML Sucks

One of the complaints about HL7 is that their XML sucks.  One of the complaints about ebXML RIM specifications is that their XML sucks.  One of the points of GreenCDA is to get to less suckful XML in the content.  There are a lot of other XML schemas that "suck".

The evaluation of "Sucks" is a subjective measure, and is often an assessment made through the eyes of developers who have not yet drunk a particular flavor of Kool-Aid, be it HL7 RIM, ebXML RIM, or whatever.  For most problems, the requirements imposed by model-oriented XML schemas are complex enough that the XML does, in fact, suck for the average developer.  Few real-world problems are truly simple.  Simple XML can probably solve 80% of the problem, but then we get into the dicey parts, and well, that's where solutions (and XML) get complicated.

Hey, even HTML sucks, but we've learned to live with it.

A long time ago, in a place far, far away, I used to write code in a programming language that almost but not quite sucked.  But it had one very important feature, and that was a pre-processor that let you write amazingly complex expressions in simple, easy-to-use terms.  It occurs to me that we could do the same with XML today; we just need to define the preprocessing mechanism.  Let's say that I wanted to take this:


<component>
  <section>
    <code code="57026-7" codeSystem="2.16.840.1.113883.6.1"/>
    <title>Population Criteria Section</title>
    <text>
      This section describes the Initial Patient Population, 
      Numerator, Denominator, Denominator Exceptions, and
      Measure Populations</text>
    <entry>
      <observation classCode='OBS' moodCode='EVN.CRT'>
        <id root="0"/>
        <code code="ASSERTION"
          codeSystem="2.16.840.1.113883.5.4"/>

        <value xsi:type="CD" code="IPP"
          codeSystem="2.16.840.1.113883.5.1063"/>

        <sourceOf typeCode="PRCN">
          <observation classCode="OBS" moodCode="EVN.CRT">
            ...
          </observation>
        </sourceOf>
      </observation>
    </entry>
  </section>
</component>

And turn it into this:


<PopulationCritieriaSection>
  <title>Population Criteria Section</title>
  <text>
    This section describes the Initial Patient Population, 
    Numerator, Denominator, Denominator Exceptions, and
    Measure Populations</text>
  <InitialPatientPopulation>
    <id root="0"/>
    <precondition>
      <observationCriteria>
       ...
      </observationCriteria>
    </precondition>
  </InitialPatientPopulation>
</PopulationCritieriaSection>

This is a pretty simple transformation of hard-to-read XML into something that is pretty readable.  How could I do this in a way that was "standards-compliant"?  The keyword here is transformation.  I need some way to tell the receiver that it needs to transform the XML before using it.  Hey, there's even a standard for that!

What if, at the top of the document, I put in this little piece of XML:

<?xml-stylesheet type='text/xsl' href='http://myxmlisbetter.com/yxs.xsl'?>
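On the receiving side, the first step would be noticing that the processing instruction is there at all. Here is a minimal Python sketch of that check, assuming the instruction uses the standard `xml-stylesheet` target. The sample document and the `stylesheet_hrefs` helper are made up for illustration, and nothing here actually fetches or applies the stylesheet.

```python
import re
from xml.dom import minidom

# Hypothetical "green" document; the root element name is spelled the way
# the stylesheet in this post matches it.
GREEN_DOC = """<?xml-stylesheet type="text/xsl" href="http://myxmlisbetter.com/yxs.xsl"?>
<PopulationCritieriaSection>
  <title>Population Criteria Section</title>
</PopulationCritieriaSection>"""

# The PI body is plain text that merely looks like attributes, so pick the
# name="value" pairs apart with a regular expression.
PSEUDO_ATTR = re.compile(r"""([\w-]+)\s*=\s*(["'])(.*?)\2""")


def stylesheet_hrefs(xml_text):
    """Return the href of every xml-stylesheet PI that precedes or follows the root."""
    doc = minidom.parseString(xml_text)
    hrefs = []
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            attrs = {name: value
                     for name, _quote, value in PSEUDO_ATTR.findall(node.data)}
            if "href" in attrs:
                hrefs.append(attrs["href"])
    return hrefs


print(stylesheet_hrefs(GREEN_DOC))
```

A receiver that already understands the simpler XML could ignore the result; one that does not would use the href to locate the transformation.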

And then included the following in that stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  version="1.0">
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match='PopulationCritieriaSection'>
    <component>
      <section>
        <code code="57026-7" 
          codeSystem="2.16.840.1.113883.6.1"/>
        <xsl:apply-templates/>
      </section>
    </component>
  </xsl:template>
  <xsl:template match='InitialPatientPopulation'>
    <entry>
      <observation classCode='OBS' moodCode='EVN.CRT'>
        <xsl:copy-of select='id'/>
        <code code="ASSERTION"
          codeSystem="2.16.840.1.113883.5.4"/>
        <value xsi:type="CD" code="IPP"
          codeSystem="2.16.840.1.113883.5.1063"/>
        <!-- id was copied above; skip it here so it is not emitted twice -->
        <xsl:apply-templates select="node()[not(self::id)]"/>
      </observation>
    </entry>
  </xsl:template>
  <xsl:template match='precondition'>
    <sourceOf typeCode='PRCN'>
      <xsl:apply-templates/>
    </sourceOf>
  </xsl:template>
  <xsl:template match='observationCriteria'>
    <observation classCode="OBS" moodCode="EVN.CRT">
      <xsl:apply-templates/>
    </observation>
  </xsl:template>
  <xsl:template match="*">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>


If your application didn't understand my better XML, but did understand the geeky original, it could simply regenerate the original XML by applying the transformation (and it does work, I tested it).  BUT, if it did understand the "Better" XML, it could just process that directly.  OK, so this is pretty much the way that "GreenCDA" works today, except that it doesn't put an XML stylesheet processing instruction right up front in the "green" document.

To go a little bit further, let's say I wanted to reuse some content over and over again.  In C (or later derivatives), I would say:
#include "stdio.h"

In XML, I can say (again using a standard):
<xi:include href='mystuff.xml'/>
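To see the include being resolved, here is a small Python sketch using the standard library's `xml.etree.ElementInclude`, which offers limited XInclude support. The contents of `mystuff.xml` and the in-memory `load_fragment` loader are assumptions for the sake of a self-contained example; a real receiver would read the file or URL instead.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical reusable fragment, standing in for the contents of mystuff.xml.
FRAGMENTS = {
    "mystuff.xml": '<id root="0"/>',
}

# The xi prefix must be bound to the XInclude namespace for this to work.
DOC = """<InitialPatientPopulation xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="mystuff.xml"/>
</InitialPatientPopulation>"""


def load_fragment(href, parse, encoding=None):
    """Resolve hrefs from the in-memory table instead of the file system."""
    if parse == "xml":
        return ET.fromstring(FRAGMENTS[href])
    return FRAGMENTS[href]


root = ET.fromstring(DOC)
# Replaces each xi:include child, in place, with the element the loader returns.
ElementInclude.include(root, loader=load_fragment)
print(ET.tostring(root, encoding="unicode"))
```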


So, if you have to deal with XML that sucks, you can fix it up using this technique (at least before you have to release for production).  There are a bunch of challenges in using this technique that I haven't addressed.

  1. Governance:  Who gets to define the transformations that are allowed to be used in an exchange?
  2. Retrieval and security issues: The transformation resource becomes referenced as part of the communication, but its specification as a URL seems to imply that separate retrieval is required (this isn't necessarily the case, but I digress).  The real issue is that if you've gone to the trouble to set up a secure transport, and then throw this into the mix, it messes with things. BTW: There is a way to include the XSLT inline in the original XML using HL7 V3 extensions and XML Fragment identifiers.
  3. Overkill:  For most purposes, XSLT is probably a little bit too powerful.  It certainly does the job, but there is probably a better way to represent the mapping for 80% of the cases (see above on solutions getting tricky after that first 80%).
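
On that last point, one possible shape for a simpler mapping is a plain lookup table. The Python sketch below is an illustration only: the `MAPPING` format (`wrap`, `fixed`, `lead`) and the `expand` function are invented here and come from no standard. It covers the same element names as the stylesheet above.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"
CRITERION = {"classCode": "OBS", "moodCode": "EVN.CRT"}

# Hypothetical mapping format: simple element name -> wrapper elements
# (outermost first), fixed children to add, and "lead" children that must
# come before the fixed ones.
MAPPING = {
    # Spelled the way the stylesheet in this post matches it.
    "PopulationCritieriaSection": {
        "wrap": [("component", {}), ("section", {})],
        "fixed": [("code", {"code": "57026-7",
                            "codeSystem": "2.16.840.1.113883.6.1"})],
    },
    "InitialPatientPopulation": {
        "wrap": [("entry", {}), ("observation", CRITERION)],
        "lead": ("id",),
        "fixed": [
            ("code", {"code": "ASSERTION",
                      "codeSystem": "2.16.840.1.113883.5.4"}),
            ("value", {XSI_TYPE: "CD", "code": "IPP",
                       "codeSystem": "2.16.840.1.113883.5.1063"}),
        ],
    },
    "precondition": {"wrap": [("sourceOf", {"typeCode": "PRCN"})], "fixed": []},
    "observationCriteria": {"wrap": [("observation", CRITERION)], "fixed": []},
}


def expand(green):
    """Recursively rewrite a simple element into its verbose form."""
    rule = MAPPING.get(green.tag)
    if rule is None:
        # Unmapped elements pass through; their children are still expanded.
        out = inner = ET.Element(green.tag, green.attrib)
        out.text = green.text
        lead, fixed = (), []
    else:
        out = inner = None
        for tag, attrs in rule["wrap"]:
            node = ET.Element(tag, attrs)
            if inner is None:
                out = node
            else:
                inner.append(node)
            inner = node
        lead, fixed = rule.get("lead", ()), rule["fixed"]
    children = list(green)
    for child in children:          # children that must precede fixed content
        if child.tag in lead:
            inner.append(expand(child))
    for tag, attrs in fixed:        # boilerplate the simple form leaves out
        ET.SubElement(inner, tag, attrs)
    for child in children:          # everything else, in document order
        if child.tag not in lead:
            inner.append(expand(child))
    return out


GREEN = """<PopulationCritieriaSection>
  <title>Population Criteria Section</title>
  <InitialPatientPopulation>
    <id root="0"/>
    <precondition><observationCriteria/></precondition>
  </InitialPatientPopulation>
</PopulationCritieriaSection>"""

verbose = expand(ET.fromstring(GREEN))
print(ET.tostring(verbose, encoding="unicode"))
```

A table like this cannot compute or branch on anything, which is rather the point: it handles the easy 80% and leaves the rest to something like XSLT.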
Anyway, this was an interesting digression, and gives me an excuse to use simpler XML as I start working through Query Health examples.  If I do it right, I can always transform them back to compliant implementations, and who knows, maybe the improvements I come up with could be used to improve the standard.  After all, HL7 is revamping HQMF, which is great timing for Query Health.