Wednesday, November 23, 2011

"Greening" the HQMF

So my next project is making HQMF easier to create and read.  I started with an HQMF I wrote to support NQF Measure 59, Poor A1C Control.

The first set of steps simplifies the document in ways that ensure compliance with the HL7 tooling.  Subsequent steps will "green" it in ways that current tooling doesn't support.

Named Sections
The first step was to name the required and optional sections in the XML.  I started with something like this:

    <section>
      <code code="34089-3" codeSystem="2.16.840.1.113883.6.1"/>
      <title>Measure Description Section</title>
      <text>This is a description of the measure.</text>
    </section>

And modified it to something like this:

    <MeasureDescriptionSection>
      <title>Measure Description Section</title>
      <text>This is a description of the measure.</text>
    </MeasureDescriptionSection>

The code element for the section can be fixed, and thus omitted from the XML, since it is implied by the section name.  So, I added MeasureDescriptionSection, DataCriteriaSection, PopulationCriteriaSection and MeasureObservationSection elements.
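Going back from the named form to the generic one is mechanical.  Here is a minimal Python sketch of that expansion, assuming the generic wrapper is a plain <section> element and ignoring XML namespaces for brevity.  Only the Measure Description code appears in this post, so the lookup table has a single entry; the function name and the table are illustrative, not part of any HL7 tooling.

```python
import xml.etree.ElementTree as ET

# Fixed code implied by each named section element.
# Only the Measure Description code is given in the post; the codes for
# the other named sections would have to come from the HQMF specification.
SECTION_CODES = {
    "MeasureDescriptionSection": ("34089-3", "2.16.840.1.113883.6.1"),
}


def expand_named_section(green: ET.Element) -> ET.Element:
    """Turn a named section element back into a generic <section>."""
    code, code_system = SECTION_CODES[green.tag]
    section = ET.Element("section")
    ET.SubElement(section, "code", {"code": code, "codeSystem": code_system})
    # Reuse the title, text and any other children unchanged.
    section.extend(list(green))
    return section


green = ET.fromstring(
    "<MeasureDescriptionSection>"
    "<title>Measure Description Section</title>"
    "<text>This is a description of the measure.</text>"
    "</MeasureDescriptionSection>"
)
print(ET.tostring(expand_named_section(green), encoding="unicode"))
```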

Component and Definition Relationships
Within the DataCriteriaSection, there were a lot of <sourceOf typeCode='COMP'> elements.  I reduced that to <component>.  There were also a lot of <sourceOf typeCode='INST'> elements, which I reduced to <definition>.

Within each <component> there were numerous criteria, expressed as HL7 acts:
<act classCode='ACT' moodCode='EVN' isCriterionInd='true'> 
<observation classCode='OBS' moodCode='EVN' isCriterionInd='true'> 
<supply classCode='SPLY' moodCode='EVN' isCriterionInd='true'>
<substanceAdministration classCode='SBADM' moodCode='EVN' isCriterionInd='true'>
<procedure classCode='PROC' moodCode='EVN' isCriterionInd='true'>
<encounter classCode='ENC' moodCode='EVN' isCriterionInd='true'>

These were simplified (using defaults for classCode and moodCode) to:

Measure Parameters
Some of the data criteria were actually measure parameters.  These were in event mood, but did not have isCriterionInd set to true.  These were always observations (the only HL7 Act having a value).  I realized also that these aren't just components, but are specifically control variables for the measure, and so there is a better act relationship: Has Control Variable.  In the HQMF XML, this would change typeCode from COMP to CTRLV, and in these cases, component became controlVariable.

So this:

      <entry typeCode="COMP">
        <observation classCode="OBS" moodCode="EVN">
          <code code="52832-3" codeSystem="2.16.840.1.113883.6.1"/>
          <value xsi:type="TS" value="20100101"/>
        </observation>
      </entry>

Became this:

      <controlVariable>
        <observationParameter>
          <code code="52832-3" codeSystem="2.16.840.1.113883.6.1"/>
          <value xsi:type="TS" value="20100101"/>
        </observationParameter>
      </controlVariable>

Next were references to criteria and definitions.  The same XML was used as for components and definitions, but they included only a single <id> element that pointed to the actual act being referenced.  These became:

Precondition Conjunctions
Preconditions can be joined with conjunctions, specified in the conjunctionCode element.  There are three different types:  AND, OR, and XOR.  I added ANDprecondition, ORprecondition and XORprecondition elements which fixed the value of conjunctionCode to AND, OR and XOR respectively.
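Because the conjunction is now carried by the element name, a transform can restore the conjunctionCode element without any other information.  The Python sketch below shows one way that might look; it assumes the expanded form is a <precondition> element with <conjunctionCode> as its first child, and it ignores namespaces.  The function name is mine.

```python
import xml.etree.ElementTree as ET

# Element name -> fixed conjunctionCode value, as described above.
CONJUNCTIONS = {
    "ANDprecondition": "AND",
    "ORprecondition": "OR",
    "XORprecondition": "XOR",
}


def expand_conjunctions(elem: ET.Element) -> None:
    """Rewrite named preconditions in place, children first."""
    for child in elem:
        expand_conjunctions(child)
    code = CONJUNCTIONS.get(elem.tag)
    if code is not None:
        elem.tag = "precondition"
        # Restore the fixed value as an explicit first child.
        elem.insert(0, ET.Element("conjunctionCode", {"code": code}))


doc = ET.fromstring(
    "<ORprecondition><ANDprecondition/><XORprecondition/></ORprecondition>"
)
expand_conjunctions(doc)
print(ET.tostring(doc, encoding="unicode"))
```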

Named Criteria

Several of the criteria in PopulationCriteriaSection and MeasureCriteriaSection are identified using HQMF specified codes, including the initial patient population, the numerator, denominator, denominator exceptions, measure criteria, and classifiers I created yesterday.  So, I created named model elements to represent those, enabling me to fix (and thus drop) the code and value elements from these observations:

Everything I've done thus far is supported by the HL7 modeling tools (as far as I know), and most of it is consistent with the HL7 methodology.  At this stage, the HQMF file is reduced by about 10% by line count, and 30% by file size, just to give some metrics supporting how much simplification has occurred.

I suspect that I'll get a little bit of grief about "fixing" element values, but from a modeling perspective, it is certainly comprehensible.  The next set of changes goes one step further, and is more in line with "Green" CDA.  In this, I start to combine act relationships and acts when it makes sense, and restructure the XML in ways that are transformable to the HL7 representation, but are not consistent with the current methodology.

Local Variable Names
There are a double dozen (or more) variable names associated with each precondition.  Since these are simple strings, I just made the localVariableName a "name" attribute on the element to which it applied.  I can now make these be of the ID type to ensure that variable names are unique.
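To illustrate the move from element to attribute, here is a small Python sketch.  It assumes localVariableName carries its value as text content and ignores namespaces; the uniqueness check stands in for what declaring the attribute as xs:ID would give you from a schema validator.  The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def green_local_names(root: ET.Element) -> list[str]:
    """Move each <localVariableName> child into a name attribute.

    Returns the names in document order; raises if a name repeats.
    """
    seen: list[str] = []
    # Snapshot the elements first, since the tree is modified below.
    for parent in list(root.iter()):
        child = parent.find("localVariableName")
        if child is None:
            continue
        name = (child.text or "").strip()
        if name in seen:
            raise ValueError(f"duplicate local variable name: {name}")
        seen.append(name)
        parent.set("name", name)
        parent.remove(child)
    return seen


doc = ET.fromstring(
    "<precondition><localVariableName>HasDiabetes</localVariableName>"
    "<observation><localVariableName>A1C</localVariableName></observation>"
    "</precondition>"
)
print(green_local_names(doc))
print(ET.tostring(doc, encoding="unicode"))
```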

Act Relationships merged with Acts
Sections are always components of the document.  For the named sections, I dropped the component element, as it is implied by the named section element.

The observationParameter element always appears inside a controlVariable element.  So, I simply dropped the observationParameter element and moved its children inside the controlVariable element.

The various criterion elements are always components, so I moved the component children down into the criterion elements, and dropped the component element.

The definition elements are also implied by their content, so I dropped those.  All definition and criteria reference elements just contain a single id element pointing back to the definition or criteria in the measure criteria section.  So, I shifted the id attributes (root and extension) to the reference element itself.  So this:

<observationCriterionRef>
  <id root="0" extension="ageBetween17and64"/>
</observationCriterionRef>

Became this:

<observationCriterionRef root="0" extension="ageBetween17and64"/>
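As a rough sketch of that shift, assuming the reference element wraps exactly one <id> child and ignoring namespaces, the transform could look like this in Python (the function name is illustrative):

```python
import xml.etree.ElementTree as ET


def inline_reference_id(ref: ET.Element) -> None:
    """Copy root/extension from the nested <id> onto the reference itself."""
    id_elem = ref.find("id")
    if id_elem is None:
        return  # already in the compact form
    for attr in ("root", "extension"):
        value = id_elem.get(attr)
        if value is not None:
            ref.set(attr, value)
    ref.remove(id_elem)


ref = ET.fromstring(
    '<observationCriterionRef>'
    '<id root="0" extension="ageBetween17and64"/>'
    '</observationCriterionRef>'
)
inline_reference_id(ref)
print(ET.tostring(ref, encoding="unicode"))
```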

ID/IDREF or Global Identifiers?

I debated whether I should use ID/IDREF for these references or not.  HL7 act references are done by the instance identifier for the act, and these are globally unique.  When using ID/IDREF, the identifiers are only unique within the document.  Observation criteria references clearly point back inside the measure definition document, and ID/IDREF would allow the schema to ensure that the criteria point back to the appropriate thing.  But it wasn't clear to me whether that would also be true for definitions, because one could refer to a definition that was defined outside of the scope of the HQMF document.  If I shifted to IDREF for pointing, the definition reference could just be a single attribute on the precondition in which it appeared.  Since I was using local variables extensively, I realized that there was actually a lot of value to this.  Then I also realized that there is a URI representation of the II data type, so I could actually use key and keyref to support BOTH, and that decided it for me.
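The reason key/keyref can cover both cases is that, unlike ID/IDREF, the key values don't have to be NCNames, so a short local name and a URI-style identifier can sit side by side.  The Python sketch below mimics what a schema validator would check: keys are unique and every reference resolves.  It assumes keys live in a name attribute and references in a ref attribute; the observationCriterion and definition element names and the urn:uuid form are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def dangling_references(root: ET.Element) -> list[str]:
    """Return ref values that do not match any declared name.

    Raises ValueError if two elements declare the same name (key violation).
    """
    keys: set[str] = set()
    for elem in root.iter():
        name = elem.get("name")
        if name is None:
            continue
        if name in keys:
            raise ValueError(f"duplicate key: {name}")
        keys.add(name)
    return [
        elem.get("ref")
        for elem in root.iter()
        if elem.get("ref") is not None and elem.get("ref") not in keys
    ]


doc = ET.fromstring(
    "<measure>"
    '<observationCriterion name="HasDiabetes"/>'
    '<definition name="urn:uuid:c75181d0-73eb-11de-8a39-0800200c9a66"/>'
    '<observationCriterionRef ref="HasDiabetes"/>'
    '<observationCriterionRef ref="HasGestationalDiabetes"/>'
    "</measure>"
)
print(dangling_references(doc))
```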

I had a couple cases where I had an act precondition that simply served as a grouper for other preconditions.  In that case, clearly I could drop that act precondition, so I did.

For others, what I realized was that a precondition was either a reference to an atomic criterion, or a collection of other preconditions using the AND/OR/XOR conjunctions.  All the preconditions of the same type could have their references merged into a single precondition element.  Since the criteria could be referenced using IDREFS, I could also merge them via an attribute.  But I forgot about negation on preconditions.  So, I crossed precondition types with negationInd on precondition and came up with the following new names:

It was tempting to use IDREFS with these values as attribute names, but I needed to be able to combine preconditions [e.g., to handle (A AND B) OR (C AND NOT(D))], so I left the preconditions as elements.

I found a number of cases where I had a single <AllTrue> element appearing with a single criteria reference inside it.  That wasn't worth maintaining, so I dropped the wrapping element.

In the Classifier criteria entries, the criteria would always show up inside an <OnlyOneTrue> element.  I could also safely drop that.
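To show why keeping the preconditions as nestable elements pays off, here is a minimal Python sketch that evaluates a nested precondition tree against the set of criteria a patient satisfies.  AllTrue and OnlyOneTrue are element names used in this post; AtLeastOneTrue and AllFalse are assumed names for the OR and negated cases, and the ref attribute follows the observationCriterionRef examples.  The evaluator itself is only an illustration, not how any existing implementation works.

```python
import xml.etree.ElementTree as ET


def holds(node: ET.Element, facts: set[str]) -> bool:
    """Evaluate a precondition element against the criteria known to be true."""
    if node.tag == "observationCriterionRef":
        return node.get("ref") in facts
    results = [holds(child, facts) for child in node]
    if node.tag == "AllTrue":
        return all(results)
    if node.tag == "AllFalse":        # assumed name: none of the children hold
        return not any(results)
    if node.tag == "AtLeastOneTrue":  # assumed name: plain OR
        return any(results)
    if node.tag == "OnlyOneTrue":
        return sum(results) == 1
    raise ValueError(f"unknown precondition element: {node.tag}")


# (A AND B) OR (C AND NOT(D))
rule = ET.fromstring(
    "<AtLeastOneTrue>"
    '<AllTrue><observationCriterionRef ref="A"/>'
    '<observationCriterionRef ref="B"/></AllTrue>'
    '<AllTrue><observationCriterionRef ref="C"/>'
    '<AllFalse><observationCriterionRef ref="D"/></AllFalse></AllTrue>'
    "</AtLeastOneTrue>"
)
print(holds(rule, {"C"}), holds(rule, {"C", "D"}))
```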

Definition references were singular in each criterion (when present), so I could drop the reference and use a definition attribute to point to the applicable model definition defined in the measure, following the same pattern as criteria references.

In shifting to this key/keyref strategy, the definition reference element names were dropped in favor of a single definition attribute on the criterion.

The DenominatorExceptionCriteria now looks like this:

      <observationCriterionRef ref="HasPolycysticOvaries"/>
        <observationCriterionRef ref="HasDiabetes"/>
    <observationCriterionRef ref="HasSteroidInducedDiabetes"/>
    <observationCriterionRef ref="HasGestationalDiabetes"/>

Either the patient has PolycysticOvaries and not Diabetes, or they have Steroid Induced or Gestational Diabetes.  That's a lot simpler to read than:

<observation classCode="OBS" moodCode="EVN" isCriterionInd="true">
  <id root="c75181d0-73eb-11de-8a39-0800200c9a66"/>
  <code code="ASSERTION" codeSystem="2.16.840.1.113883.5.4"/>
  <value xsi:type="CD" code="DENEXCEP" codeSystem="2.16.840.1.113883.5.1063"
    codeSystemName="HL7 Observation Value"
    displayName="Excluded from Denominator"/>
  <sourceOf typeCode="PRCN">
    <conjunctionCode code="OR"/>
    <act classCode="ACT" moodCode="EVN" isCriterionInd="true">
      <sourceOf typeCode="PRCN">
        <conjunctionCode code="AND"/>
        <observation moodCode="EVN" classCode="OBS" isCriterionInd="true">
          <id root="0" extension="HasPolycysticOvaries"/>
        </observation>
      </sourceOf>
      <sourceOf typeCode="PRCN" negationInd="true">
        <conjunctionCode code="AND"/>
        <observation moodCode="EVN" classCode="OBS" isCriterionInd="true">
          <id root="0" extension="HasDiabetes"/>
        </observation>
      </sourceOf>
    </act>
  </sourceOf>
  <sourceOf typeCode="PRCN">
    <conjunctionCode code="OR"/>
    <observation moodCode="EVN" classCode="OBS" isCriterionInd="true">
      <id root="0" extension="HasSteroidInducedDiabetes"/>
    </observation>
  </sourceOf>
  <sourceOf typeCode="PRCN">
    <conjunctionCode code="OR"/>
    <observation moodCode="EVN" classCode="OBS" isCriterionInd="true">
      <id root="0" extension="HasGestationalDiabetes"/>
    </observation>
  </sourceOf>
</observation>


There are some common patterns for representing the medication participant in substanceAdministration and supply criteria.  The participant is either a product or consumable, and the role is therapeutic substance or manufactured material.  For the most part, what we care about is the code.  So, I've simplified both of these to <medication> and moved the attributes of the code on the entity up to that element.

Thus far, the greening reduces the line count by about 55% and file size by 65%.  You can get to the before, middle and after examples in this zip file (on Google Docs).

Additional Work
There are a couple more refinements I'd make here, but these are just thoughts that I haven't executed on yet:

<value> elements in the criteria have to specify a type.  That's error prone in implementations because type names have to be specified by namespace.  I'd prefer to see element names like valueTS, valuePQ and valueCD so that implementations don't have to check type in complex ways.  We don't need every data type as a choice for this, because CE derives from CD, et cetera.  In criteria, valueTS and valuePQ would use the IVL_TS and IVL_PQ data types respectively, because criteria specify boundaries.

IVL_PQ has a couple of different ways to specify the units.  From a best practices perspective, I hate intervals where the lower bound is specified in a different unit than the upper bound.  It makes for error-prone implementations.  So, I'd drop the unit from the high/low components of the IVL_PQ type, and shift it to the parent element, forcing implementers to use a single unit attribute.
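Here is a rough Python sketch of what hoisting the unit might look like.  It assumes the interval is expressed with low and high children that each carry value and unit attributes, uses the valuePQ element name proposed above, and leaves out the xsi:type attribute and namespaces to keep the example short.  Nothing here has been run against real measures; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def hoist_unit(value: ET.Element) -> ET.Element:
    """Build a <valuePQ> with a single unit attribute from an interval <value>."""
    bounds = [b for b in (value.find("low"), value.find("high")) if b is not None]
    units = {b.get("unit") for b in bounds if b.get("unit") is not None}
    if len(units) > 1:
        # Mixed units are exactly the case the simplification rules out.
        raise ValueError(f"mixed units in interval: {sorted(units)}")
    green = ET.Element("valuePQ")
    if units:
        green.set("unit", units.pop())
    for b in bounds:
        # Keep every attribute of the bound except its unit.
        attrs = {k: v for k, v in b.attrib.items() if k != "unit"}
        ET.SubElement(green, b.tag, attrs)
    return green


interval = ET.fromstring(
    '<value><low value="17" unit="a"/><high value="64" unit="a"/></value>'
)
print(ET.tostring(hoist_unit(interval), encoding="unicode"))
```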

I note that expressions of time in the queries are often in relationship to the start or end date.  There's some more opportunity for simplification there.

Also, the fact that expressions have to have a nullFlavor="DER" attribute is another place where simplification can occur. And I still need to deal with expression syntax.

And then I need to show that the same implementations I've done previously will run over this XML (which they will, I just need to prove it).


  1. Keith,
    Nice work on this!
    What can be done about the aspects of the HQMF spec that substitute unformatted text for structured XML -- use of derivationExpr, and the use of English language in the header to describe aggregation, etc?


  2. I recently opened a bug report referring to this blog post, here:

    Apparently, the HQMF files being distributed by USHIK resemble the starting format described here more than the actual final format described by R2.

    Any pointers or suggestions about how to handle this would be greatly appreciated.