Monday, October 31, 2011

Declarative vs. Procedural and QueryHealth

A couple of weeks ago at the S&I Framework Face to Face, the Query Health Technical workgroup was asked to pick an implementation to focus on. I liked hQuery, but one of its challenges is that its queries are defined procedurally (in JavaScript). In chatting with Marc Hadley, one of the developers, it became pretty clear that they are using "template" programming to generate the procedural code.
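To make the "template" idea concrete, here is a minimal sketch of generating a procedural population function from a small declarative description. This is not hQuery's actual generator: the criterion object, its field names, and generatePopulationSource are all hypothetical, chosen only to show the shape of the approach.

```javascript
// Hypothetical declarative criterion; the field names are illustrative only.
const criterion = { property: "age", operator: ">=", value: 64 };

// Fill a text template to produce the source of a population function.
function generatePopulationSource(c) {
  const allowedOperators = [">=", ">", "<=", "<"];
  if (!allowedOperators.includes(c.operator)) {
    throw new Error("Unsupported operator: " + c.operator);
  }
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(c.property)) {
    throw new Error("Invalid property name: " + c.property);
  }
  if (typeof c.value !== "number") {
    throw new Error("Value must be a number");
  }
  return (
    "function population(patient) {\n" +
    "  return (patient." + c.property + "(start) " + c.operator + " " + c.value + ");\n" +
    "}"
  );
}

const source = generatePopulationSource(criterion);

// Turn the generated text into a callable function.
// `start` (the measurement start date) is supplied explicitly here as an assumption.
const start = new Date(2011, 0, 1);
const population = new Function("start", source + "\nreturn population;")(start);
```

The point of the sketch is that the procedural code is an output, so the real definition of the query already lives in some declarative form upstream of it.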

In my update on Query Health from the Face to Face, Rich Elmore asks:
Would be interested in your thoughts on the differences between procedural and declarative approaches as outlined in the hQuery Summer Concert presentation (see charts 15 - 16 http://wiki.siframework.org/file/view/hQuery+Summer+Concert+Presentation.pdf )
One of the tasks I took on last week was to start looking at how to implement the query declaratively, using the HQMF format.  The slides that Rich is referring to compare this JavaScript:

function population(patient) {
  return (patient.age(start)>=64);
}

To this chunk of (reported to be) HQMF:

<entry typeCode="DRIV">
  <observation classCode="OBS" moodCode="EVN.CRT"
    isCriterionInd="true">

    <id root="4AAEF95D-DCC6-459C-839C-C820DF310D60"/>
    <code code="ASSERTION" codeSystem="2.16.840.1.113883.5.4"/>
    <value xsi:type="CD" code="IPP"
      codeSystem="2.16.840.1.113883.5.1063"
      codeSystemName="HL7 Observation Value"

      displayName="Initial Patient Population"/>
    <sourceOf typeCode="PRCN">
      <conjunctionCode code="AND"/>
        <act classCode="ACT" moodCode="EVN"
          isCriterionInd="true">

          <templateId root="2.16.840.1.113883.3.560.1.25"/>
          <id root="52A541D7-9C22-4633-8AEC-389611894672"/>
          <code code="45970-1" displayName="Demographics"
            codeSystem="2.16.840.1.113883.6.1"/>
          <sourceOf typeCode="COMP">
            <observation classCode="OBS"
              moodCode="EVN" isCriterionInd="true">

              <code code="2.16.840.1.113883.3.464.0001.14"
                displayName="birth date HL7 Code List"/>
              <title>Patient characteristic: birth date</title>
              <sourceOf typeCode="SBS">
                <pauseQuantity xsi:type="IVL_PQ">
                  <low value="64" unit="a" inclusive="true"/>
                </pauseQuantity>
                <observation classCode="OBS" moodCode="EVN">
                  <id 
                   root="F8D5AD22-F49E-4181-B886-E5B12BEA8966"/>
                  <title>Measurement period</title>
                </observation>
              </sourceOf>
            </observation>
          </sourceOf>
        </act>
      </sourceOf>
  </observation>
</entry>


Admittedly, the XML is hideously ugly.  I took my own crack at generating it because frankly, I could barely understand what the above said.  And for good reason, because it doesn't actually follow HQMF.  For one, the data criteria go into the "Data Criteria Section".  


Here is an example of a data criterion stating that the patient is 64 years old or older:
<observation classCode="OBS" moodCode="EVN.CRT">
  <id root="42e2aef0-73c4-11de-8a39-0800200c9a66"/>
  <code code="424144002" codeSystem="2.16.840.1.113883.6.96" 
    displayName="Age"/>
  <value xsi:type="IVL_PQ">
    <low value="64" unit="a" inclusive="true"/>
  </value>
  <participant typeCode="SBJ">
    <role classCode="PAT"/>
  </participant> 
</observation>

That's actually pretty easy (for me) to interpret, but could be a little easier (or perhaps the right word is "greener").  I borrowed some of the Green C32 Schema for result and recast it with a slightly different set of assumptions:


<resultCriteria>
  <resultID root="42e2aef0-73c4-11de-8a39-0800200c9a66"/>
  <resultType code="424144002" codeSystem="2.16.840.1.113883.6.96" displayName="Age"/>
  <resultValue>
    <physicalQuantityInterval unit="a" low="64" />
  </resultValue>
</resultCriteria>


Hmm, arguably not much better, and I don't see the patient referent.

Another thing that bugs me in the XML is that the identifiers for the data criteria are useless for content creators (they use OIDs or GUIDs, which are impossible to remember).  If you create a criterion for patients over 64 years of age in one place, and want to reference it from somewhere else WITHIN the same XML, ideally, you'd use ID and IDREF with meaningful names.

In the above example, it might appear as:

<resultCriteria ID='patientsOlderThan64'>
  <resultType code="424144002" codeSystem="2.16.840.1.113883.6.96" displayName="Age"/>
  <resultValue>
    <physicalQuantityInterval unit="a" low="64" />
  </resultValue>
</resultCriteria>
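A rough sketch of what named references buy you when processing such a document: criteria are looked up by a readable ID, and a dangling reference fails loudly instead of silently. The object shapes here (dataCriteria, allOf) are hypothetical stand-ins for the XML, not part of HQMF or the Green C32 schema.

```javascript
// Criteria keyed by a human-readable ID (hypothetical structure standing in for the XML).
const dataCriteria = {
  patientsOlderThan64: { type: "Age", unit: "a", low: 64 },
};

// A population definition that points at criteria by name, much like an IDREF.
const initialPatientPopulation = { allOf: ["patientsOlderThan64"] };

// Replace each named reference with the criterion it points to.
function resolveCriteria(population, criteria) {
  return population.allOf.map(function (ref) {
    if (!Object.prototype.hasOwnProperty.call(criteria, ref)) {
      throw new Error("Unresolved criterion reference: " + ref);
    }
    return criteria[ref];
  });
}

const resolved = resolveCriteria(initialPatientPopulation, dataCriteria);
```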


I also looked at the same set using the S&I Framework Clinical Information Model and SQL.  Here is a sample query that selects all the PatientInformation rows that qualify:

SELECT * FROM PatientInformation WHERE DATEADD(Year, 64, DateOfBirth) <= GETDATE()

OK, so I cheated a little. I used the date of birth and computed the age. This is what most systems would probably need to do anyway. [Note: One alternative would be for the system to pre-compute a "virtual" observation of the age based on the patient's birth date.]
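For systems that store only a birth date, the age computation might look something like the following sketch. The function names are my own, and the reference date is passed in explicitly on the assumption that a measure would evaluate age as of some fixed point (such as the measurement period) instead of "now".

```javascript
// Age in completed years on a given reference date.
function ageInYears(dateOfBirth, asOf) {
  let age = asOf.getFullYear() - dateOfBirth.getFullYear();
  const hadBirthdayThisYear =
    asOf.getMonth() > dateOfBirth.getMonth() ||
    (asOf.getMonth() === dateOfBirth.getMonth() &&
      asOf.getDate() >= dateOfBirth.getDate());
  if (!hadBirthdayThisYear) {
    age -= 1;
  }
  return age;
}

// The criterion from this post: 64 years old or older.
function isAtLeast64(dateOfBirth, asOf) {
  return ageInYears(dateOfBirth, asOf) >= 64;
}
```

A pre-computed "virtual" age observation would simply be the result of ageInYears stored alongside the patient's other observations.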

I'm still not satisfied with any of these declarative formats. The SQL isn't quite right. It selects from a table of patient information, when what I really want is a table of patient identifiers. The HQMF is still a little too dense.

And the procedural stuff won't fly in the real world, simply because it requires a particular implementation technology (JavaScript) that won't be readily applicable to all systems.  I know that there are JavaScript implementations for just about everything, but procedural is just the wrong way to go here.  It doesn't optimize for the ways that different systems want to work unless they all happen to use map/reduce, which isn't the case.
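As a small illustration of why declarative helps here, the same criterion can be handed to two very different back ends. In this sketch (every name is hypothetical), one system renders it as SQL text in the style of the query above, while another evaluates it in memory; neither is forced into the other's execution model.

```javascript
// One declarative criterion (hypothetical shape): patient is at least this many years old.
const ageCriterion = { minAgeYears: 64 };

// Back end 1: a relational system renders the criterion as SQL text.
// Table and column names follow the sample query in this post.
function toSql(c) {
  if (!Number.isInteger(c.minAgeYears)) {
    throw new Error("minAgeYears must be an integer");
  }
  return (
    "SELECT * FROM PatientInformation WHERE DATEADD(Year, " +
    c.minAgeYears +
    ", DateOfBirth) <= GETDATE()"
  );
}

// Back end 2: an in-memory system turns the criterion into a predicate.
function toPredicate(c) {
  return function (patient, asOf) {
    // Date on which the patient reaches the minimum age.
    const cutoff = new Date(patient.dateOfBirth.getTime());
    cutoff.setFullYear(cutoff.getFullYear() + c.minAgeYears);
    return cutoff <= asOf;
  };
}

const sqlText = toSql(ageCriterion);
const qualifies = toPredicate(ageCriterion);
```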

I'll have to spend more time on this tomorrow looking at it from a different perspective.

2 comments:

  1. A couple of big issues I see with HQMF have to do with items such as code sets and templates. If you look at the example in the presentation PDF you linked to, there is a section comparing the numerator of both hQuery and HQMF. Both have to deal with what is considered to be a valid representation of a Pneumococcal Vaccination. I believe the information for this may be stated in the NQF measure. hQuery embeds the codes; HQMF gives you an OID.

    CDA specs also have a penchant for using templates, with those templates being declared in prose in some document somewhere. Anyone who has dealt with CDA and template identifiers like this knows that what is declared in those documents has ramifications for the data structures they are used in.

    Now, couple these two items together and my guess is that code sets and all kinds of other information will be stated in the documents that describe the templates. At this point you're in CDA land, and any real hope of having something that can generically compile an arbitrary HQMF document is gone. If I have to read through an HQMF document and figure out where various things for code sets are defined in order to implement the query, then it's a miserable failure.

  2. See my next post for a discussion on value sets. These things are reusable resources that should be harmonized across quality measures, and so should be referenced in the measures. Ideally, there should be a reference to the value set.
