- Determine what the average or total value is for some number of observations for a patient (e.g., what is the patient's average blood pressure over the measurement period).
- Counting events (how many times was the patient immunized for flu, or given a foot examination).
- Specifying criteria based on these computations (find patients who have had at least two visits).

I'd already dealt with min/max/first/most recent in last week's post. Counting and computing averages and totals are not terribly hard to implement, but representing the semantics in a way that can be acted on seemed complicated from my reading of the relevant RIM specifications. It turns out it just required a bit of decoding of terms.

There are defined terms for the subsetCode element that support MIN/MAX and FIRST/MOST RECENT, but there didn't appear to be codes to handle total, count and average. The subsetCode element is defined in the RIM as filtering the relationship to the min, max, first, last, most recent, total or average, but the ActRelationshipSubset vocabulary doesn't include any terms that obviously meant total or average. It does include different forms of SUMMARY (including past summary and future summary). Then I recalled that total, average and count are __summary__ statistics. I'm going to use that interpretation in the transformation, and after further reading (on repeatNumber), it turns out to be the correct one.
So, a precondition that asks for a summary of acts meeting a particular criteria will act as if it returns a single act that summarizes the results of the selected acts. The key to understanding what a summary looks like is found on the repeatNumber attribute of the Act class. If you want a count of the number of times something occurred, look for it in repeatNumber.

"On an Act in Event mood, the repeatNumber is usually 1. If greater than 1, the Act represents a summary of several event occurrences occurring over the time interval described by effectiveTime. These occurrences are not otherwise distinguished."

The effectiveTime attribute will be an interval that is the smallest interval containing all intervals of the summarized acts (this is known as the convex hull).
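As an illustration (mine, not from the spec), computing that convex hull just means taking the minimum low and maximum high across the summarized intervals:

```javascript
// Hypothetical helper: compute the convex hull (smallest containing
// interval) of a list of { low, high } intervals, such as the
// effectiveTimes of the summarized acts. Times are plain numbers
// here to keep the sketch simple.
function convexHull(intervals) {
  return intervals.reduce(function (hull, ivl) {
    return {
      low: Math.min(hull.low, ivl.low),
      high: Math.max(hull.high, ivl.high)
    };
  });
}

// e.g. three encounter intervals
var hull = convexHull([
  { low: 3, high: 5 },
  { low: 1, high: 2 },
  { low: 4, high: 9 }
]);
// hull is { low: 1, high: 9 }
```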

It's really hard to put a code on this summary item. The code for a measurement usually implies a particular method of measurement, and that method can also include details like "average over an 8-hour period", et cetera. So we cannot reuse the original code for the item. Ideally, there would be a way to go from the code for X in one code system to another code in that same system meaning "average of X", but in reality you would need a Clinical Terminology Service to handle that for you, and it doesn't seem practical to assume that every implementation of Query Health will have a service with that sort of capability.

You'd think that not being able to "distinguish" summary observations from others would make them difficult to use because you couldn't get to them by their code. That turns out to not be a problem. The summary item is specified as a criterion, and is given a local variable name, so processing can address the items by name.

So, I went back and fixed my implementation on NQF Measure 59, so that it now correctly looks for more than one ambulatory encounter in the time period by adding the lines in bold below:

<entry typeCode="COMP">
  <localVariableName>AmbulatoryEncounter</localVariableName>
  **<subsetCode code="PREVSUM"/>**
  <encounter classCode="ENC" moodCode="EVN" isCriterionInd="true">
    <id root="0" extension="AmbulatoryEncounter"/>
    <code valueSet="2.16.840.1.113883.3.464.1.1142"/>
    <effectiveTime>
      <high nullFlavor="DER">
        <expression>EndDate.add(new PQ(-2,"a"))</expression>
      </high>
    </effectiveTime>
    **<repeatNumber>**
      **<low value="2" inclusive="true"/>**
    **</repeatNumber>**
    <sourceOf typeCode="INST">
      <observation classCode="OBS" moodCode="DEF">
        <id root="0" extension="Encounter"/>
      </observation>
    </sourceOf>
  </encounter>
</entry>

Then I modified the hQuery transformation to handle count and average by changing the processing for preconditions. Instead of generating this:

var denominator = function(patient) {
  return atLeastOne(allOf(HasDiabetes.length != 0, true) &&
    atLeastOne(EDorInpatientEncounter.length != 0,
               AmbulatoryEncounter.length != 0, false),
    DiabetesMedAdministered.length != 0,
    DiabetesMedIntended.length != 0,
    DiabetesMedSupplied.length != 0,
    DiabetesMedOrdered.length != 0, false);
}

It now generates this:

var denominator = function(patient) {
  return atLeastOne(allOf(HasDiabetes.length != 0, true) &&
    atLeastOne(EDorInpatientEncounter.length != 0,
               **AmbulatoryEncounter.length >= 2**, false),
    DiabetesMedAdministered.length != 0,
    DiabetesMedIntended.length != 0,
    DiabetesMedSupplied.length != 0,
    DiabetesMedOrdered.length != 0, false);
}

The trick was to check to see if there was a constraint on repeatNumber in the act, and if so, to apply it to the length of the returned set of acts.
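In sketch form (the function and argument names here are mine, not the actual transform's output), the generated test applies the repeatNumber bounds to the length of the matched set:

```javascript
// Hypothetical sketch of the check the transform emits: given the acts
// matched for a criterion and the repeatNumber bounds from the HQMF
// (low/high, either possibly absent), test the count against them.
function meetsRepeatNumber(acts, low, high) {
  var n = acts.length;
  if (low !== undefined && n < low) return false;
  if (high !== undefined && n > high) return false;
  // With no repeatNumber constraint at all, any non-empty set matches.
  if (low === undefined && high === undefined) return n !== 0;
  return true;
}

// <repeatNumber><low value="2" inclusive="true"/></repeatNumber> becomes:
var twoEncounters = meetsRepeatNumber(["enc1", "enc2"], 2); // true
var oneEncounter  = meetsRepeatNumber(["enc1"], 2);         // false
```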

For the XQuery implementation, I had deferred the matching logic to an XQuery function (as yet unspecified) that performs the work based on the XML in the criterion and the subsetCode. I needed to do nothing there (other than to create the missing matches function).

Finally, in the SQL Implementation, I needed to turn this:

SELECT DISTINCT PATIENTID FROM Encounter
WHERE (
  CODE IN (SELECT CODE FROM VALUESETS
           WHERE ID = '2.16.840.1.113883.3.464.1.1142') AND
  EFFECTIVETIME < DATEADD(yyyy, -2, @EndDate)
)

Into this:

SELECT DISTINCT PATIENTID FROM (
  SELECT PATIENTID, COUNT(ID) AS [COUNT] FROM Encounter
  WHERE (
    CODE IN (SELECT CODE FROM VALUESETS
             WHERE ID = '2.16.840.1.113883.3.464.1.1142') AND
    EFFECTIVETIME < DATEADD(yyyy, -2, @EndDate)
  )
  GROUP BY PATIENTID
) AS Summary
WHERE [COUNT] >= 2

The place where this gets tricky is when I need an average value (for observations). I have to assume (for now) that for all of these implementations the values are normalized to a particular set of units (e.g., blood pressure in millimeters of mercury, temperature in degrees F or C, et cetera). That means a single system cannot have height measured in inches one day and in meters another. For an example, let's assume that the measure said the "average" HbA1C measure is greater than 9%, instead of the most recent. Now I have to compute the average, which means turning this:

SELECT DISTINCT PATIENTID FROM LabResults
WHERE (
  CODE IN (SELECT CODE FROM VALUESETS
           WHERE ID = '2.16.840.1.113883.3.464.1.72') AND
  VALUE > 9 AND UNITS = '%'
)

Into this (but only when the subset is one of the summary codes):

SELECT DISTINCT PATIENTID FROM (
  SELECT PATIENTID, AVG(VALUE) AS VALUE FROM LabResults
  WHERE (
    CODE IN (SELECT CODE FROM VALUESETS
             WHERE ID = '2.16.840.1.113883.3.464.1.72')
    AND UNITS = '%'
  )
  GROUP BY PATIENTID
) AS Summary
WHERE VALUE > 9

This is doable. What it means is that I have to separate the VALUE criteria from the rest of the selection criteria so that I can generate the appropriate query. The further I go with this, the more I realize that SQL really is ugly ;-) and that XSLT works as a transformation language for generating SQL, but isn't the best fit. I'll stick with it though, because it's great for the XQuery implementation and very usable for hQuery, and I get a lot out of using the same transformation language across all three implementations.
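As a rough sketch of that separation (the predicate representation here is hypothetical, not the transform's actual data model), value comparisons get routed to the outer query applied after aggregation, while code and unit tests stay in the inner WHERE clause:

```javascript
// Hypothetical sketch: split a criterion's predicates so that VALUE
// comparisons move to the outer (post-aggregation) query while code
// and unit predicates remain in the inner WHERE clause.
function splitCriteria(predicates) {
  var inner = [], outer = [];
  predicates.forEach(function (p) {
    (p.field === "VALUE" ? outer : inner).push(p);
  });
  return { inner: inner, outer: outer };
}

var split = splitCriteria([
  { field: "CODE",  test: "IN (SELECT CODE FROM VALUESETS ...)" },
  { field: "UNITS", test: "= '%'" },
  { field: "VALUE", test: "> 9" }
]);
// split.inner holds the CODE and UNITS predicates;
// split.outer holds the VALUE predicate for the outer query
```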

Having figured this out, I'm ready to do the correlation study. Remember, I want to correlate incidence of flu with vaccinations and segregate by age. As it turns out, this is too easy. I can code this query as two measures for each age range. The initial patient population selects patients by age range. The denominator for the first measure identifies vaccinated patients, the denominator for the second identifies unvaccinated patients, and the numerators select patients who got the flu. But that's cheating, since it basically means that the query has to specify each of the "cells" completely.

What I really want is an easy way to specify these classifications in the measure and have the query transformation automatically compute the outer-product of the classifications for which results should be reported, and apply them to the measures. I'm going to have to think about that some more. It would seem that this would be an expansion on the way that the initial patient population is described in HQMF, so I'll have to think about how I want to express that.
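For what that outer product might look like (a hypothetical sketch, not anything specified in HQMF), the classification dimensions can be crossed to enumerate the reporting cells:

```javascript
// Hypothetical sketch: compute the outer (Cartesian) product of
// classification dimensions, e.g. age ranges x vaccination status,
// yielding one reporting "cell" per combination.
function outerProduct(dimensions) {
  return dimensions.reduce(function (cells, dim) {
    var next = [];
    cells.forEach(function (cell) {
      dim.forEach(function (value) {
        next.push(cell.concat([value]));
      });
    });
    return next;
  }, [[]]);
}

var cells = outerProduct([
  ["0-18", "19-64", "65+"],
  ["vaccinated", "unvaccinated"]
]);
// 3 age ranges x 2 statuses = 6 cells, e.g. ["0-18", "vaccinated"]
```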