Friday, June 29, 2012

Continuous Variable Measures - The Final Solution?

We are homing in on a model for how to perform Counting in HQMF and QueryHealth, which I discussed earlier this week.  I had been working on the Harmonization proposal for dealing with time relationships, but that got messy.  It turns out there is a whole algebra for dealing with intervals, and the more I looked at it, the closer I got to having my brain explode.

So, I'll go back to that after dinner (because I promised it for tonight), and finish this up first because it is also relevant for harmonization (and my brain will remain intact).

Essentially what we are proposing is a new <MeasureCriteria> element similar to <PopulationCriteria>, <NumeratorCriteria>, and <DenominatorCriteria>.  That element will define the SET of objects over which an aggregation computation is performed.  The computation will be defined within a <MeasureObservation> element that references the <MeasureCriteria> element.

My first crack at what <MeasureObservation> would look like was this:


<MeasureObservationDefinition>
  <id .../>
  <code code='aggregationFunction' codeSystem='...'/>
  <derivationExpr>computation</derivationExpr>
  <sourceOf typeCode='DRIV'>
    <localVariableName>variableNameInComputation</localVariableName>
    <measurePopulationReference>
      <id .../>
    </measurePopulationReference>
  </sourceOf>
</MeasureObservationDefinition>

Inside, the computation could use Simple Math or another expression language, and the expression could be based on a specific <measurePopulationCriteria> defined in the <PopulationCriteriaSection>.

As I looked at this though, I wondered if I really even needed to reference the <measurePopulationCriteria>, because when it is defined, it already has a <localVariableName>.  Would it not in fact be simpler to just say:

<MeasureObservationDefinition>
  <id .../>
  <code code='aggregationFunction' codeSystem='...'/>
  <derivationExpr>computation</derivationExpr>
</MeasureObservationDefinition>

In the HQMF, the local variable for the <measurePopulationCriteria> element is already defined earlier in the same document, and thus could be inferred from the document context.  Channeling Marc's annoyance with unnecessary XML (and my own I might add), I got rid of it.

So the remaining piece here was to define the aggregation functions allowed in this act.  The ones that I can come up with include COUNT, AVERAGE, SUM, MIN, MAX, MEDIAN and MODE.  Everything else can be computed from these.  In fact, AVERAGE is readily computable from SUM and COUNT, but it is done often enough to merit inclusion in the set.  Another issue here though is that we often want to compute several of these results, for example, the AVERAGE, and the range (MIN and MAX), and we might also want to compute STDEV and VARIANCE.  Unfortunately, I can only associate one <code> with the observation.


After a bit of digging around, what I realized was that what I was doing was aggregating, and applying one or more methods during the aggregation.  So now I have a slight variation on the previous model, where code is fixed to AGGREGATE (a new value that I now need to add to ActCode), and the clone name is changed to reflect what this has become.  I've added methodCode to indicate what aggregation methods are to be used, and you can repeat it to indicate that you want to use more than one (e.g., AVERAGE, STDEV and COUNT).

<measureAggregateDefinition>
  <id .../>
  <code code='AGGREGATE' codeSystem='2.16.840.1.113883.5.4'/>
  <derivationExpr>computation</derivationExpr>
  <methodCode code='aggregationFunction' codeSystem='...'
    codeSystemName='ObservationMethodAggregate'/>
  <!-- methodCode may repeat, once per aggregation method -->
</measureAggregateDefinition>
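
To make the repeating methodCode concrete, here is a minimal sketch in Python of how a consumer might read such a definition. It is only an illustration under some assumptions: the fragment omits the id element (its attributes are elided above), leaves the codeSystem as '...', ignores the HL7 v3 namespace a real HQMF document would carry, and the function name read_aggregate_definition is made up for this example. The method codes used are the ones defined in the value set later in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: id omitted, codeSystem left elided, no namespace.
SAMPLE = """
<measureAggregateDefinition>
  <code code='AGGREGATE' codeSystem='2.16.840.1.113883.5.4'/>
  <derivationExpr>computation</derivationExpr>
  <methodCode code='AVERAGE' codeSystem='...'/>
  <methodCode code='STDEV.S' codeSystem='...'/>
  <methodCode code='COUNT' codeSystem='...'/>
</measureAggregateDefinition>
"""


def read_aggregate_definition(xml_text):
    """Return (derivation expression, list of requested method codes)."""
    root = ET.fromstring(xml_text.strip())
    expression = root.findtext("derivationExpr")
    # Each repeated methodCode contributes one aggregation method, in order.
    methods = [m.get("code") for m in root.findall("methodCode")]
    return expression, methods


expression, methods = read_aggregate_definition(SAMPLE)
print(expression, methods)
```

The point is simply that one act carries one derivation expression and any number of methods to apply to its results.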

Now the only thing left to do was define the value set for ObservationMethodAggregate (a value set I just made up to appear in ObservationMethod).  As I was going through this list, I realized that I either needed to define how standard deviation and variance are computed (over a population or a sample), or allow for both methods.  I figured it would be easier to include both, providing greater clarity about what was meant in each code.

| Code | Print Name | Definition |
|---|---|---|
| COUNT | Count | Count of non-null values in the referenced set of values. |
| SUM | Sum | Sum of non-null values in the referenced set of values. |
| AVERAGE | Average | Average of non-null values in the referenced set of values. |
| STDEV.S | Sample Standard Deviation | Standard Deviation of the values in the referenced set of values, computed over a sample of the population. |
| VARIANCE.S | Sample Variance | Variance of the values in the referenced set of values, computed over a sample of the population. |
| STDEV.P | Population Standard Deviation | Standard Deviation of the values in the referenced set of values, computed over the population. |
| VARIANCE.P | Population Variance | Variance of the values in the referenced set of values, computed over the population. |
| MIN | Minima | Smallest of all non-null values in the referenced set of values. |
| MAX | Maxima | Largest of all non-null values in the referenced set of values. |
| MEDIAN | Median | The median of all non-null values in the referenced set of values. |
| MODE | Mode | The most common value of all non-null values in the referenced set of values. |
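
For anyone who wants to see the sample versus population distinction in action, here is a rough sketch that maps each code onto Python's standard statistics module. The mapping itself, the AGGREGATES table, and the aggregate function are my illustration, not part of HQMF. I also assume nulls are dropped for every method, including the standard deviation and variance codes, whose definitions above do not say so explicitly.

```python
import statistics

# Hypothetical mapping from the codes above to Python implementations.
AGGREGATES = {
    "COUNT": len,
    "SUM": sum,
    "AVERAGE": statistics.mean,
    "STDEV.S": statistics.stdev,        # sample: divides by n - 1
    "VARIANCE.S": statistics.variance,  # sample: divides by n - 1
    "STDEV.P": statistics.pstdev,       # population: divides by n
    "VARIANCE.P": statistics.pvariance, # population: divides by n
    "MIN": min,
    "MAX": max,
    "MEDIAN": statistics.median,
    "MODE": statistics.mode,
}


def aggregate(values, methods):
    """Apply each requested method to the non-null values."""
    non_null = [v for v in values if v is not None]  # assumption: nulls dropped
    return {code: AGGREGATES[code](non_null) for code in methods}


results = aggregate(
    [2, 4, 4, 4, 5, 5, 7, 9, None],
    ["COUNT", "AVERAGE", "VARIANCE.P", "VARIANCE.S"],
)
print(results)
```

On these eight non-null values the population variance is 32/8 = 4, while the sample variance is 32/7, which is why the two need separate codes.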

Fortunately, if my memory serves, that leaves me with nothing more to do on this topic (given that I've already updated the R-MIM in the Visio Diagram to support this).

So, how does this address continuous variable measures?  Let's take a simple example:  Average ED visit time.  This is pretty straightforward.

  1. Define a measure over encounters (recall that you need to specify this in the measureAttribute element of the QualityMeasureDocument).
  2. Create an encounterCriteria that selects only ED encounters in the dataCriteriaSection.  
  3. Now create a measurePopulationCriteria with the localVariableName EDVisits in the populationCriteriaSection.
  4. Finally, create the measureAggregateDefinition in the measureObservationsSection, and add this XML inside it:
     <derivationExpr>EDVisits.effectiveTime.high - EDVisits.effectiveTime.low</derivationExpr>
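
The steps above can be sketched in a few lines of Python. This is not how an HQMF engine would evaluate the expression; it only shows the shape of the computation: the derivation expression is evaluated once per encounter in EDVisits, and the aggregation method is then applied to the resulting set. The Interval and Encounter classes and the sample visit times are invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Interval:
    low: datetime
    high: datetime


@dataclass
class Encounter:
    effectiveTime: Interval  # name mirrors the attribute in the expression


def derivation(visit):
    """EDVisits.effectiveTime.high - EDVisits.effectiveTime.low"""
    return visit.effectiveTime.high - visit.effectiveTime.low


start = datetime(2012, 6, 29, 8, 0)
# Made-up ED encounters lasting 60, 90 and 150 minutes.
ed_visits = [
    Encounter(Interval(start, start + timedelta(minutes=m)))
    for m in (60, 90, 150)
]

durations = [derivation(v) for v in ed_visits]
# AVERAGE = SUM / COUNT, computed here over timedelta values.
average = sum(durations, timedelta()) / len(durations)
print(average)
```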

We'll need to state that when AGGREGATE is computed and no aggregation method is specified, the implementation can decide what to compute.  We should suggest that it compute at least the COUNT and SUM, though it may produce other aggregate statistics.
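
As a sketch of that fallback, an implementation might do something like the following. The names DEFAULT_METHODS, resolve_methods and compute are hypothetical, and only COUNT and SUM are wired up to keep the example short.

```python
# Suggested minimum when no methodCode is present (assumption for this sketch).
DEFAULT_METHODS = ("COUNT", "SUM")


def resolve_methods(method_codes):
    """Return the methods to compute, falling back to the suggested minimum."""
    return tuple(method_codes) if method_codes else DEFAULT_METHODS


def compute(values, method_codes=None):
    """Compute the resolved methods over the non-null values."""
    non_null = [v for v in values if v is not None]
    supported = {"COUNT": len, "SUM": sum}
    return {code: supported[code](non_null) for code in resolve_methods(method_codes)}


defaults = compute([3, None, 4])
print(defaults)
```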

Am I done with this?  Probably not, but hopefully enough to get us through the ballot.

OK, off to dinner and then back for some more stuff.
