Saturday, June 30, 2012

An Odd-Essay through Time

I've been working on cleaning up HL7's temporal vocabulary for HQMF.  The problem is with boundary conditions as they relate to time relationships.  We have two boundaries, the start and end of an event, and three comparison operations (less than, equal to and greater than), which gets me to twelve different vocabulary terms.  That gives me all the atoms needed, right?

Wrong.  We are dealing with intervals, and intervals can be open or closed.  While < and = together make up <=, we cannot use two vocabulary terms to specify a SINGLE relationship.  So in order to describe the <= relationship between two items, we actually need five different comparators: <, <=, =, >= and >.  This is still NOT a set closed under negation, because we are missing <> (or != if you prefer that notation).

So we now have two choices from the source of the relationship, either the start or the end, six comparison operators, and two choices for the target of the relationship (again, start or end) = 2 x 6 x 2 = 24 different terms.
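As a sanity check, this enumeration can be sketched in a few lines of Python (the two-letter boundary labels and term names like S<E are placeholders, not actual HL7 codes):

```python
from itertools import product

# Candidate terms: {source boundary} x {comparator} x {target boundary}.
boundaries = ["S", "E"]                        # Start, End
comparators = ["<", "<=", "=", ">=", ">", "!="]
terms = [f"{src}{op}{tgt}" for src, op, tgt in product(boundaries, comparators, boundaries)]

print(len(terms))  # 2 x 6 x 2 = 24
```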

One of my desires is to avoid complexity for implementers.  Twenty-four vocabulary terms seems to be too complex.  Do we really need to support >=, <= and != ?  After all, these are just the same as NOT <, NOT > and NOT =, and we could achieve NOT with the negationInd attribute in the RIM.  But using negationInd doesn't make it any simpler, and in fact, divides the problem up across two different attributes (negationInd and typeCode).  We already have existing vocabulary supporting >= and <=, because we have terms for OVERLAP, which needs to deal with these operations.

So, I decided to look at it a different way.  In a time range, there are five discrete parts.  Before the start, the start, after the start and before the end, the end, and after the end.  We need to be able to talk about any of these five parts in sequence.  If we label the parts A, B, C, D and E, we need to be able to talk about:

  1. A
  2. AB
  3. ABC
  4. ABCD
  5. ABCDE
  6. B
  7. BC
  8. BCD
  9. BCDE
  10. C
  11. CD
  12. CDE
  13. D
  14. DE
  15. E
The range identified by ABCDE is "all time" and so can be dropped.  It is unnecessary because all acts are related to all other acts with respect to all time, and so this is a meaningless relationship.
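The fifteen labeled ranges are just the contiguous substrings of ABCDE, which a quick sketch confirms; dropping ABCDE leaves 14:

```python
# Enumerate every contiguous run of the five parts A..E.
parts = "ABCDE"
ranges = [parts[i:j] for i in range(len(parts)) for j in range(i + 1, len(parts) + 1)]

print(len(ranges))                                # 15 ranges in all
print(len([r for r in ranges if r != "ABCDE"]))   # 14 once "all time" is dropped
```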

I'm going to reorganize the remaining ranges, and show how they relate to temporal comparisons in the following diagram:

Each of these defines a time range related to a target act.  We can indicate that the start of the source act occurs within each of these ranges, or the end of the source act occurs within them to define some useful relationships.  The names of these relationships would be something like SAS or EAS to represent Start after Start, or End after Start.  That's nice because those are already HL7 vocabulary terms.  So, now we can apply S and E to each of the above, and get 28 relationships.  Hmm, this is a dead end isn't it.  After all, didn't we want something smaller than 24?

It gets worse.  If you've been swift, you might also note that there are some tests where you want to test both start and end of the source act.  The existing OVERLAP, DURING and CONCURRENT are examples of temporal relationships that do this.  So now we are up to 31 or more.  Yech.

I'm sorry to report, it doesn't get better (at least yet).  After some more digging, I found a paper (which I should have looked for first).  James F. Allen published a paper in 1983 that reports that there are 2^13 = 8192 possible relationships that can be described between two definite intervals.  Dr. Thomas Alspaugh provides a great explanation of Allen's paper.  You should probably read that summary to understand the rest of this post.

Dr. Alspaugh explains that there are 13 basic relationships between two intervals.  These 13 basic relationships are distinct (meaning each can be distinguished from the others), exhaustive (because the relationship between any two intervals A and B can always be identified as one of these patterns), and purely qualitative.
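A rough Python sketch of a classifier for the 13 basic relations (the relation names follow Allen's paper; the intervals are assumed proper, i.e. start strictly before end):

```python
def allen_relation(a, b):
    """Return the Allen basic relation of interval a relative to interval b."""
    a1, a2 = a
    b1, b2 = b
    if a2 < b1:  return "before"
    if b2 < a1:  return "after"
    if a2 == b1: return "meets"
    if b2 == a1: return "met-by"
    if a1 == b1 and a2 == b2: return "equals"
    if a1 == b1: return "starts" if a2 < b2 else "started-by"
    if a2 == b2: return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2: return "during"
    if a1 < b1 and b2 < a2: return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"

print(allen_relation((1, 3), (4, 6)))   # before
print(allen_relation((1, 4), (4, 6)))   # meets
print(allen_relation((2, 5), (4, 8)))   # overlaps
print(allen_relation((3, 4), (2, 6)))   # during
```

The 8192 figure follows because a general (possibly indefinite) relation is any subset of these 13 mutually exclusive basics, and there are 2^13 subsets.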

Where things get interesting in Allen's algebra is when Alspaugh produces a table that shows what happens when you "compose" relationships.  Composition describes how to compute the relationship r.s between A and C, when A r B and B s C.  As it turns out, there are 27 "composite" relations when you perform composition over the original set of 13 basic relationships.

So, I looked over the 27 different relationships, and this is what I found:

  1. One of the resulting relationships (full) is true for all intervals, and so is not worth addressing.
  2. Ten of the relationships already existed in the HL7 ActRelationshipTemporallyPertains Value Set.
  3. Only one of the relationships in the HL7 ActRelationshipTemporallyPertains Value set (Ends After Start) doesn't appear in the list of 27 relationships generated through composition (its inverse doesn't appear in either place).
Then I went back to the original discussion to see whether, using this method, I'd been able to match the requirements.  There are two terms in NQF's Quality Data Model which aren't covered in the current vocabulary.

Starts before or during
A relationship in which the source act's effective time starts before the start of the target or starts during the target’s effective time. An Act is defined by HL7 as: “A record of something that is being done, has been done, can be done, or is intended or requested to be done.”

  • A pacemaker is present at any time starts before or during the measurement period: [Diagnosis active: pacemaker in situ] starts before or during [measurement period]
  • A condition [diagnosis] that starts before or during [measurement end date] means the diagnosis occurred any time before the measurement end date, including the possibility that the diagnosis was established on the measurement end date itself.

Ends before or during
A relationship in which the source act terminates before the target act terminates.

  • To state that intravenous anticoagulant medication is stopped before inpatient hospital discharge: [Medication administered: anticoagulant medication (route = IV)] ends before or during [Encounter: encounter inpatient]

The challenge with these two is that they alter the meaning of during as used in "Starts Before or During".  According to HL7, DURING as a vocabulary term means wholly contained within the time period of the target.  So, Starts During would mean that the start time is bounded by the range of (target.start through target.end), using the non-inclusive forms of the boundaries.

What NQF did was redefine during so that an event (e.g., pacemaker present or diagnosis occurred) would be considered to be in the measure period even if the event occurred on 20121231.  Why?  Because, like just about everyone else (including me until Grahame corrected us all), we didn't know how to record the time boundary correctly.  Remember, the proper way to bound a time expression lasting one year is [20120101, 20130101).  This means: starting on January 1st of 2012 (inclusive), up to, but not including, January 1st of the following year.
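That half-open convention is easy to get right in code; here is a minimal sketch using Python dates:

```python
from datetime import date

# Half-open representation of calendar year 2012: [2012-01-01, 2013-01-01)
start, end = date(2012, 1, 1), date(2013, 1, 1)

def in_period(d):
    # start is inclusive, end is exclusive -- no off-by-one-day gap or overlap
    return start <= d < end

print(in_period(date(2012, 12, 31)))  # True: the last day of the year is in
print(in_period(date(2013, 1, 1)))    # False: the boundary belongs to the next year
```

Consecutive periods built this way tile time exactly: each instant falls in precisely one period.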

So, let's go back and fix the definition of Starts before or During.  With the boundary handled correctly, it is simply "Starts before End."  Surely that code is present?  Actually, it isn't.  And similarly, Ends before or during becomes Ends before End (EBE).

And so, the two codes we need to add to the HL7 Vocabulary to support everything that's been asked for in HQMF are SBE (Starts before End) and EBE (Ends before End).  Which puts us back to the original 12 operators that I started with in this post.  And the realization that there are 8180 more relationships that we cannot handle simply, and likely don't need to, and that there are quite a few different ways to look at temporal relationships.

It seems that we don't need to worry too much about the differences between < and <= after all.  Especially if we can readily control one of the boundaries to make sure it is open or closed as necessary.

There's probably a whole post in this on the proper handling of intervals in software in general.  But I'll save that for a later date.  I have a harmonization proposal to finish.

  -- Keith

Friday, June 29, 2012

Continuous Variable Measures - The Final Solution?

We are honing in on a model for how to perform Counting in HQMF and QueryHealth, which I discussed earlier this week.  I had been working on the Harmonization proposal for dealing with time relationships, but that got messy.  It turns out there is a whole algebra for dealing with intervals, and the more I looked at it, the closer I got to having my brain explode.

So, I'll go back to that after dinner (because I promised that tonight), and finish up this because it is also relevant for harmonization (and my brain will remain intact).

Essentially what we are proposing is a new <MeasureCriteria> element similar to <PopulationCriteria>, <NumeratorCriteria>, and <DenominatorCriteria>.  That element will define the SET of objects over which an aggregation computation is performed.  The computation will be defined within a <MeasureObservation> element that references the <MeasureCriteria> element.

My first crack at what <MeasureObservation> would look like was this:

  <id .../>
  <code code='aggregationFunction' codeSystem='...'/>
  <sourceOf typeCode='DRIV'>
      <id .../>
  </sourceOf>

Inside, the computation could use Simple Math or another expression language, and the expression could be based on a specific <measurePopulationCriteria> defined in the <PopulationCriteriaSection>.

As I looked at this though, I wondered if I really even needed to reference the <measurePopulationCriteria>, because when it is defined, it already has a <localVariableName>.  Would it not in fact be simpler to just say:

  <id .../>
  <code code='aggregationFunction' codeSystem='...'/>

In the HQMF, the local variable for the <measurePopulationCriteria> element is already defined earlier in the same document, and thus could be inferred from the document context.  Channeling Marc's annoyance with unnecessary XML (and my own I might add), I got rid of it.

So the remaining piece here was to define the aggregation functions allowed in this act.  The ones that I can come up with include COUNT, AVERAGE, SUM, MIN, MAX, MEDIAN and MODE.  Everything else can be computed from these.  In fact, AVERAGE is readily computable from SUM and COUNT, but it is done often enough to merit inclusion in the set.  Another issue here though is that we often want to compute several of these results, for example, the AVERAGE, and the range (MIN and MAX), and we might also want to compute STDEV and VARIANCE.  Unfortunately, I can only associate one <code> with the observation.

After a bit of digging around, what I realized was that what I was doing was aggregating, and applying one or more methods during the aggregation.  So now I have a slight variation on the previous model, where code is fixed to AGGREGATE (a new value that I now need to add to ActCode), and the clone name is changed to reflect what this has become.  I've added methodCode to indicate what aggregation methods are to be used, and you can repeat it to indicate that you want to use more than one (e.g., AVERAGE, STDEV and COUNT).

  <id .../>
  <code code='AGGREGATE' codeSystem='2.16.840.1.113883.5.4'/>
  <methodCode code='aggregationFunction' codeSystem='...'
    codeSystemName='ObservationMethodAggregate'/> (...)

Now the only thing left to do was define the value set for ObservationMethodAggregate (a value set I just made up to appear in ObservationMethod).  As I was going through this list, I realized that I either needed to define how standard deviation and variance are computed (over a population or a sample), or allow for both methods.  I figured it would be easier to include both, providing greater clarity about what was meant in each code.

Code        Print Name                     Definition
COUNT       Count                          Count of non-null values in the referenced set of values.
SUM         Sum                            Sum of non-null values in the referenced set of values.
AVERAGE     Average                        Average of non-null values in the referenced set of values.
STDEV.S     Sample Standard Deviation      Standard deviation of the values in the referenced set of values, computed over a sample of the population.
VARIANCE.S  Sample Variance                Variance of the values in the referenced set of values, computed over a sample of the population.
STDEV.P     Population Standard Deviation  Standard deviation of the values in the referenced set of values, computed over the population.
VARIANCE.P  Population Variance            Variance of the values in the referenced set of values, computed over the population.
MIN         Minima                         Smallest of all non-null values in the referenced set of values.
MAX         Maxima                         Largest of all non-null values in the referenced set of values.
MEDIAN      Median                         The median of all non-null values in the referenced set of values.
MODE        Mode                           The most common value of all non-null values in the referenced set of values.
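As a sketch (not anything normative), these codes map naturally onto Python's statistics module, which already distinguishes the sample methods from the population methods:

```python
import statistics

# Hypothetical mapping from the ObservationMethodAggregate codes to their
# computations; non-null filtering is assumed to have happened already.
AGGREGATES = {
    "COUNT":      len,
    "SUM":        sum,
    "AVERAGE":    statistics.mean,
    "STDEV.S":    statistics.stdev,     # sample standard deviation (n - 1)
    "STDEV.P":    statistics.pstdev,    # population standard deviation (n)
    "VARIANCE.S": statistics.variance,
    "VARIANCE.P": statistics.pvariance,
    "MIN":        min,
    "MAX":        max,
    "MEDIAN":     statistics.median,
    "MODE":       statistics.mode,
}

values = [2, 4, 4, 4, 5, 5, 7, 9]
print(AGGREGATES["AVERAGE"](values))
print(AGGREGATES["STDEV.P"](values))  # 2.0
print(AGGREGATES["MODE"](values))     # 4
```

Repeating <methodCode> then just means applying more than one entry of this table to the same referenced set.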

Fortunately, if my memory serves, that leaves me with nothing more to do on this topic (given that I've already updated the R-MIM in the Visio Diagram to support this).

So, how does this address continuous variable measures?  Let's take a simple example:  average ED visit time.  This is pretty straightforward.

  1. Define a measure over encounters (recall that you need to specify this in the measureAttribute element of the QualityMeasureDocument).
  2. Create an encounterCriteria that selects only ED encounters in the dataCriteriaSection.  
  3. Now create a measurePopulationCriteria with the localVariableName EDVisits in the populationCriteriaSection.
  4. Finally, create the measureAggregateDefinition in the measureObservationsSection, and add this XML inside it:
     <derivationExpr>EDVisits.effectiveTime.high - EDVisits.effectiveTime.low</derivationExpr>
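The steps above amount to the following toy computation, with a hypothetical record shape standing in for the EDVisits criteria:

```python
from datetime import datetime
from statistics import mean

# Hypothetical encounter records matching the EDVisits criteria;
# low/high stand in for effectiveTime.low and effectiveTime.high.
ed_visits = [
    {"low": datetime(2012, 6, 1, 10, 0), "high": datetime(2012, 6, 1, 14, 30)},
    {"low": datetime(2012, 6, 2, 22, 0), "high": datetime(2012, 6, 3, 1, 0)},
]

# derivationExpr: EDVisits.effectiveTime.high - EDVisits.effectiveTime.low
durations_hours = [(e["high"] - e["low"]).total_seconds() / 3600 for e in ed_visits]

print(mean(durations_hours))  # 3.75 (hours)
```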

We'll need to indicate that when AGGREGATE is computed, and no aggregation method is specified, that the implementation can determine what it does, and suggest that it at least compute the COUNT and SUM, but may produce other aggregate statistics.

Am I done with this?  Probably not, but hopefully enough to get us through the ballot.

OK, off to dinner and then back for some more stuff.

Thursday, June 28, 2012

SWBAT Get Their Darn Data

Some of you have already met my daughter, Abigail (aka, @amaltheafairy on Twitter).  For others, this will be your first introduction to her.  What follows is her first guest post on this blog.  The words are her own, with a little editing help from dad.

Once upon a time there was a girl whose father worked on healthcare standards. He told her allllll about the problems and ways he thought they could be fixed.

“People are having medical issues because they don’t know to ask for their records, and when they do, they don’t understand what they’re looking at!” said her father.
“Well, daddy, how are we supposed to know to do something if we’ve never been taught to do it? I couldn’t do anything without help until I figured it out by myself, and you taught ME to ask for my records. If people have never been taught, why not put it in a school class to teach them?”

That’s something like how it went.

As long as I can remember my dad’s always been teaching me about his job, and what he does, and what it means to be a “standards geek”. He gets to meet these amazing people and work with them. People like Regina Holliday and Dr. Farzad Mostashari. These amazing people and my father were called to a “SECRET WHITEHOUSE MEETING” to discuss Meaningful Use Stage 3, problems with Stage 2, and the advances different companies are making to make it easier for people to access their records. At one point, we got to the question of why we are trying to make it easier for people to access their records, but not telling them they can?

It is a fair question. If people aren’t asking for them, and aren’t looking at them then, shouldn’t we work on making sure that they know how to look for them, and know how important understanding their records can be? With that understanding comes better knowledge on how to help themselves. Why not try to put this knowledge into our high school health and wellness curriculum, or better yet put it in a class of its own?

When we are taught something in school, it tends to be something we’ll use. We have a civics class to teach us how to be good citizens, and a class on child care and life science for parenting, all are useful classes for our future. We were never taught to ask for our records, we were never taught that knowing your records could change the way your doctor treated you. We need to know this before it becomes too late. For some, they couldn’t insure their family because they had a preexisting condition. No one told them, that they wouldn’t get insurance because they needed some treatment or medication as a teen or young adult. When were you taught that understanding your records could prevent that and many other situations like that? Even with this understanding, when were you told that a doctor could make mistakes you could catch? Such as, something copied down wrong in your record, something that didn’t happen or a surgery that never took place…

If such a class could be taught in the high schools or maybe even the junior highs and middle schools, the generation to come could learn this. We would take it home, this knowledge, to our parents, who could bring it into caring for our grandparents. It would start a chain reaction with us, tomorrow’s children. What we need to teach in our own schools are some of these major issues with healthcare. Give us the knowledge and the power and I know from experience we will want to act for our own good!

I’ve already discussed this with my dad, Motorcycle Guy, and I’ve even discussed it with my State Representative, Walter Timilty. I’ve told him my idea and I’ve told it to everyone at that “Secret Whitehouse Meeting”, and I’ve even asked some of my friends from school about what they would think of a class like that. So far almost all have thought it was a good idea.

A concern my friends had, like all typical middle-schoolers, soon high-schoolers, was “how hard would a class like that be,” and “how hard would the questions and tests be?” As I was thinking on that, it occurred to me that I don't have the slightest idea as to what the questions would be. Mainly it would be the students’ main rights under HIPAA, and what each part of their records meant, as well as what could be in each part of their records, but I don't know enough to dictate every question on a test or even what could be discussed in class each day.

In my state, the law states that there has to be a curriculum framework for each class.  Would we have to create a whole new curriculum for a class on something related to health and wellness or could we just add it to the health and wellness curriculum already in place?

The Massachusetts Comprehensive Health Curriculum Framework, already has what should be covered in 6th--12th grade.  I think these are important points:
12.7 Evaluate both the physical effectiveness and cost effectiveness of health care products.
12.12 Identify information needed to select and maintain relationships with health care providers to meet the needs of individuals and family members.
12.17 Describe the individual's responsibility to be a wise and informed consumer, including how to plan a budget that includes a spending and savings plan.
12.19 Identify procedures for making consumer complaints, such as determining if/when a complaint is warranted, gathering relevant information, and identifying the appropriate agencies to contact.
I rewrote them the way my teachers for my core subjects used to do it.  SWBAT means "Students will be able to".
12.7 SWBAT evaluate both physical effectiveness and cost effectiveness of health care products.
12.12 SWBAT identify information needed to select and maintain a relationship with health care providers to meet the needs of individuals and family members.
12.17 SWBAT describe the individual’s responsibility to be a wise and informed consumer, including how to plan a budget that includes a spending and savings plan
12.19 SWBAT identify procedures for making consumer complaints, such as determining if/when a complaint is warranted, gathering relevant information, and identifying the appropriate agencies to contact.
The standards for this curriculum could easily be used to include lessons on

  • Access to your health record, 
  • Understanding and reading your health record, 
  • Fixing any mistakes in your health record and 
  • The use of your health record. 

Everything we need is already in front of us. Why don't we give it a shot and try for a better health care experience!  That would be another Epic Win.

^.^ #SWBAT

Tuesday, June 26, 2012

Computing in HQMF and QueryHealth

This post begins to address continuous variable measures and similar kinds of computations in Query Health and HQMF.

Most of what HQMF does is allow you to specify how things are being counted, and as I mentioned yesterday, what you are counting is a function of your implementation model.  Change your implementation model (or enter it at a different point), and what you count can go from being patients to encounters, or even something else.

An HQMF counting diabetic patients who have an A1C result greater than 9% is similar to the SQL COUNT() function in a select statement with a complex join and criteria (the join keys below are illustrative):

SELECT COUNT(*) FROM Patients P
JOIN Conditions C ON C.pid = P.pid
JOIN Results R ON R.pid = P.pid
WHERE C.Condition = 'Diabetes'
AND R.type = 'HgA1C'
AND R.value > 9
AND R.unit = '%'

Other measures need more than counting.  Suppose you wanted to compute a quality measure for the average number of days of stay for delivery of a newborn.  In order to compute that, you would need to find all inpatient encounters for delivery of a newborn, and then take the average over the lengthOfStayQuantity attribute of each of these encounters.  In SQL, this would look something like this:

SELECT AVG(E.lengthOfStayQuantity) FROM Patients P
JOIN Encounters E ON E.pid = P.pid
JOIN ReasonForVisit R ON E.eid = R.eid
WHERE R.reason = 'Delivery'

SQL provides the AVG, SUM, MIN, MAX and COUNT aggregate statistic functions.  In HQMF, to get to SUM(), what you do is create a <measureObservation> element that contains a <derivationExpr> element describing what should be accumulated, and references to the appropriate criteria elements from which the observation is computed.  I've used the criteria reference elements in the example below.  In the original HQMF this would just have been an encounter element.

<observation classCode="OBS" moodCode="DEF">
  <id root="b421c8a3-7949-11de-8a39-0800200c9a66"/>
  <sourceOf typeCode="DRIV">
    <id root="b421c8a9-7949-11de-8a39-0800200c9a66"/>
  </sourceOf>
</observation>

In the example above, for Query Health and HQMF, we can completely drop the sourceOf element.  That's because the HQMF document supports definitions of local variables for all data criteria elements.  I'd also change the name from observation to something more meaningful.  Perhaps accumulatedValue, with classCode fixed to OBS, and moodCode fixed to DEF.

  <id root='b421c8a3-7479-11de-8a39-0900200c9a66'/>

This is greatly simplified.  Now, if you want to do more than count things, you can do so.  One of the challenges that I tried to address in Simple Math is what kind of expression should go into derivationExpr.  HQMF doesn't actually address the language syntax.  The HL7 preferred language for this kind of stuff is GELLO, but frankly, I have a hard time with GELLO.  I have to translate this expression into something that is executable in Java, C#, SQL or XQuery.  I don't have the luxury of being able to install a GELLO interpreter for that purpose (can you imagine trying to convince a data center team that it is OK to install a GELLO interpreter in a data center that has to have 5 9's availability?).  The same is true for any other complex language; what we really need is Simple Math.

The next issue that comes up is what if you want to sum things only if certain conditions are met, just as you want to count things where certain conditions are met.  To do that, you need to add a <precondition> element to the <accumulatedValue> element, and that needs to support the same kinds of preconditions that we count with. This lets me attach the precondition to the encounter that says "it must be an encounter for a newborn delivery."

So now, I could also compute the number of central line days and the number of central line infections for a time interval by computing the sum of central line days for each patient, and the number of separate central line associated infections for the patient for a one month period.  The results of the HQMF would give the necessary data to compute the CLABSI infection rate.

It seems obvious how to compute when there is one variable.  Even the case above is simply two different summations over a single patient-related variable.  Moving up the scale, you might even want to compute standard deviation and variance.  That's also straightforward.  If you have COUNT(x), SUM(x) and SUM(x^2), you can compute standard deviation and variance by expanding the formula VAR(x) = SUM((x - AVG(x))^2)/(n-1) into VAR(x) = (SUM(x^2) - SUM(x)^2/n)/(n-1).
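Expanding the formula that way means only three running totals ever need to be accumulated; a quick check that the expansion agrees with the direct computation:

```python
import statistics

# Sample variance from the three accumulable aggregates COUNT(x), SUM(x), SUM(x^2):
#   VAR(x) = (SUM(x^2) - SUM(x)^2 / n) / (n - 1)
xs = [3.0, 5.0, 7.0, 9.0]
n, s, s2 = len(xs), sum(xs), sum(x * x for x in xs)

var = (s2 - s * s / n) / (n - 1)
print(var)
print(statistics.variance(xs))  # same result
```

(For large data sets with small variance this expansion can lose precision to cancellation, but it illustrates why the three sums suffice.)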

But as soon as you move into a case where you have multiple variables (necessary for dealing with regression statistics) it gets more challenging. The reason for that is because you aren't necessarily working in the same scale.  What would happen if you had two variables in the expression?

The short answer is that I haven't figured that out yet.  But I will, and when I do, I'll report on that as well.

Don't Panic

Sometime this morning (it appears to have started sometime between 6 and 7am ET based on logs), and for some obscure reason, visitors to this blog [including me] were told that it had been deleted.  When I went to log in, I got the same report, and my iPad also reported that it couldn't log into e-mail.  A few retries later, and everything seemed to work just fine.  I had assumed that someone attempted to hack into my e-mail account (and possibly even succeeded), but can find NO other evidence of that.  Even so, I changed passwords just to be certain.

My backup of the blog is about 3 months old, so after making sure everything was working again, I refreshed it just to be certain.  That really ought to be easier to automate (even if it only took 20 seconds to do manually).

Monday, June 25, 2012

Counting other things in QueryHealth and HQMF

In Query Health, we've been focused on counting patients who meet particular criteria.  But just as we are counting patients, we could also be counting encounters, providers, organizations, or just about anything else that you wanted to count to compute a quality measure.  What you are counting depends upon your data model.

We bind what gets counted through the use of definition elements and reference elements in the dataCriteriaSection.  Each criteria element references something that is defined in a definition element.  But we don't really say a thing about how the data model itself is structured in the HQMF representation.

In the Query Health model, we start with a patient.  A patient has demographics (which is often how we select them for the initial patient population).  They can also have problems, medications, allergies, immunizations, diagnostic results, vital signs, procedures and encounters.  We might tie the record of a problem, medication, allergy, immunization, diagnostic result, vital sign or procedure back to an encounter.  And we might tie an encounter back to a provider.  So in that model, when you start from patient, you analyze the HQMF from that perspective, and that is what you wind up counting.

But, if you wanted to count encounters, you'd organize your data model differently.  Instead of starting with a patient, you'd start from the encounter.  Your "IPP" would no longer be a collection of patients, but instead, would be a collection of encounters.  You might select them based on what was recorded during that encounter (e.g., diagnosis), or what was done during that encounter (e.g., a specific procedure), or details about the encounter such as the type of setting (e.g., ambulatory, ED, inpatient).  An encounter would have a set of problems, medications, immunizations, allergies, diagnostic results, vital signs, procedures and patients associated with it.

How you organize the model influences greatly how you wind up counting, but it doesn't require many changes to HQMF.  It may have been written with a particular model (the patient-centric one) in mind, but HQMF in practice doesn't assume any particular data model.  So what would we need to change?

To clarify things, we'd want to change the "Initial Patient Population" entry to become the "Initial Population" entry.  This is simply a clarification that what we are counting could be something other than patients.  The other thing we would want to do is add a classifier to the measure heading to indicate what it is we are counting.  The classifier serves two purposes:

  1. It identifies to the user what is actually being counted. 
  2. It indicates what is being counted so that the implementer can use the correct data model.
To add this classifier, we simply need a code to represent it, and a value set for the different kinds of things that could be counted.  What we are counting could be patients, providers, encounters, procedures, immunizations, lab tests, locations, organizations or just about any other kind of entity or event.  For most cases, I think we would readily use patients, and might use encounters, treatments (procedures and medications) or diagnostics (test results).  There might be cases where it might be interesting to compute quality measures that count providers, organizations, locations, or even devices.

The XML to express this would be pretty simple:
      <code code='COUNTS' codeSystem='2.16.840.1.113883.5.5'
                codeSystemName='HL7 Act Code'/>
      <value code='...' codeSystem='...'/>

It's pretty easy to imagine a quality measure for an encounter or procedure (and this example could be written either way):  Encounter population = all surgical encounters.  Denominator: all surgical encounters for a patient older than 70.  Numerator: all surgical encounters where the patient was given a flu vaccination in the 3 months during or prior to the encounter.
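A toy sketch of that encounter-based measure (the record fields are invented for illustration; each criterion filters the set produced by the previous one):

```python
# Hypothetical encounter records for an encounter-centric data model.
encounters = [
    {"type": "surgical",   "patient_age": 75, "flu_shot_within_3mo": True},
    {"type": "surgical",   "patient_age": 82, "flu_shot_within_3mo": False},
    {"type": "surgical",   "patient_age": 60, "flu_shot_within_3mo": True},
    {"type": "ambulatory", "patient_age": 90, "flu_shot_within_3mo": True},
]

# Encounter population = all surgical encounters.
population = [e for e in encounters if e["type"] == "surgical"]
# Denominator = surgical encounters for a patient older than 70.
denominator = [e for e in population if e["patient_age"] > 70]
# Numerator = those with a flu vaccination during or in the 3 months prior.
numerator = [e for e in denominator if e["flu_shot_within_3mo"]]

print(len(population), len(denominator), len(numerator))  # 3 2 1
```

The same HQMF structure drives the counting; only the unit being counted (the encounter rather than the patient) has changed.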

You'll note that I didn't fill in the details for the <value> element.  That's because I don't know what they should be.  We could come up with a value set from SNOMED CT, or from HL7 Vocabularies (we might need to use several, because patients are roles, but encounters and procedures are acts).

With respect to quality measures for those other things, I could imagine cases where you might want to count the number of times a certain test is used, compared to the number of times that test is positive.  That could tell you some interesting things about the utilization of that test, and it could be compared against other tests, or other uses of the test at other locations or regions.  It too could have a different model that it was executed against.  But this example is also amenable to computation using a patient- or encounter-based model.  In the encounter-based model, the denominator would be encounters where that test was ordered.  The numerator would be the number of cases where the result of the test ordered came back positive.  So maybe we don't need a large vocabulary to express what we are counting.

There is another approach to this problem as well, which I'll go into more detail upon tomorrow.  If we expand on our use of measure observation, so that rather than just counting things that match, we accumulate the values of expressions in a <measureObservation> element, we could support a number of additional capabilities.  If you look at the SQL prototype I discussed back in November, you can see where the final output relies on the COUNT  function in the query.  Other aggregated statistics functions could also be used.

Friday, June 22, 2012

Summer is for Standards

Over the past two weeks we've heard about at least five different new S&I Framework projects:  Health eDecisions, Closed Loop Referrals (see Scenario 1), HREx, Auto BlueButton,  and Patient Generated Health data. Not all of these are yet "officially" announced projects on the S&I Framework site.

On top of that, there are a half dozen HL7 projects being worked on concurrently, some supporting existing projects (e.g., Query Health), and others that could support new projects.  Here is a set of project scope statements recently approved by the Structure and Semantic Design Steering Division for the HL7 Structured Documents Workgroup (which had already approved them).

Some of these projects (such as HQMF Release 2) are already well under way.  I've been spending about 8 hours a week in meetings on them due to some extremely tight deadlines.  Just to give you an idea: anything already in ballot, like the Consolidated CDA revisions for patient assessments and QRDA Release 2, that might need to get into the Meaningful Use stage 2 final rules would need to be published by the middle of next month if it were to show up in a rule published at the end of August.  That early deadline is really good, because after that, I'm off to IHE meetings to finish up just a few other small things ;-)

HL7/IHE Health Story Implementation Guide Consolidation – additional templates for patient assessment data, at Project Insight # 728 for SDWG and TSC Tracker # 2305. The Patient Assessment Summary Work group (PAS WG), a work group within the ONC Standards and Interoperability (S&I) framework, identified a subset of data elements from patient assessment instruments for exchange in a summary document. This project will review and map the PAS WG identified clinically relevant elements to templates in the Consolidated CDA Templates DSTU. When existing templates are unavailable or underspecified, this project will update or add new templates to the Consolidated CDA Templates DSTU as specified by SDWG. The new CDA templates can be derived from the source instruments automatically. An appropriate document-level title, template, and document type code will be determined under this project. The project will be scoped to elements in the PAS WG identified subset. The ballot will be scoped to these new and updated templates, and the SDWG agreed upon errata. The project will not introduce a review of content of existing document-level templates. The entire DSTU will not be re-balloted. Scope Statement is at

CDA Implementation Guide for Patient Assessments, at Project Insight ID# 381 and TSC Tracker # 2304 for SDWG. The scope of the project is to update the CDA Implementation Guide for Patient Assessments DSTU to conform to the new Consolidated CDA US Realm Header, and IG format. The DSTU will continue to include a universal header and body. Add guidance for communicating elements in the CMS Continuity Assessment Record and Evaluation (CARE) assessment tool. Scope statement is at

QDM-based Health Quality Measure Format (HQMF) Implementation Guide at Project Insight # 756 and TSC Tracker # 2302 for SDWG cosponsored by CDS. This project is intended to create a domain analysis model to describe the content required within EHRs to measure quality and performance and maintain safety standards.  The information model content will be derived from the Quality Data Model (QDM) established in a public consensus process by National Quality Forum (NQF) in the United States.  QDM is a model of information based on the needs of quality measure developers and clinical decision support rule creators. It has now been tested in the retooling of 113 existing quality measures into HQMF format and in a number of CDS rules in an eRecommendations project funded by the Agency for Healthcare Research and Quality (AHRQ). Scope statement is at

Health Quality Measure Format (HQMF) Specification, Release 2, at Project Insight ID#508 and TSC Tracker # 2302. The DSTU will be updated to address new requirements identified through testing/implementation. Scope statement is at

QRDA Draft Standard for Trial Use, Release 3, at Project Insight # 210 and TSC Tracker # 2301, for SDWG. This project will further enhance the QRDA Category I DSTU following updates to the HQMF specification to support alignment. Scope statement is at

Patient Authored Documents, for Structured Documents WG at PI# 900, and TSC Tracker # 2300. In the “era of patient empowerment”, we want to define a specification for patient-authored clinical documents.  Medical practices are looking for ways to allow patients to electronically complete certain tasks online, such as filling out registration forms, health history forms, consenting to certain practice policies, and other types of clinical documents yet to be defined. As electronic document interchange increases, we see a growing need to communicate documents created by patients (including those needed by providers and/or those document types defined by patients). Often, this is done through a secure web interface controlled by the patient, such as a patient portal or a personal health record. As more and more practices incorporate EMR technology into their practice workflow, they want to be able to import patient-provided structured information into their EMRs. This is being driven by the need to meet Meaningful Use 2 requirements for patient engagement as well as other needs to reduce manual processes managing patient-provided data. Scope statement is at

HL7/S&I Framework QRDA II/III Draft Standard for Trial Use, Release 3, for SDWG at PI# 896 and TSC tracker # 2299. This project will define and bring to ballot a set of specifications for reporting quality data representing query results. The queries themselves use the Health Quality Measure Format (HQMF). This work effort will include a U.S. Realm QRDA Category II (QDM-based) and Category III Implementation Guide to direct implementers on how to construct QRDA Category II and III instances in conformance with HITECH eMeasures to represent the query results obtained from the distributed queries. This work effort will further align with the work taking place in HL7 to update the Health Quality Measure Format (HQMF) Implementation Guide. In addition, this project will leverage and harmonize with similar activities within and outside HL7 to avoid duplication of existing efforts. Scope statement is at

Medicine of the Future

If you are my age, and reading this blog, you may remember the original Star Trek series (I watched it in reruns).  And in it you recall that away teams always carried powerful communication devices that could support communication around the planet.  Just flip it open and talk.  Or tap a device, speak the address of your connection and talk.  We have these now in cell phones and Bluetooth headsets.  Granted, Kirk's devices didn't require a complex cell-tower infrastructure to support them, but we still have the same capability in many populated places in the world.

McCoy had some pretty complex devices as well, which supported remote monitoring of the patient built right into the beds of his sick bay.  And he had a hand-sized diagnostic scanning device which could tell him what was wrong with a patient relatively quickly (it was originally supposed to be a salt shaker).  We aren't there yet, but we are starting to see technology like this being developed. Today, in various cell-phone sized devices, we can perform ultrasounds, evaluate EKG results, record temperature, blood pressure, pulse, respiration rate, and O2 saturation, even take and send images for evaluation.  There's even a prize for putting them all together into a single device with advanced clinical decision support capabilities.

McCoy also had a hypospray which he used to inject drugs.  MIT has developed one of those.

Star Trek spent a lot of time looking at the DNA of patients.  Have you sent your DNA out to be looked at?

What I find interesting about this is that all of this technology was envisioned by science fiction writers in the mid- to late-60's, almost 50 years ago.  The speed of adoption seems to be increasing though.  

We've seen the beginnings of bionic limbs, hearing and vision, popularized by Steve Austin and Jaime Sommers in the mid- to late-70's.

And IBM's Watson is being heralded as one of the greatest advances in AI today, especially in medicine.  But it's not yet ready to pass the Turing test. Even so, could we be on the verge of Star Trek Voyager's Holographic Doctor from the mid- to late-90s?  The entertainment industry has already started experimenting with holographic technology.

And for all you Doctor Who fans out there, you might wonder what his sonic screwdriver has to do with medicine.  Check out what these Scottish scientists think about the idea.

What advances from Science Fiction do you see on the horizon? 

Thursday, June 21, 2012

Extending CDA for QRDA

One of the ballot comments I had made on QRDA was with respect to the mechanism that was being used to identify value sets which were being used inside a template.  We resolved that by pre-adopting the valueSet and valueSetVersion attributes as extensions to CDA which could be used with a QRDA submission.

I published an XSLT stylesheet a couple of years ago that shows how you can remove all extensions to validate a CDA document against the normative schema.  I've updated it below to support the xsi:type attribute, which is often needed on observation/value to indicate the data type used.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:cda="urn:hl7-org:v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- Copy only CDA-namespace content.  Extension elements and namespaced
       attributes are dropped, but xsi:type is retained. -->
  <xsl:template match='cda:*'>
    <xsl:copy>
      <xsl:apply-templates select='@*[namespace-uri()=""]|@xsi:type'/>
      <xsl:apply-templates select='cda:*|text()'/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match='@*'><xsl:copy/></xsl:template>
</xsl:stylesheet>

As part of our discussion, we also agreed to publish a revised CDA schema with QRDA that would include not just this, but also any CCDA extensions used in QRDA, to enable others to use these schemas to validate instances.

The Consolidated CDA defined a number of extensions in the urn:hl7-org:sdtc namespace.  These extensions are the same as were defined in the HITSP C83 specification.  NIST provides a modified CDA Schema on their downloads page that supports many of these extensions (see the second link on that page).

Each extension, where it is allowed, and its purpose:
  • sdtc:raceCode (after patient/raceCode): allows multiple races to be reported for a patient.
  • sdtc:id (at the top of relatedSubject/subject): allows for unique identification of the family member(s) in the family history organizer.
  • sdtc:deceasedInd (after subject/birthTime): a "true" or "false" value in the family history organizer indicating whether a family member is deceased.
  • sdtc:deceasedTime (after subject/sdtc:deceasedInd): allows reporting of the date and time a family member died.
  • sdtc:birthTime (after associatedPerson/name): allows the birth date of any person to be recorded; its purpose is to allow recording of the subscriber or member of a health plan in cases where the health plan eligibility system has different information on file than the provider does for the patient.
  • sdtc:dischargeDispositionCode (after encounter/priorityCode): allows the provider to record a discharge disposition in an encounter activity.

These extensions affect four types defined in the CDA Schema: POCD_MT000040.Patient, POCD_MT000040.SubjectPerson, POCD_MT000040.Person and POCD_MT000040.Encounter.

To incorporate these extensions into the CDA Schema, define a separate XSD file (e.g., extensions.xsd) as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:hl7-org:sdtc"
    xmlns:sdtc="urn:hl7-org:sdtc" targetNamespace="urn:hl7-org:sdtc"
    xmlns:cda="urn:hl7-org:v3" elementFormDefault="qualified">
  <xs:import schemaLocation="POCD_MT000040.xsd" namespace="urn:hl7-org:v3"/>
  <xs:element name="raceCode" type="cda:CE"/>
  <xs:element name="id" type="cda:II"/>
  <xs:element name="deceasedInd" type="cda:BL"/>
  <xs:element name="deceasedTime" type="cda:TS"/>
  <xs:element name="birthTime" type="cda:TS"/>
  <xs:element name="dischargeDispositionCode" type="cda:CE"/>
</xs:schema>

In the CDA-delivered POCD_MT000040.xsd, import the extensions.xsd file in the appropriate place in the schema by adding the xs:import line after all of the other includes:
<xs:include schemaLocation="../../processable/coreschemas/datatypes.xsd"/>
<xs:include schemaLocation="../../processable/coreschemas/voc.xsd"/>
<xs:include schemaLocation="../../processable/coreschemas/NarrativeBlock.xsd"/>
<xs:import namespace="urn:hl7-org:sdtc" schemaLocation="extensions.xsd" />

Then add references to the extension elements in their appropriate locations.  For example, to support raceCode, make the following change to the POCD_MT000040.Patient type:

  <xs:element name="raceCode" type="CE" minOccurs="0"/>
  <xs:element ref="sdtc:raceCode" xmlns:sdtc="urn:hl7-org:sdtc" minOccurs="0"
      maxOccurs="unbounded" />

Similarly, for POCD_MT000040.SubjectPerson:

  <xs:element name="realmCode" type="CS" minOccurs="0" maxOccurs="unbounded"/>
  <xs:element name="typeId" type="POCD_MT000040.InfrastructureRoot.typeId" minOccurs="0"/>
  <xs:element name="templateId" type="II" minOccurs="0" maxOccurs="unbounded"/>
  <xs:element ref="sdtc:id" xmlns:sdtc="urn:hl7-org:sdtc" minOccurs="0"
      maxOccurs="unbounded" />
  <xs:element name="name" type="PN" minOccurs="0" maxOccurs="unbounded"/>
  <xs:element name="administrativeGenderCode" type="CE" minOccurs="0"/>
  <xs:element name="birthTime" type="TS" minOccurs="0"/>
  <xs:element ref="sdtc:deceasedInd" xmlns:sdtc="urn:hl7-org:sdtc" minOccurs="0"
      maxOccurs="1" />
  <xs:element ref="sdtc:deceasedTime" xmlns:sdtc="urn:hl7-org:sdtc" minOccurs="0"
      maxOccurs="1" />

I'll leave the remaining two as an exercise for the reader.  

To support the two extension attributes requires quite a bit more work.  The challenge is that they need to be used inside datatypes-base.xsd, and also use the simple types defined therein.  That requires a bit of refactoring to clean it up.  What I expect will be needed is to split datatypes-base.xsd into two parts: datatypes-simple.xsd, which contains the simple types we want to reuse, and datatypes-base.xsd, which contains the complex types that we want to modify to support new attributes.

Tuesday, June 19, 2012

Convergence: CEDD, CIMI, IHE, FHIR, hData, HL7, mHealth and ONC

How does it all fit together? If you've been following CIMI; HL7's CDA Release 3, FHIR, and mHealth Workgroup initiatives; ONC's many S&I Framework projects; IHE's mHealth profile; and all the other standards activity going on, surely you've run into the problem shown below:

The volume of the information stream coming out of these projects is huge.  Keeping up with all of these projects requires a staff and budget that is much bigger than I have available to me, so I haven't been to all of the meetings, attended all of the calls, or paid attention to all that is going on. But I have managed to understand enough about what is going on to sort of figure out how some of it COULD fit together.

Knowing the history behind things helps.  The S&I Framework CEDD is based in part on the HITSP C154 Data Elements, with a slightly different twist.  That set of data elements comes from various CDA documents created by HITSP, based on the HL7 CCD and IHE PCC profiles.  Coming out of that same body of work is the CDA Consolidation Guide.  And a good bit of it hearkens back to the HL7 Claims Attachments.

The CIMI work is informed by work in HL7 on Detailed Clinical Models, as well as work in ISO, and in OpenEHR.  Those communities have cross-pollinated pretty well, although you'll find some observers indicate otherwise.  A lot of the CIMI work will, I expect, also be informed by the Intermountain Clinical Element Models, which show a marked influence from CCD and HL7 Patient Care Models.

FHIR will be creating resources for many of the very things that CIMI will be creating "detailed clinical models" for.  What FHIR doesn't cover directly can be covered by its built-in extension mechanism.  And unlike CIMI, with its attention to every detail, FHIR will address only the most commonly needed or implemented data elements.

On the CDA Release 3 front, Structured Documents has moved away from defining clinical content, in deference to allowing clinically related workgroups in HL7 to use their own models in RIM-based healthcare statements.  They are also moving towards XHTML in content, something which FHIR has already adopted.  And FHIR has a resource aggregation model that supports documents as well as messages and services.  The actual XML for CDA Release 3 is defined by the HL7 XML ITS, but it could be readily transformed to the FHIR format if the R3 document were sufficiently well-defined.  I expect the most significant efforts of Structured Documents as related to FHIR will be on defining the "document" resource.

On the transport side, we have the IHE mHealth Profile based in part on, and being harmonized with hData.  This profile builds on the IHE XDS Metadata which is presently part of Direct and Exchange.  ONC's RHEx Project also seems to be building from hData.  And finally, FHIR has a RESTful protocol which is also being harmonized with hData.

Fitting it all Together
So we have three streams in which convergence could occur.  On the content side is how we model clinical information.  CIMI and OpenEHR are building models that go into great detail, probably much more detail than many systems know how to deal with today.  This can readily fit into FHIR over time, and is largely compatible with the CDA Consolidation efforts, given the great deal of cross-pollination and common ancestry.  I see CDA Consolidation templates being dominant in the near and mid-range future, with movement towards FHIR Resources over the mid- to long-term.

On the document side, it isn't clear whether CDA or FHIR will win out.  I suspect that FHIR will become dominant in the messaging and service space first, before eclipsing CDA in the longer term.

Finally, the RESTful transport will become some variant of hData, with the document model possibly evolving from the IHE mHealth profile and XDS Metadata (being restated a bit more RESTfully).

It's a way it COULD happen.  Your mileage may vary.  However, if we do manage to converge these streams, it could be very powerful.

Monday, June 18, 2012

IHE Radiology Technical Framework Supplements Published for Trial Implementation

Integrating the Healthcare Enterprise

IHE Radiology Technical Framework Supplements Published for Trial Implementation

The IHE Radiology Technical Committee has published the following supplements to the IHE Radiology Technical Framework for Trial Implementation as of June 15, 2012:
  • Cross-Enterprise Document Reliable Interchange of Images (XDR-I)
  • Import Reconciliation Workflow (IRWF.b)
  • Post-Acquisition Workflow (PAWF) 
These profiles will be available for testing at subsequent IHE Connectathons. The documents are available for download at Comments should be submitted at
Copyright © 2012 Integrating the Healthcare Enterprise, All rights reserved.

Friday, June 15, 2012

Simple Math - A Language for Expressions

One of the ongoing issues in Query Health, and in HL7 in general, is what language should be used in expressions.  Should it be JavaScript, GELLO, XPath, or perhaps even an XML representation?  My answer is none of the above.

Most of the execution environments that Query Health can work in already support computation.  I don't want to introduce a new language interpreter unnecessarily.  I can imagine the discussion I'd have with a few data center folks, explaining that I need them to modify their SQL environment to include a JavaScript interpreter, for example.  And GELLO, while it is a standard, lacks implementations.  Other representations have similar challenges.  Finally, the idea of using XML to represent simple mathematical expressions that people are already familiar with bugs me.  Yes, I do program in XML (XSLT, to be specific), and I can tell you personally that I hate the additional noise added by that extra syntax.  It makes code really hard to read.

For Query Health, I can see a few cases where we need some fairly simple arithmetic computations.  To move beyond counting and support higher statistical functions we would need addition, subtraction, multiplication, division and exponentiation.  We probably don't need transcendental functions, but there are some operations for which we do need some simple functions over one or two variables.

What I really want is a simple mathematical expression language that I can easily implement in programming environments as diverse as SQL, XQuery, XSLT/XPath, C, C#, C++, Java, JavaScript, Perl, Ruby and any other programming language you can think of.  It needs to be simple enough that one could perform a series of search and replace operations to turn it into executable code.  Fortunately, most programming languages today share quite a bit of common syntax when it comes to expressions.

So, I decided to create a small language I'm going to call Simple Math.  The point of Simple Math is to make it easy to write math expressions in a way that can be translated into a variety of execution languages.  Simple Math is NOT designed to be computed by itself (although it could be).  It is designed to be transformed into something that can be executed in a variety of programming environments.

From an implementation perspective, the following are my requirements for "Simple Math":
  1. Definition of variables that can be bound to an implementation specific object (a database table, or class instance).  
  2. Definitions of variables that can be bound to an implementation specific object containing fixed (constant) values.
  3. The ability to call a function to perform some operation.
  4. Simple Arithmetic (addition, subtraction, multiplication, division, exponentiation)
  5. Parentheses
  6. No side effects
  7. Easily translatable to executable code using simple search and replace operations.
Having worked in over a dozen programming languages, there are some common capabilities across most that could be simply reused.

So far, I'm just doing math, so all I need for literals are the usual representations of numbers.  Simple Math has two basic types: integer and real.  An integer is a sequence of digits, optionally preceded by a - sign.  A real number is an integer part (minimally a single digit, including 0), followed by an optional decimal point and decimal part (minimally .0), and an optional exponent part: the letter e (or E), followed by a required sign and a decimal number indicating the power of 10 by which the number is multiplied.
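As a quick sanity check, the literal grammar described above can be captured in a regular expression.  This is my reading of the rules, not a published grammar, sketched here in Python:

```python
import re

# Sketch of the Simple Math number-literal grammar as described in the text.
INTEGER = r'-?[0-9]+'
REAL = r'-?[0-9]+(\.[0-9]+)?([eE][+-][0-9]+)?'  # exponent sign is required

number = re.compile(r'^' + REAL + r'$')

# Literals that should be accepted...
for lit in ('42', '-7', '0.5', '6.02e+23', '1E-9'):
    assert number.match(lit)

# ...and ones that should not be, under this reading of the grammar.
assert not number.match('1e9')  # missing the required exponent sign
assert not number.match('.5')   # integer part is minimally a single digit
```

Pinning the grammar down this precisely is exactly what makes mechanical translation to other languages tractable.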

Implementations must support at least IEEE single precision arithmetic on real numbers, and follow the IEEE rules for computation.

The operators for addition (+), subtraction (-), multiplication (*), and division (/), as well as parentheses, are pretty commonly used.  Some languages have operators for exponentiation, and others use function calls.  The languages that have these operators also have a function call notation, so I won't try to choose an operator for exponentiation.

The integer division and modulus operators are available in some programming languages, but do not have a consistent syntax.  Since they can be supported by function calls, I'd skip picking operators for them.

Parentheses () are commonly understood when used to change operator precedence across almost all programming languages.

We don't need array operators for what I'm calling "simple computation".  I could be convinced otherwise, but I think we can skip this for most cases.  If we do get into arrays, we'd also need to understand what the index is for the first item, which varies by language.  Skipping arrays also avoids that mess.

We do, I think, need to have an operator for member access to an object, and most languages already support the . as the member access operator.  We don't need to get into differences between pointers, references and values because, as you recall, one of the requirements is to be side effect free.

I think we can skip the comparators, but again, I could possibly be convinced otherwise with a good use case for them.  They usually have a boolean result (but can be otherwise in the presence of exceptional values like NaN, Infinity or NULL).  And given that they have a boolean result, at least in the QH context, we can use existing boolean and range selection capabilities in the model to achieve the same result as comparison operators in an expression language.

Identifiers for variables and constants are also not too tricky (until you get to SQL). Most programming languages require an initial letter, which can be followed by any number of letters and digits.  In addition to the usual letters, most also allow an underscore or other punctuation characters in identifiers, but these characters can also have special meaning in other contexts.  For example, Java allows $ and _, while C only allows _.  XML allows the _ and the : in names, but : has special meaning in many cases (as a namespace delimiter for a qualified name).  In SQL, it gets a bit more complicated, because the _ can have special meaning at the beginning of a name, and case may or may not be significant in the usual transformation.  It is possible to "quote" identifiers so that case significance is preserved, and it is also possible to create a regular expression to support translation of identifiers into quoted identifiers.

One of the challenges of identifiers is avoiding reserved names in the various programming languages.  In SQL, if we "quote" the identifiers, then this is no longer a problem.  Looking across C, C++, C#, Java, JavaScript, Perl, Ruby, SQL, and VB I came up with a list of about 300 distinct keywords (ignoring case).  Then I realized that making a list of prohibited keywords would likely not work because some language that I hadn't considered wouldn't be supported.  The simple answer then is to ensure that there is some transformation of identifiers to a non-keyword form that works in general.  An example would be the arbitrary rule that no "simple math" identifier be permitted to end in a double underscore.  One could then append a double underscore to every identifier that matched a keyword in your chosen implementation language, and be sure that it would not collide with another identifier.
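The double-underscore escaping rule can be sketched in a few lines.  The keyword list below is a small illustrative subset, not any language's real reserved-word list; a translator would plug in its target language's keywords:

```python
# Sketch of the keyword-escaping rule described above.  Because Simple Math
# forbids identifiers ending in a double underscore, the escaped form can
# never collide with another legal Simple Math identifier.
TARGET_KEYWORDS = {'select', 'int', 'new', 'return'}  # illustrative subset only

def escape_identifier(name: str) -> str:
    """Append '__' to any identifier that collides with a target-language keyword."""
    if name.lower() in TARGET_KEYWORDS:
        return name + '__'
    return name

assert escape_identifier('select') == 'select__'  # SQL keyword, escaped
assert escape_identifier('weight') == 'weight'    # ordinary identifier, untouched
```

The escaping happens per target language at translation time, so the Simple Math source never needs to know which languages it will be translated into.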

Function calls provide some of the more complicated arithmetic, including min/max functions, floor/ceiling/rounding, computing powers and logarithms, modular arithmetic, et cetera.  I like the JavaScript Math object as a basic starting point, but I think I'd limit min/max to the two-argument forms.  I don't know that we need the trigonometric functions for the kinds of things we need to compute, but again, I could be convinced otherwise.  To that, I'd add div() and mod() functions to support modular arithmetic.

These basic math functions would be preceded by an identifier (Math) and a dot . to indicate that they come from the basic math library.
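To show how little machinery the translation requires, here is a sketch in Python of turning a Simple Math expression into executable code with nothing more than string substitutions.  The expression, variable names, and Math.mod function are all invented for the example:

```python
# Sketch: translating a hypothetical Simple Math expression into Python
# using only search-and-replace, per the design goal above.
expr = "Math.min(dose, limit) + Math.mod(weight, 4) * 2"

# Map the Math library calls onto Python equivalents.
translated = (expr
              .replace("Math.min", "min")
              .replace("Math.mod", "mod"))

# Bind the variables to implementation-specific values and evaluate.
result = eval(translated,
              {"min": min, "mod": lambda a, b: a % b},
              {"dose": 10, "limit": 8, "weight": 7})
print(result)  # prints 14
```

A translator targeting SQL or XPath would use different substitutions, but the Simple Math source stays the same; that is the whole point of keeping the language small.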

There, now I have the basic idea behind Simple Math written down.  Now to standardize it.  Anyone looking to create a new standard?   I'm going to be speaking at OMG's Healthcare Information Day in Cambridge next week, maybe I can get them to take it on.

NQF Quality Data Model Update June 2012 Comment Period now open!

This appeared in my inbox. NQF is announcing (with very little fanfare) the opening of the public comment period on revisions to the Quality Data Model, from now through mid-July.


   Thank you for your interest in the Quality Data Model.  The National Quality Forum is pleased to announce that the Quality Data Model Update June 2012 and companion Style Guide are available for comment on the National Quality Forum website from June 15th to July 16th.

Click here to be taken to the NQF website and QDM Update June 2012 commenting tool.

IHE Patient Care Device Technical Framework Supplements Published for Public Comment

Integrating the Healthcare Enterprise
IHE Patient Care Device Technical Framework Supplements Published for Public Comment

The IHE Patient Care Device Technical Committee has published the following supplements to the IHE Patient Care Device Technical Framework for public comment in the period from June 14 to July 14, 2012:
  • Asynchronous Data Query (ADQ)
  • Pulse Oximetry Integration (POI)
The documents are available for download at Comments submitted by July 14, 2012 will be considered by the Patient Care Device Technical Committee in developing the trial implementation version of the supplements. Comments can be submitted at