Thursday, December 29, 2011

Is XML Schema Worth it?

In a post yesterday, Wes Rishel asks whether XML Schema is worth it.  He goes on to complain about its failings as a validation technology, and then segues to JSON.

There really are three different problems that Wes is discussing.  The first is whether data can be described in a way that can be easily parsed.  The second is whether the data that has just been parsed is properly structured.  The last is whether the structured communication makes sense:

  1. How easy is this stuff to parse?
  2. Is it structured right?
  3. Does it mean something for my business?

The JSON/XML debate addresses the first question.  JSON is certainly easier to parse than XML, but there are a lot more tools for dealing with parsed XML afterwards than there are for JSON (at least at this point in time; that will change in the future).
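To put that first question in concrete terms, here's a small Python sketch (standard library only) that parses the same record from JSON and from XML.  The record, field names, and values are all made up for illustration; the point is that JSON deserializes into native data structures in one call, while XML gives you a tree you still have to walk:

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record, in both notations.
json_text = '{"patient": {"id": "123", "name": "Smith"}}'
xml_text = '<patient><id>123</id><name>Smith</name></patient>'

# JSON: one call yields native dicts and strings.
patient = json.loads(json_text)["patient"]

# XML: parse into a tree, then walk the children yourself.
root = ET.fromstring(xml_text)
patient_from_xml = {child.tag: child.text for child in root}

assert patient == patient_from_xml
```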

Structure brings us back to Wes's first question: has XML Schema been worth it? I'd have to say that it has been.  From an industry perspective, there really is no denying that XML Schema provides a great deal of value across the IT domain, and without it, there's a lot of stuff we wouldn't have been able to do.  It hasn't been the simplest technology to work with.  But from Schema came web services (of the sort used within enterprises), information models, mappings from databases to XML and vice versa, and a whole host of other cool stuff.  Schema hasn't solved every problem that exists; as a specification language, it has its limits.  XML Schema is limited to creating and parsing structures that can be handled without look-ahead.  It makes it pretty easy to create context-free languages for communicating between systems.

On the JSON front, there was an attempt to create a JSON Schema language, but it seems to have died on the vine.  Unfortunately, the "JSON" crowd sees little need for schema, because JSON, RESTful and all the other Web 2.0 parts are intuitively easy to use and therefore validation isn't needed (that really is an oversimplification of the case).  I could spend a whole post on that topic (and probably will in the near future).

Business Meaning
Unfortunately, many problems require context-sensitive validation, which you cannot get from a context-free grammar production.  Co-occurrence constraints (like this code here implies that kind of code in that structure over there) are examples that introduce context-sensitivity into validation problems.  What XML Schema cannot do by itself can be handled by languages like Schematron, which expresses things like co-occurrence very easily.  Schematron is a very easy-to-use, XPath-based validation language, but it is miserable at other tasks, like creating easily understood structures.  Context-sensitive validation usually addresses things like business rules (this code isn't allowed to be used with that one), otherwise known as "edits" to coders.
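Here's a hedged sketch, in Python rather than Schematron, of what such an "edit" looks like.  The codes ('DIAB', 'HBA1C') and the record shape are entirely hypothetical; the point is that the rule reaches across structures, which is exactly what a context-free grammar cannot express:

```python
# Co-occurrence "edit" (hypothetical codes): if a record carries the
# problem code 'DIAB', then an HbA1c result must also be present.
# Schematron would express this as an XPath assertion.
def check_cooccurrence(record):
    if 'DIAB' in record.get('problems', []):
        return any(obs['code'] == 'HBA1C'
                   for obs in record.get('results', []))
    return True  # rule only fires when the triggering code is present

assert check_cooccurrence({'problems': ['DIAB'],
                           'results': [{'code': 'HBA1C', 'value': 6.5}]})
assert not check_cooccurrence({'problems': ['DIAB'], 'results': []})
```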

In summary
In the end, to deal with these validation problems, it doesn't matter whether your parentheses are shaped like this <> (XML), like this { } (JSON), or even like this ( ) (LISP).  Eventually, you WILL want to validate it (even if it is in JSON).

Tools like XML Schema can aid with that validation, but they don't and never really will solve the entire problem. It doesn't really matter what you do, or what tool you work with, because that last problem, addressing validation at the business level, is not a technology problem.  Business rules are created by people, and they defy logic and software algorithms.

-- Keith

Tuesday, December 27, 2011

Element Order Schema Important Not IS XML

A discussion on the IHE XDS Implementors Google Group spawned this question:

Why is order important in XML Schemas in cases where the order really doesn't matter with respect to data structuring requirements.  After all, the real issue is just that you have some number of child elements of a particular type.  Why should order matter?

There are a couple of answers to this question.

First is simply that the order is important when the schema says it is.  There are cases where the order of a collection of items has meaning.  This usually occurs in narrative.  Language is quite sensitive to order -- at least in "proper" construction, but as my wife often notes in our communication, "Order word important not is."

It does make sense to put the table header element (<thead>) before the body of the table (<tbody>), and it eases processing (it also simplifies table formatting to put the <tfoot> before the <tbody>).  There are also cases where order really doesn't matter.  Compare, for example, dates in the European locale to those in the US locale.  Today can be encoded as either <month>12</month> <day>27</day> <year>2011</year>, or <day>27</day> <month>12</month> <year>2011</year>, or even <year>2011</year> <month>12</month> <day>27</day> without any loss of meaning.
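A quick demonstration, using Python's standard library parser, that those three orderings carry identical information once parsed:

```python
import xml.etree.ElementTree as ET

# The same date, with its child elements in three different orders.
docs = [
    "<date><month>12</month><day>27</day><year>2011</year></date>",
    "<date><day>27</day><month>12</month><year>2011</year></date>",
    "<date><year>2011</year><month>12</month><day>27</day></date>",
]
# Collapse each document into a tag -> text mapping.
parsed = [{child.tag: child.text for child in ET.fromstring(d)} for d in docs]

# All three orderings yield the same information.
assert parsed[0] == parsed[1] == parsed[2]
```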

In SGML DTD's there are three different operators for content models:

  • The comma (,) created lists (xsd:sequence in XML Schema).
  • The vertical bar (|) created choices (xsd:choice in XML Schema).
  • And the ampersand (&) created conjunctions, where all elements needed to be present in any order (xsd:all in XML Schema).

The ampersand operator was not actually supported by the XML DTD content specification.

Why would you make order important when there is no other requirement for it to be so?

Another reason why order is important is that it makes parsing XML easier to do.  Most XML Schema constructs can be parsed quite simply, without any look-ahead, using finite automata.  While the "xsd:all" construct can be readily converted to a data structure that can support parsing, you cannot use a finite automaton indiscriminately.  The number of states needed to support the "xsd:all" construct is on the order of N!, where N is the number of particles in the list of elements allowed. For example, in the date example given above: the first element could be year, month or day.  After that, there are two ways left to choose the next element, and then only one way to choose the last.  See the list below.

  1. <Year>
    1. Y M D
    2. Y D M
  2. <Month>
    1. M Y D
    2. M D Y
  3. <Day>
    1. D Y M
    2. D M Y

XML (and XML Schema) is designed to be parsed and validated without using look-ahead because SGML (its predecessor) had the same constraint.  So parsers that deal with "xsd:all" typically keep a list of the particles, and do the validation that all of them were used no more than once afterwards.
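That particle-list approach can be sketched in a few lines of Python.  This is not a real schema validator, just an illustration of tracking which particles have been seen instead of enumerating all N! orderings as automaton states (the sketch also assumes each particle is required, as xsd:all particles are by default):

```python
# Particles of a hypothetical xsd:all group.
ALL_GROUP = frozenset({'year', 'month', 'day'})

def validate_all_group(child_names, allowed=ALL_GROUP):
    """Accept the children of an xsd:all group in any order."""
    seen = set()
    for name in child_names:
        if name not in allowed or name in seen:
            return False  # not permitted here, or appeared twice
        seen.add(name)
    # Afterwards, check that every particle appeared exactly once.
    return seen == set(allowed)

assert validate_all_group(['day', 'month', 'year'])     # any order is fine
assert not validate_all_group(['year', 'year', 'month'])  # duplicates are not
```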

Even so, it's much simpler to create a parser that doesn't need to worry about this sort of stuff.  This is why the & content model construct does not appear in XML 1.0 DTD content model, and was only reintroduced with XML Schema.

Another reason why order is important has to do with how elements are extended in XML Schema.  A complex type can be defined that extends another complex type by appending elements to the end.  This makes it easy for the parser to figure out what goes where.  Essentially what it does is create an xsd:sequence containing the content model of the base type followed by the content model of the new type.  Which means that sequences extend naturally (because a sequence of two sequences is the same as one sequence with all the particles of the two sequences put together in order), but xsd:all groups do not (because a sequence of two xsd:all groups is not the same as one xsd:all group containing the particles of the two).

Now, a brief note on how to create extensible Schemas.  The trick is to use wild cards.  You will typically have a complex type definition for the content of an item, and that will contain some sort of group (usually a sequence).

<xsd:complexType name="extendableElement">
  <xsd:sequence>
    <xsd:element name="foo" type="fooType"/>
    <xsd:element name="bar" type="barType"/>
    <xsd:any minOccurs='0' maxOccurs='unbounded'/>
  </xsd:sequence>
  <xsd:anyAttribute/>
</xsd:complexType>

What the wildcard at the end does is allow any element to be included at the end of your sequence, or any attribute to be added to the extendable element.  You could include namespace='##other' to say that the element (or attribute) has to be from a namespace other than your schema's target namespace.  This is in fact what I proposed as being the best way to extend HL7 V3 XML these days.

And so, now you know why order is important in most XML Schemas, even when it is not.

Monday, December 26, 2011

The one trend in blogging I really hate

It's the fascination with titles that have numbers in them. It started when that idiot posted something that wound up widely retweeted (and repeatedly blogged upon) that said you should talk about six things that..., or the top five ..., or your favorite three ... I cannot seem to get away from social media articles on Flipboard or Zite that start that way, so I hardly use those tools anymore. Fortunately, there are a lot of other real content providers out there who haven't gotten the message yet, so at least my science and some of my tech feeds have better leads. My one admonition to these headline writers is to please remember that your audience is intelligent, even if the SEO engines are stupid zombies.

Thursday, December 22, 2011

Happy Holidays HL7 V3 Style

<act moodCode='EVN' classCode='CLUSTER' xmlns="urn:org-hl7:v3"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <observation moodCode='GOL' classCode='OBS'>
    <code code='365107007' codeSystem='2.16.840.1.113883.6.96'/>
    <effectiveTime value='20111225'/>
    <value xsi:type='CD' code='286582007' codeSystem='2.16.840.1.113883.6.96'/>
    <repeatNumber value='1'/>
  </observation>
  <observation moodCode='GOL' classCode='OBS'>
    <code code='365107007' codeSystem='2.16.840.1.113883.6.96'/>
    <effectiveTime value='20111225'/>
    <value xsi:type='CD' code='286582007' codeSystem='2.16.840.1.113883.6.96'/>
    <repeatNumber value='2'/>
  </observation>
  <observation moodCode='GOL' classCode='OBS'>
    <code code='365107007' codeSystem='2.16.840.1.113883.6.96'/>
    <effectiveTime value='20120101'/>
    <value xsi:type='CD' code='112080002' codeSystem='2.16.840.1.113883.6.96'/>
  </observation>
  <participant typeCode='PRF'>
    <participantRole>
      <telecom value=''/>
      <person classCode='PSN'></person>
    </participantRole>
  </participant>
  <participant typeCode='SBJ'>
    <participantRole classCode='MBR'>
      <person classCode='PSN'><name>You</name></person>
    </participantRole>
  </participant>
</act>

If you don't get it, check the comments for the answer.

Wednesday, December 21, 2011

Top 10 HealthIT Standards Efforts of 2011

This is my list of the top 10 standards efforts of 2011 that will have an impact on Healthcare IT in the coming years.  Several of these efforts are still works in progress and bear watching as they complete in 2012.  Others should bear fruit in 2012.  I've put them in reverse order by importance as I see it, which is not necessarily how I expect anyone else to view these:

10.  CDA Templates for Imaging
HL7, DICOM and IHE are all working together on developing CDA Templates for documenting imaging procedures.  I don't really know what the initiative is called, but I know it started in DICOM.  Even though I still claim to know nothing about DICOM, it's on my list because I think having a consistent way to document clinical encounters is extremely important.

9. IHE Reconciliation of Problems, Medications and Allergies
This is an IHE Profile for which I was the principal editor.  It describes what a system needs to consider when reconciling information from multiple information sources, and how it can document the fact that information has been reconciled, and from which sources.

8. Clinical Information Modeling Initiative
I wrote a post about this a couple of days ago.  This initiative was first discussed by the HL7 Fresh Look task force.  The idea is that they will be developing a set of detailed clinical information models.  It includes leaders in clinical modeling efforts from ISO, HL7 and openEHR.

7. ONC S&I Framework Laboratory Reporting Initiative
Finally, we'll have a single guide for reporting laboratory results. It's still being reconciled as far as I know (it got over 700 comments, which means that there is a great deal of interest).

6. HL7 Virtual Medical Record (VMR)
The VMR is a concept that has been discussed for over a decade in clinical decision support circles.  Several folks in HL7 have finally gotten together and published a proposal for what it would really look like.

5. ONC S&I Framework Query Health
If you read this blog, you know as much as I do about Query Health.  It's still a work in progress, but could have huge impact on the "learning healthcare system". The basic idea is about how to send the query to the data.

4. Fast Healthcare Interoperability Resources (Formerly: Resources for Health)
Grahame Grieve's response to the HL7 Fresh Look task force is being seriously (if slowly) examined by HL7.  It's definitely a fresh look.

3. HL7 CDA Consolidation Guide
The CDA Consolidation guide is an HL7 project, but it was funded in part by the ONC S&I Framework initiative, and included formal participation from IHE.  The guide is now published (check the bottom of the page at the link above).  You can look forward to a January post from me on how it differs from the current HITSP C32.

2. IHE Cross Enterprise Document Workflow
This is a novel way of addressing ad-hoc workflow, which is endemic in medicine.  It solves the problem of tracking what has been done for a patient.  If you are thinking about forming an ACO, you should take a look at this one, because it supports just the kind of service tracking that ACOs need in order to ensure their patients are receiving quality care.

1. HTML5 and Microdata
These promise not only to revolutionize the next generation of the web, but also healthcare if I get my way.  I want to use HTML5 to represent clinical documents, and microdata to represent the machine readable clinical content.

Tuesday, December 20, 2011

Chief Standards Geek

In Healthcare IT News yesterday was an article on 6 new HIT Positions in 2012.  They missed one that should be on the list, and that is the "Chief Standards Executive".

What is a CSE, and what does he or she do?
  1. Determines what HIT Standards are important to the business, and which ones are not (usually with the help of geeks).
  2. Decides upon appropriate organizational policy with respect to the development of the standard, or use of the standard.
  3. Influences federal and state policy to the degree possible as it relates to HIT Standards.
  4. Assigns appropriate resources to participate in, learn about, and/or implement HIT Standards.
  5. Influences SDO organizations with respect to the development of HIT standards, with respect to appropriateness, industry need, et cetera.
Depending on the organization, this can be a full or part-time position, and can be combined with other roles.  Sometimes a CEO, CIO, CMIO or CMO will take on the role of CSE as part of their duties.

The CSE need not be expert in the Healthcare IT standards per se, but should have access to someone who is.  Their role is not necessarily to develop the standards, but rather to know enough about them to execute on the tasks above.

This needs to be an executive level position because the CSE needs to be someone who can commit an organization to a strategic path, and to assign resources to execute on their commitments.

I'm not a CSE (I'm definitely not an executive); I'm a standards geek.  The CSE needs Standards Geeks to help them with these goals, because most executives aren't geeks.

Who is your CSE? 

Monday, December 19, 2011

ATNA Transport Layer Security for IHE Connectathons

Every year at this time the number of e-mails I see on IHE lists in a single week about TLS increases ten- or even a hundred-fold.  The challenge apparently is in configuring clients and servers to support TLS.  So I thought a brief overview of how to do this was in order.

TLS is an Internet RFC (now at version 1.2) describing how to secure connections (such as HTTP) between clients and servers.  It addresses two key issues:  Authentication and Encryption.

TLS is an enhancement of SSL, originally created by Netscape for its browser.  The protocol has progressed through SSL 2.0 and SSL 3.0 to TLS 1.0, 1.1, and 1.2.

Use of TLS 1.0 is what is specified in the IHE ATNA Profile, and what is most widely available in computer products and platforms.  TLS 1.0 with the necessary encryption protocol (AES 128) is supported on:
  • Windows Version 7 and later products, as well as Windows Server 2003 (SP2) and later.  If you have Windows XP, and are using .Net, you will need to upgrade [and no, Vista is not considered to be an upgrade ;-)]. 
  • Java Release 1.4 and greater.
  • iOS 4 and 5 (I'm told that iOS uses OpenSSL)
The authentication layer is based upon X.509 certificates.  Systems that want to prove that they are who they say they are use the private key associated with their certificate to encrypt data that is known to both parties.  The ability of the receiver to decrypt that data using the public key allows the receiver to be certain that the sender holds the private key (and is therefore highly likely to be the owner of the certificate, because only the owner of the certificate should have that key).

Building certificates and creating stores is a common challenge for users.  Fortunately, for those testing in connectathon, this is often done for you.  When you need to deploy, I've put together some instructions for how to do this using freely available tools from openSSL.

Encryption between the two communicating systems is accomplished by agreeing on an encryption key, which is shared between the client and server in a block of information that is encrypted using a public key encryption method (very secure, but slow).  That encryption key is then used in a faster cipher mechanism to encrypt the communications.

Debugging The Protocol
Microsoft has a great article that describes how the TLS protocol works in detail.  You might also consider reading the IHE ATNA FAQ if you haven't already. The key to getting TLS right for connectathon is correct configuration.  You want to support ONLY TLS Version 1.0, only using AES-128 encryption, and if your system is the server, to require client certificates.  Configuring your system to use other versions of TLS or SSL, or other encryption suites, is a common way to fail.  Another source of failure is incorrect setup of your trust and key stores.
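For what it's worth, here's a hedged sketch of that configuration using Python's ssl module; the helper name and file paths are hypothetical, and you'd point them at your own key and trust stores:

```python
import ssl

def make_atna_context(certfile, keyfile, trust_store):
    """Hypothetical helper: a server context locked down for
    connectathon testing (file paths are placeholders)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Support ONLY TLS 1.0: pin both ends of the version range.
    ctx.minimum_version = ssl.TLSVersion.TLSv1
    ctx.maximum_version = ssl.TLSVersion.TLSv1
    ctx.verify_mode = ssl.CERT_REQUIRED      # require client certificates
    ctx.set_ciphers('AES128-SHA')            # AES-128 with SHA-1 only
    ctx.load_cert_chain(certfile, keyfile)   # this system's certificate + key
    ctx.load_verify_locations(trust_store)   # the certificates we trust
    return ctx
```

Note that a modern OpenSSL build may refuse a TLS 1.0 handshake at its default security level; this reflects the 2011-era connectathon requirements described above, not current best practice.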

First, some important definitions:
Certificate
A certificate is a blob of data that identifies a server or client.  Its structure is defined by the X.509 standard.  It contains some information identifying the system, and includes the public encryption key associated with that system.  A certificate is signed by a signer, which may be the original system (a self-signed certificate), or another entity.
Certificate Chain
A certificate chain refers to the collection of certificates which identifies the certificate for a system, and includes the certificate for the signer of that certificate, and its signer, recursively until you reach a "root" level certificate.  When a certificate is sent from one system to another, it is usually sent with the chain of signers.
Key Store
A key store is a place where a system's certificate and private encryption key are stored.  It is used by clients and servers to locate the certificate used to identify the system, and to obtain the private encryption key to encrypt/decrypt certain messages in the TLS protocol (usually the shared session key).  If your key store doesn't contain your encryption key, the certificates in it won't help you.
Trust Store
A trust store is the place where a system stores the certificates of those systems that it trusts, or the certificates of the signers of certificates that it trusts (and their signers).
Client Hello
The first thing that happens in a TLS communication is that the client sends a client hello message to the server.  This message should be a TLSv1 Client hello, and no other version.  This message contains the highest version of SSL/TLS supported by the client, a seed containing some random data to initialize encryption keys, and a set of flags indicating what encryption algorithms are supported.

If the client hello indicates that the client only supports SSL Version 2.0 or 3.0, you've already got a problem.  It means that the client hasn't been configured to use TLS Version 1.0 (or higher).  If your system is the client, you need to fix it.  If your system is the server, and you are rejecting the communications, you are fine.  If you aren't rejecting it, you might be securing the channel, but you aren't conforming to the IHE ATNA requirement of using TLS 1.0.

Now, because the IHE ATNA profile specifies that the channel encryption must support TLS 1.0, you have to demonstrate that support.  That doesn't mean that you cannot support a higher level of the protocol.  So, the client hello message could indicate that the client supports TLS 1.1 or 1.2.  A properly coded TLS 1.0 implementation would degrade gracefully to TLS 1.0 upon receipt of a higher version client hello; but not every implementation is properly constructed.  That means that when you configure for connectathon testing, you really should limit your system to TLS 1.0 as a client or as a server.  In the production environment, if you want to bump it up to TLS 1.1 or 1.2, that's fine, but in the testing environment, you want to ensure maximum opportunity for success.  So, configure your system for TLS 1.0 (and only that).

Encryption Algorithms
If the client hello doesn't indicate that it supports AES 128 bit encryption with SHA-1 hash, you also have a problem, because it means that the client does NOT support the IHE ATNA required encryption method.  ATNA requires the use of AES-128 with SHA-1.  If your system supports other encryption methods, that's great, but for connectathon, turn on AES-128, and turn everything else off.  Systems supporting SSL and TLS negotiate the cipher by having the client indicate what it supports, and the server choosing from that list.  If the only choice you offer is the IHE mandated cipher suite, and the server rejects your communication because it doesn't like that cipher, it's the server's problem.

Server Hello
Here's where the more complex failures show up.  The server hello includes a response that indicates which version of SSL or TLS has actually been negotiated.  It is supposed to be the highest version of the protocol supported by both the client and server, assuming that when a system supports version X, it also supports any version prior to X (this is not always a valid assumption).  In some secure socket stacks, when you turn on SSL V2, SSL V3, and TLS, and the client asks for TLSv1, the server might incorrectly choose SSL V3 or lower.  That's an implementation error in the stack.  The way to control this behavior is to turn everything but TLSv1 off.  And when I say that, I also mean turn off TLSv1.1 and TLSv1.2.  Note, these days you do have to be careful, because asking for TLS means version 1.0, version 1.1 and version 1.2.  Back when IHE started with this, there was only TLS version 1.0.

The next thing that the server does is return its certificate.  The client system is going to need to verify this certificate, and when it does, it can do so in a couple of different ways, and can add steps to the validation. There are two validation choices (the client has to be able to support both):

  1. It can simply compare the certificate to certificates it knows and likes.  
  2. It can compare the signature on the certificate to signers that it approves of (all the way up the certificate chain).

Hostname Verification
There are extra verification steps that can be added, one of these being hostname verification.  In common implementations, the receiver of a certificate also verifies that the identity asserted in the certificate matches the identity that the receiver knows the sender by.  This is known as hostname verification.  You'll find a line somewhere in each certificate that reads:  CN=hostname (where hostname is the name of the system identified by the certificate).   During hostname verification, the receiver of the certificate compares this value against the DNS name it used to access the host.  You will need to disable hostname verification.

Why? My first response is because the profile says to do it that way.  My second is to explain that hostname verification does not provide additional security, and introduces an additional failure mode during the setup of the secure communication channel (which is also in the profile).  If you want a detailed explanation of why, you'll have to ask the ITI folks, who are much more versed in security than I am.
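In Python's ssl module, for example, turning off hostname verification while still validating the certificate chain looks like this (the trust store path is hypothetical):

```python
import ssl

# Client-side sketch: validate the server's certificate chain against
# our trust store, but skip hostname verification as ATNA specifies.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.check_hostname = False            # disable hostname verification
ctx.verify_mode = ssl.CERT_REQUIRED   # chain validation still happens
# ctx.load_verify_locations('trust-store.pem')  # hypothetical trust store
```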

Certificate Request
Next the server should ask for a certificate.  When you configure TLS on the server, you can configure it to require a certificate, request but not require a certificate, or omit that step altogether. You want to require that the client present a certificate. If you don't set it up that way, then you won't be certain that the clients that connect to you are those that are authorized, and you can fail to detect clients which aren't sending back an appropriate certificate.

When the server asks for a certificate, it identifies which certificates (or certificate signers) it will accept.  This information is configured via the server's trust store.  The trust store contains the certificates (or certificate signers) that the server trusts.  There are sometimes (implementation) limits on the number of certificates that a server will send in the certificate request (which is another reason why trust chains are important).   If you don't have your trust store appropriately configured, the client won't be able to find a matching certificate to send back to the server. If you have too many certificates in your trust store (e.g., more than 128), your server might also not work with some clients (those whose certificates don't get sent).

If the server response to the client hello doesn't include a certificate request, it hasn't been configured properly to require a certificate.  If the certificates it sends in its request do not include either:
  • The certificate your client is using, OR
  • The certificate of the signer of one of the certificates in your client's certificate chain
Then either you are using the wrong certificate in your key store, or the server is not configured with the correct trust store. 

Client Certificate
The client then sends the appropriate certificate back to the server.  One last point of failure at this stage can be when the certificate (or one in the chain) has expired, which sometimes isn't detected until the server attempts to validate it. That simply means you'll need a new certificate.

Hopefully this will help you debug your TLS communications.

Friday, December 16, 2011

Clinical Information Modeling Initiative

There's been a lot of discussion about the Clinical Information Modeling Initiative (CIMI) on the web recently.  You can find several posts from Heather Leslie (@omowizard) on CIMI at her blog site.  Wes Rishel (@wrishel) also wrote about an announcement the group made recently (as did Heather), and offers some advice to HL7.  Sam Heard, chair of openEHR also writes about it recently from the openEHR perspective.  It has also come up on the last couple of HIT Standards Committee meetings, and John Halamka has talked about it a couple of times on his blog in November and December.

I've heard CIMI mentioned by some as the "HL7 buster", by others as being extremely "disruptive" to the existing SDO structures.  Others are afraid that CIMI could replace all of the work that has been done (especially in the US) with CCD and CDA Consolidation, and that it would "completely revamp" Meaningful Use.  I don't think any of those outcomes will happen.  My reasons have to do with what CIMI is doing, and the way that HL7 is structured.  Before I get into that, let's take a look at some of the history.

HL7 and openEHR have been dancing around each other for years.  The two organizations are in many ways competitors for mindshare in the Healthcare Standards space.  Some of their standards overlap, but in other places they are quite complementary.  In fact, during the development of CDA Release 2.0, there was a great deal of harmonization between the HL7 RIM and the openEHR models so that CDA Release 2.0 could be treated as an openEHR "EHR Extract".

Members of HL7, openEHR and ISO have all been working on something called detailed clinical models (that's the HL7/ISO parlance; in openEHR, you'd call it an archetype) for the past few years (or longer in the openEHR case).  Some folks (like Heather and Hugh Leslie) have been participating across any and all organizations where this work occurs, while others have been tightly focused within smaller circles.  A detailed clinical model addresses, in some ways like a LOINC panel, the information needed to model clinical concepts in all their glory.

DCMs address (in detail), all the information needed to capture something like a blood pressure measurement, as Wes references in his post.  The level of detail can be variable in implementation, but not in the model design.  The model can get to the type of device used to make the measurement, the position of the patient, et cetera. This is the level of detail that is needed to move the interoperability needle forward past the first big chunk, and is also necessary to advance the science of medicine.  I like to think about the model of a problem or allergy, rather than the model for a blood pressure.  In allergies, you have the notion of the allergy, the allergen, reactions, severity of the reaction, incidents of manifestation, et cetera.  All of this is representable today in an HL7 RIM-based model, but what is missing globally is a formal, clinically validated, detailed model of the information, and that is what CIMI hopes to provide.

Large organizations like HL7, which have been established in a particular field, often get "stuck" in a particular kind of thinking, which is hard to break away from.  If you've read any of Clayton Christensen's books on innovation (The Innovator's Dilemma, or The Innovator's Prescription), you probably can understand a few good reasons why.  Shortly after I started as an HL7 Board member, the board created the "Fresh Look Task Force", and appointed Stan Huff as chair.  The function of the Fresh Look task force was to see how HL7 could partner with, and foster innovation from, outside the HL7 working group.

At the HL7 Australia Working Group meeting, the dance between HL7 and openEHR moved into a new stage, no longer just exhibition and competition, but more like cautious courtship.  There was a great deal of discussion going on in the Care Provision work group meetings about detailed clinical models.  Stan presented the idea of CIMI to the Fresh Look task force at the HL7 January meeting.  It was agreed by the task force that while CIMI was a good idea, it probably needed to occur outside of HL7 circles, and Stan and others present at that January meeting took it to the next stage with CIMI.

I have to laugh a little at Wes Rishel's advice to HL7, because of the history that I just mentioned.  It's actually pretty good advice for individual members, but as far as the organization goes, it's sort of like telling a mother not to hold back her child (BTW: HL7 is just one of many organizations that can claim some level of intellectual parentage here).  I expect that HL7 will embrace the efforts, especially when you consider the number of HL7 members already engaged.

I haven't been one of the International experts participating in the CIMI efforts for the same reason that I haven't been participating in the DCM discussions in HL7.

  1. The work is more clinical than technical, and I'm not a clinician.
  2. The modeling that they are doing is compatible with (and can be layered above) existing information models (including the RIM) that I'm already familiar with.

Recently, the HL7 board began deliberation about how it might make HL7 domain analysis models available to the CIMI group, so that it could take advantage of existing work.  The outgoing chair, Bob Dolin, is very supportive of these activities, and raised this issue for consideration on the last board call.

I don't think HL7 has anything to fear from CIMI.  Nor do I believe that the outputs of CIMI will render obsolete the existing work in HL7, or already in Meaningful Use Stage 1.  Instead, CIMI will advance the industry a big step forward.  There will be a need to coordinate and "harmonize" between the CIMI models and HL7 standards, like CDA.  For that, I expect other approaches, such as the one I espoused for CDA R3 using HTML + Microdata, would enable HL7 to harmonize quite readily with CIMI.

My one and only concern about CIMI is that the extensive modeling that IHE has already done on perinatal care should be considered.  I don't have the bandwidth (or clinical expertise, as I mentioned previously) to take that on, but I do hope that some of my IHE colleagues do.


IHE European Connectathon Registration Open

IHE Europe is pleased to inform you that registration for the 2012 IHE European Connectathon is open. The Connectathon will take place May 21-25 in Bern, Switzerland. The IHE Connectathon is the healthcare IT industry's largest face-to-face interoperability testing event. IHE profiles enable effective interoperability in a wide range of clinical settings, including health information exchanges. Registration closes January 15, 2012. Registration and further information >

IHE North America Connectathon Conference

The IHE North America Connectathon Conference is an ideal opportunity for users and developers of health IT systems to learn about and discuss achieving interoperability and making more effective use of electronic health records. The conference is sponsored by IHE USA and will be held Wednesday, Jan. 11, 2012, in association with the IHE North America Connectathon at the Hyatt Regency in Chicago. Registration and meeting information >

IHE Europe Unveils New Testing Services

IHE-Europe has launched the first vendor-neutral service to test the interoperability of health IT systems for health systems and electronic health record projects. IHE-Services will extend the tools and techniques developed for the annual IHE European Connectathon to provide interoperability testing for national e-health projects, HICT users in regional hospitals, or vendors developing new platforms. More >

IHE International Gains ISO Liaison Status

The International Organization for Standardization (ISO) has granted IHE “Category A” Liaison status with the ISO Health Informatics Technical Committee (ISO/TC 215). This status allows IHE to participate in the collaborative development and publishing of standards through ISO. More >

Thursday, December 15, 2011

A summary of my Series of HQMF blog posts for QueryHealth

I've written more than a dozen posts on HQMF and Query Health, and will probably write more. I find myself struggling to remember which post is which, so here is a list of the posts along with a short description of each one.  You can also find a link to this page in my favorites.

1.     SIFramework Face to Face update on QueryHealth
In which I explain why declarative is the way to go for Query Health
2.     Declarative vs. Procedural and Query Health
Some early explorations of HQMF
3.     Value Sets and QueryHealth
Using IHE SVS to address access to value sets.
4.     Putting together the pieces for Query Health
Linking the query criteria to the data model in HQMF
5.     When the XML Sucks
A short side note on how we used to make code easier to read that might address "Greening" in the future.
6.     HL7 HQMF proof of concept with hQuery
In which I show how to transform HQMF XML into hQuery JavaScript using XSLT
7.     Implementing QueryHealth on a CCD Document Collection
In which I show how to transform HQMF XML into XQuery that would operate over a collection of CCD Documents -- again, using XSLT.
8.     Models for Query Health
In which I address the issue of data models and the need for a Simple model.
9.     SQL  implementation for QueryHealth
In which I demonstrate transformation of HQMF XML into SQL (again using XSLT -- are we seeing a pattern here).
10.  Implementing min,  max, first and last
In which I show how to deal with aggregate and ordinal query criteria.
11.  Summary computations and HQMF
Where I deal with averaging and counting. 
12.  Classifying results in HQMF
How to add classifiers or stratification groups to a query.
13.  Greening HQMF
Simplifying the HQMF XML using Business Names and XML restructuring
14.  Loading I2B2 from CDA documents
Say you've got an I2B2 installation and a collection of CDA (CCD) documents.  How would you populate the I2B2 clinical repository?  This post explains it. 
15.  Visualizing criteria in HQMF documents
Sometimes a picture paints a thousand lines of XML much more succinctly.  This post demonstrates how to draw a graph of the query criteria in SVG.
16.  Some principles for creating HQMF
Implementation guidance will be needed for using HQMF with Query Health.  Here are some questions that it must address.

17. Demonstrating Queries for Healthcare
      In which I discuss how queries from I2B2 can be converted to HQMF at a high level.

With all of this writing already done, it seems like there might be another book in play. I'll have to think about that.

Comments due to ONC today on NwHIN HIE Specifications

This showed up in my inbox today.  If you have implemented the NwHIN Exchange Specifications in the US, or XCA, XDS or XDR Internationally, the Office of the National Coordinator would like to hear from you, and the deadline is today.  Answer as many of their questions as you can.  The key questions are also listed below.  Please help if you can.  I'm certain this input will impact Health Information Exchange in the US.

Do you want to shape the official recommendations made to ONC about the readiness of Exchange specifications to be adopted nationally?

Well - here is your chance!  It is imperative that the HIT Standards Committee hears from you.  There is only 1 day left to comment.

Please take a minute and respond to a few short questions by 5 pm ET on 12/15 by:

 - Posting comments here, under HITSC input on Exchange Specifications; OR

 - E-mail your responses to: - with “NwHIN Power Team” in the subject line.

I cannot emphasize enough how important it is for you to comment. 

Key questions to address (please respond to some or all) - 1-4, 10, 11 and 13.

The key questions are posted here for your ease of reference:

1.      Please identify yourself, your organization, and your position within the organization.

2.      When did your organization implement the Exchange specifications?

3.      Why did your organization implement the Exchange specifications?   Are the functional capabilities that Exchange provides adequate for your current and expected information-exchange purposes?

4.      Which Exchange specifications did you implement?

10.  Did you implement these specifications as prescribed, or did you make some adjustments for your environment?  If the latter, what adjustments did you make at the time of initial implementation or have you made since?  Were these adjustments made through bilateral agreements or did they apply to all participants in your exchange?

11.  How easy or difficult were the Exchange specifications to understand, interpret, and implement?   Compared to other service-oriented implementations you’ve been involved with, was Exchange easier, harder, or about the same level of complexity?

13.  How many hours of technical time did the project entail before reaching full interoperability?

Thank you!

Wednesday, December 14, 2011

Integrating Schematron Rules and Clinical Terminology Services

On one of the discussions on the Structured Documents Work Group's CCD mailing list, a member notes that Schematron validation such as that found in the NIST Validator doesn't support validation of content against restricted value sets well.  He's right in that Schematron doesn't include an explicit mechanism designed to perform validation against value sets.  But there is a way to integrate a clinical terminology service into a Schematron rule set, using the XSLT document() function and the Schematron <let> statement along with appropriately structured rules.  The IHE SVS profile was designed to support this sort of validation, as I mentioned in a previous post.

For validation, there are two practical classifications of value sets: those that can be enumerated fully in a single XML document, and those which cannot practically be enumerated.  The dividing line is based on the available system memory for the validator.  I would expect that value sets containing hundreds of elements could practically be enumerated in full, but those containing thousands or more terms would not be practically enumerable.  Note that I do not address issues of "static" vs. "dynamic" value sets here, because the value sets can be enumerated dynamically through a web service call.

The mechanism for validating the smaller value sets in Schematron is to create a variable that contains the content of an XML document.  This is done at the top of the Schematron using the <let> statement:

<sch:let name='ValueSet' value='document("")'/>

Later in the Schematron rule set, you'd have a rule context where you would use that variable as follows:

<sch:rule context='cda:manufacturedMaterial/cda:code'>
  <sch:assert test='$ValueSet//svs:Concept[@code = current()/@code]'>
   ... report error if concept is not found ...
  </sch:assert>
</sch:rule>

This idea was used in CDA Implementation guide Schematrons as far back as 2005, in the Schematron use for the Care Record Summary release 1.0.  In that Schematron, an external file was used (so the URL would have been file:voc.xml), rather than an HTTP URL.

Now, when the value set is large (such as a list of LOINC lab results, SNOMED CT problems, or RxNORM drugs), it's not practical to enumerate every term, because the resulting document would be very large.  In these cases, you could enhance the SVS-defined web service to support a code parameter.  When this parameter is present, the service would return the entry for the single code when it exists in that value set, or an empty ConceptList if it doesn't.  The rule context remains the same in this case, but the assertion changes:

  <sch:assert test='document(concat("", @code))//svs:Concept'>
   ... report error if concept is not found ...
  </sch:assert>

In this example, the HTTP Web Service request is dynamically created.  If it returns an empty code list, the rule fails, but if it finds the code, the rule succeeds.

So, while Schematron itself does not support "validation against value sets", appropriate integration with a Clinical Terminology Service and a very simple RESTful API does enable it.  I hope that the NIST Validator takes advantage of this approach in the future.
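To illustrate the same two-pronged approach outside of Schematron, here is a small Python sketch. The SVS response fragment below is abbreviated, and make_code_checker and its lookup parameter are names I've made up for illustration; the hypothetical code-parameter service described above would be plugged in as the lookup callable.

```python
import xml.etree.ElementTree as ET

SVS_NS = "urn:ihe:iti:svs:2008"  # namespace used in IHE SVS RetrieveValueSet responses

def load_value_set(svs_xml):
    """Parse an SVS-style response and return the set of codes it enumerates."""
    root = ET.fromstring(svs_xml)
    return {c.get("code") for c in root.iter("{%s}Concept" % SVS_NS)}

def make_code_checker(svs_xml=None, lookup=None):
    """Return a predicate that tests value-set membership.

    For small value sets, pass the full enumeration as svs_xml (cached in memory).
    For large ones, pass lookup: a callable that queries the hypothetical
    code-parameter extension of the SVS service, returning True when the
    service finds the code in the value set.
    """
    if lookup is not None:
        return lookup
    codes = load_value_set(svs_xml)
    return lambda code: code in codes

# A tiny enumerated value set, standing in for a RetrieveValueSet response:
sample = """<RetrieveValueSetResponse xmlns="urn:ihe:iti:svs:2008">
  <ValueSet id="1.2.3" displayName="Example">
    <ConceptList>
      <Concept code="73211009" displayName="Diabetes mellitus"/>
      <Concept code="38341003" displayName="Hypertension"/>
    </ConceptList>
  </ValueSet>
</RetrieveValueSetResponse>"""

in_value_set = make_code_checker(sample)
print(in_value_set("73211009"))  # True
print(in_value_set("12345"))     # False
```

The same predicate shape works for both classifications, which is exactly why the Schematron rule context above doesn't need to change between the enumerated and the dynamic-lookup cases.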

Tuesday, December 13, 2011

Usability != Safety

One of the work groups I co-chair has been addressing issues of usability lately.  Another work group in that same organization deals with issues of patient safety.  We often collaborate closely, but the two areas are simply not the same.  It's a bit frustrating, because I so often see the two combined, even though they often have competing interests.

Here is a very simple example illustrating what I mean:  Nearly a decade ago I worked on a product (no longer sold) that performed some analysis of transcribed text that would ease transitions to electronic medical records.  From a patient safety perspective, the analyzed results would need to be clinically verified.  From a usability perspective, many end users simply wanted to skip the clinical verification step because it took them a few extra clicks.  The tension between "efficiency" and "patient safety" in this example is pretty clear.

Many issues of patient safety can very clearly be linked back to "usability design issues", but certainly not all of them.  I've encountered (as a patient), other issues of patient safety that weren't at all related to product usability (in fact, it was product usability that made it possible for me to address the issue as the patient).  Thus, usability != safety.

There are skills taught in the software industry that are generally applicable to usability, and which can be applied to electronic health records.  There are also design and engineering skills, generally applicable to product safety (and not just medical products), that are also commonly taught and can likewise be applied.  My brother-in-law works on nuclear power plants; he has skills in risk identification, assessment, and mitigation that could just as easily be applied to aircraft manufacture or medical devices.  These are separate, but often related, skill sets.

Whenever you make a device that impacts lives, both skill-sets need to be applied, and the tensions between them need to be balanced appropriately.  But to do that, you first have to realize that they are two separate things.  Certainly they have overlapping areas of concern, and they highly influence each other.

Conflating the two is a mistake.  It fails to recognize that they can sometimes be in tension, and it is often in those very cases where the most attention needs to be paid.

Monday, December 12, 2011

Coming soon ...

Well OK, maybe a little bit later.  As I've been predicting, it seems that the Stage 2 Certification proposed rule will show up in mid-February.

What do I expect to see?  Here's my short list of predictions:

  1. HL7 CDA Consolidation (but no Green CDA) to replace CCD (it includes a CCD 1.1), and to add other document formats, and no CCR for Stage 2.
  2. SNOMED, LOINC and RxNORM for vocabularies, and my guess based on this post, is that ICD won't appear. 
  3. Direct and NwHIN Exchange for Health Information Exchange
  4. HL7 Version 2.5.1 for Reportable Labs (LRI), Syndromic Surveillance (ISDS) and Immunization Reporting (CDC)
  5. eRX will continue to allow both HL7 V2, and NCPDP, but may advance the standard versions.
We'll take a look and see how well I did when the NPRM comes out.

   - Keith

Utilizing the Standards & Interoperability Framework

On Monday, December 12, 2011 at 1 p.m. EST, NeHC University will welcome Jitin Asnaani, Coordinator of the Standards & Interoperability Framework (S&I) at ONC, to deliver updates on the progress of the multiple S&I Framework initiatives, which focus on the creation of reusable tools and services to overcome specific barriers to achieving full interoperability. Stakeholders will learn how ONC is working to improve clinical transitions, streamline vocabularies, harmonize current implementation guides, and establish scalable standards and data models for digital certificates and provider directories.

We invite you to join this NeHC University Update on the S&I Framework Initiatives.

Friday, December 9, 2011

Role Reversal

He walked over to where the man was pointing at the screen, looked, and asked to see the data.  The two of them reviewed the blood pressure results for the last month, looking at the overall pattern and individual measurement results.  They compared this month's average on a beta-blocker against the prior month's without it.  Scrolling through the months, they could clearly see there were no statistically significant differences between the two.  Then they looked at the heart rate data.  That showed a clear, nearly 10 bpm average drop, so the drug was doing something.  Next the two of them looked at weight data for the last 30 days.  It was November, and the patient definitely followed the "Thanksgiving" pattern of gaining several pounds.  Looking back to Halloween, they could even see a bit of a gaining trend starting there.

It was pretty clear what was needed.  The patient had to drop the few pounds gained over the holidays.

This wasn't fiction, it was real.  I was at the doctor's office yesterday, and the guy with the computer was me.  Next year, I'll be able to send him that data so he can look at it ahead of time.  Have I said recently that I love my doctor?

Thursday, December 8, 2011

The Experience of Decision Support

Imagine if you would, a world class clinical decision support system.  How would it work?  What would you see?  Now imagine how that system would be integrated into an Electronic Health Record.  How does it work?  What does it do differently?

Now that you've imagined it, let me tell you about the trap that you just fell into.  You've just imagined that CDS and EHR are separate components that can be integrated together.  In reality, a world class clinical decision support system and world class electronic health record would be indistinguishable from each other.  You wouldn't know where one ended and the other began.  It's no longer about decision making and record keeping.

We usually think about CDS answering questions posed to it based on clinician interactions ... checking to ensure that the medication ordered won't interact badly with anything else that the patient is taking, for example, or picking the best antibiotic at the current point in time.  The real way to think about it is much more clever, but also extremely difficult to work out.

What the system really needs to do is work collaboratively with the healthcare provider in ways that a) make his or her work more efficient, and b) address the myriad details and minutiae of medicine so that the provider can focus more on the patient in front of them.  It is the provider's experience with the system that we need to pay attention to.  It's not about doing the physician's work, but augmenting their abilities -- putting the right data in front of them, asking the right questions, and providing critical analysis across large volumes of data, all at the right time and in an easy-to-understand presentation.

It's a (workflow) optimization problem and a user interface problem.  It's also a trust problem, because until the provider trusts a system to work with him or her, they won't ever use it to its fullest capability.  There may even be something about the way we train doctors to use technology that may need to change before we'll ever be able to develop such a system.  It's worth thinking about.

Wednesday, December 7, 2011

Some principles for creating an HQMF

One of the things mentioned at the Query Health face to face last week was that we were going to need some implementation guidance on how to create an HQMF that works with Query Health.  These are some of the issues that I have to wrestle with right now:
  1. Most queries will need to identify patients in a particular age group, what is the best way to represent that in the HQMF?  There are at least four different "technically accurate" options for representing this:
    1. Assume that the age of a patient will be represented as a (computed) observation in the system performing the query.  This seems to be the simplest, as it allows the age groups to be defined in criteria in a fairly easily readable manner.
    2. Represent birth as an observation, with the effective time of the observation giving the DOB, and representing the age range as a pause quantity between the time of the measure period and the birth observation.  This is challenging because it requires comparison of two separate observations, the measure period and the birth observation, using a variety of HL7 Vocabulary terms to reflect the difference (as a pause quantity).
    3. Represent date of birth (as opposed to birth) as an observation, with the value of the observation being the date of birth, and representing the age range as a pause quantity between the time of the measure period and the value of the date of birth observation.
    4. Represent the age range as a date of birth range, and specify that range as an interval on the subject's birth date in the query criteria.  This is very awkward, especially given that the Person class in HL7 V3 is not designed to be "queried by example" using the criteria indicator.  That is to say, you cannot specify an interval on birth date or date of death.
  2. Other demographics may also be of interest, including gender, race, ethnicity, marital status, religious affiliation, place of birth and place of residence.  There are at least two options for many of these:
    1. Representing demographic criteria as observations.  This allows the typical things to be done on the observation, and supports specification of value ranges for geographic criteria.  We can certainly find LOINC and SNOMED codes to represent the various concepts found on Person  to address the kinds of queries that need to be made.
    2. Representation on the appropriate model elements for the subject, again with challenges similar to date of birth/death, in that Person is not designed to support query by example in the V3 model.  For the coded criteria, it can easily bend this way, but dealing with geographic designations (postal code, city, state, country and county) also gets challenging.
  3. Time dependency.  All of the "expressions" I've dealt with have to do with relationship of different acts in time.  There are a couple of ways to handle these expressions:
    1. Using expressions in effectiveTime, activityTime or other act time intervals.  While the XML is simpler to read, the resulting expressions are difficult to interpret and transform to an implementation technology.
    2. Using pause quantity and the HL7 specific value set TemporallyPertains which relates acts by their temporal relationships.  While the pause quantity is clear on the time spent between two acts, what isn't clear is to what this applies.  Is it the clinically effective time (effectiveTime), or the activity time (activityTime), or some other time.  This also leads to complex XML expressions.
  4. Model Relationships.  There needs to be some way to relate data criteria to a stratospheric view of a data model.  
    1. I used the instance of (INST) relationship and very bare details to represent that an act was an instance of a certain type in my prototypes.
    2. The NQF Measures adopt the LOINC codes for information categories used in the CCD and similar guides to identify what elements are being considered, and the detailed act was represented using the Component of (COMP) relationship.
  5. Data Criteria or Data Elements.  While it's described as the data criteria section in the HQMF specification, some implementations have simply used it to identify the data elements, and applied further criteria in the Patient Population Section.  This is reflected in the NQF Measure 59 where the Encounter data element is used in two different places in the patient population section to represent two different things.
    1. I like fully specified criteria in the data criteria section.  That way I can distinguish between different data elements of interest.
    2. Other measure implementations have simply identified data elements.
  6. Counting things.  The MAT-generated NQF measures use a rather complex way to express the idea that there have been two encounters for a patient in the measure period.  There are at least four ways to express this:
    1. Use summary criteria to count the number of events meeting a particular criterion.  See this post for details.
    2. Use repeatNumber in the encounter criteria to indicate the specific encounter.  This is a simplification of the previous method.
    3. Use temporal criteria to relate an event to another event.  To identify the second event, you could ask for an event meeting particular criteria within the measurement period that starts after the start of another event that meets that same criteria.  This is OK for second, not great for third, and uglier and uglier after that.  It's a unary counting method.
    4. Use specific subset codes to identify FIRST, SECOND, et cetera.  In this case, you could simply ask for the "SECOND" encounter meeting particular criteria.  I don't like this at all because it isn't generalizable beyond the values provided in the vocabulary, and it's pretty clear that there are an infinite number of ordinal positions.
So far, my preferred solutions are the first option of each of the alternatives listed above, but I haven't considered all the details. Somewhere, these sorts of details will need to be addressed.  IHE QRPH is working on some guidance for population measures, and I will be bringing these issues up there as well as in other places.
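To make my preferred options above a bit more concrete, here's a rough Python sketch of evaluating an age-range criterion as a computed observation (option 1 under age groups), and of a summary criterion for counting encounters (option 1 under counting things). The function names and bare-bones data model are my own invention for illustration; nothing here is defined by HQMF itself.

```python
from datetime import date

def age_on(dob, as_of):
    """Age in whole years as of a given date -- the 'computed observation'
    approach: the querying system derives age from the date of birth."""
    years = as_of.year - dob.year
    # Back off a year if the birthday hasn't yet occurred in the as_of year.
    if (as_of.month, as_of.day) < (dob.month, dob.day):
        years -= 1
    return years

def age_criterion(low, high):
    """Build a predicate for an age-range data criterion, evaluated at the
    start of the measure period."""
    return lambda dob, measure_start: low <= age_on(dob, measure_start) <= high

def at_least_n_encounters(encounter_dates, n, period_start, period_end):
    """Summary-criterion style counting: count events inside the measure
    period and require at least n of them."""
    inside = [d for d in encounter_dates if period_start <= d <= period_end]
    return len(inside) >= n

# Patients aged 18-64 at the start of a 2011 measure period:
crit = age_criterion(18, 64)
print(crit(date(1970, 6, 1), date(2011, 1, 1)))   # True  (age 40)
print(crit(date(2000, 6, 1), date(2011, 1, 1)))   # False (age 10)

# At least two encounters during calendar 2011:
visits = [date(2011, 2, 1), date(2011, 5, 1)]
print(at_least_n_encounters(visits, 2, date(2011, 1, 1), date(2011, 12, 31)))  # True
```

Note how the summary-count form generalizes to any n, which is exactly why I prefer it over the unary temporal chaining or the FIRST/SECOND subset codes.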

Dec-9-2011: Updated the title and added content about Counting Things

IHE Laboratory Technical Framework Supplement Published for Public Comment

Crossed my desk this morning...

IHE Community,
IHE Laboratory Technical Framework Supplement Published for Public Comment

The IHE Laboratory Technical Committee has published the following supplement to the IHE Laboratory Technical Framework for Public Comment in the period from December 6, 2011 to January 5, 2012:
  • Laboratory Analytical Workflow (LAW)
The document is available for download at Comments submitted by January 5, 2012 will be considered by the Laboratory Technical Committee in developing the trial implementation version of the supplement.  Comments should be submitted at

Tuesday, December 6, 2011

IHE North American Connectathon Conference

IHE North American Connectathon Conference 2012
January 11, 2012 in Chicago, IL. Registration is now open.

Join a preeminent cadre of interoperability, information standards, and health information exchange experts for a one-day educational and networking event at the Integrating the Healthcare Enterprise (IHE) North American Connectathon Conference, January 11, 2012 at the Hyatt Regency in downtown Chicago, IL. Register online today!

The IHE Connectathon Conference is a cornerstone of the annual interoperability testing event -- the IHE North American Connectathon. This year IHE's global testing event will hit record-breaking participation! Over 120 participating organizations will test 150+ systems, advancing the healthcare IT industry and patient safety. Attendees at the Connectathon Conference will be given special access to the testing floor and a guided tour of the event.

IHE USA is proud to announce an exciting and dynamic array of speakers and educational sessions for this year’s Conference. Please join us for this important event and register online today. Review the list of educational sessions, plus Conference dates and location listed below.

Connectathon Educational Sessions & Speakers
· Opening Keynote - Delivering High-value Health Care through Regional Health Information Exchange
  Eric Heflin, Chief Technology Officer, Texas Health Services Authority
· Leveraging IHE XDS to Achieve Health Information Exchange - Real World Implementations
  Holly Miller MD, MBA, FHIMSS, CMO, Med Allies
  Jim Younkin, IT Program Director, Geisinger Health System, KeyHIE
· Current Advancements in Medical Device Integration
  Elliot B. Sloane, PhD, CCE, FHIMSS, Professor and Director of Health Systems Engineering at Drexel University School of Biomedical Engineering
· Exploring Open Source Tools to Achieve Interoperability - Panel Discussion
  James St. Clair, CISM, PMP, SSGB, Senior Director, Interoperability and Standards, HIMSS
  Rob Kolodner MD, EVP, CIO, Open Health Tools
  Ken Rubin, Object Management Group
· The Next Revolution in Standards-Based Image Sharing
  David Mendelson MD, FACR, Chief of Clinical Informatics, Mount Sinai Medical Center
· IHE North American Connectathon Introduction and Guided Tours
Conference Dates & Logistics
The IHE N.A. Connectathon Conference is open to the public. We encourage IHE members to invite interested organizations and individuals that want to learn more about IHE and interoperability to register online.

Conference Date:                            Wednesday, January 11, 2012
Educational Sessions:                     9:00 – 4:30pm CT
Cocktail Reception:                         4:30 – 6:00pm CT
Registration Fee:                             $195.00
Register Online:                                Visit our registration website.  

Meeting Location & Hotel Accommodations:
Hyatt Regency - Chicago, IL.
151 East Wacker Drive
Chicago, IL 60601
Hotel Reservations: Click here.

If you have additional questions, please contact or visit IHE USA’s website for more information.