Friday, July 29, 2016

Round tripping identifiers from CDA to FHIR and back

I think I solved this problem for myself last year or so, or else the answer wouldn't be so readily available in my brain.

The challenge is this: You have an identifier in a CDA document.  You need to convert it to FHIR. And then you need to get it back into CDA appropriately.

There are three separate cases to consider if we ignore degenerate identifiers where nullFlavor is non-null.
  1. <id root='OID'/>
  2. <id root='UUID'/>
  3. <id root='OIDorUUID' extension='value'/>
For cases 1 and 2, the FHIR identifier will have system = urn:ietf:rfc:3986, and the value will be urn:oid:OID or urn:uuid:UUID.
For case 3, it gets a tiny bit messy.  You need a lookup table that maps a small set of OIDs to URLs for FHIR-defined identifier systems.  If the OID you have matches one of the FHIR OIDs in that registry, use the specified URL as the system.  If it doesn't, convert the OID to its urn:oid form; if it is a UUID, simply convert it to its urn:uuid form.  In either case, the FHIR value is whatever was in the extension attribute.
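That forward algorithm is small enough to sketch in Python (a sketch only: FHIR_SYSTEM_MAP stands in for the real FHIR identifier system registry, with just two illustrative entries):

```python
import re

# Stand-in for the FHIR identifier system registry; the real registry maps
# a small set of well-known OIDs to FHIR-defined identifier system URLs.
FHIR_SYSTEM_MAP = {
    "2.16.840.1.113883.4.1": "http://hl7.org/fhir/sid/us-ssn",  # US SSN
    "2.16.840.1.113883.4.6": "http://hl7.org/fhir/sid/us-npi",  # US NPI
}

UUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")

def to_urn(root):
    """Convert a bare OID or UUID to its urn:oid: or urn:uuid: form."""
    return ("urn:uuid:" if UUID_RE.match(root) else "urn:oid:") + root

def cda_to_fhir(root, extension=None):
    """Map a CDA <id root='...' extension='...'/> to a FHIR (system, value)."""
    if extension is None:
        # Cases 1 and 2: root-only identifiers become RFC 3986 URIs.
        return ("urn:ietf:rfc:3986", to_urn(root))
    # Case 3: use the FHIR-defined system URL when the OID is in the
    # registry; otherwise fall back to the urn:oid / urn:uuid form.
    return (FHIR_SYSTEM_MAP.get(root, to_urn(root)), extension)
```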

Going backwards:
If system is urn:ietf:rfc:3986, then it must have been in root-only format, and the value is the OID or UUID in urn: format.  Simply convert back to the unprefixed OID or UUID value, and stuff that into the root attribute.

Otherwise, if system is a URL that is not in urn:oid or urn:uuid format, look it up in the FHIR identifier system registry, reverse it to an OID, and put that into root.  If system is in urn:oid or urn:uuid format, just convert it back to the unprefixed OID or UUID value and stuff that into root.  In either case, the extension attribute should contain whatever is in value.
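Going backwards can be sketched the same way (again, OID_FOR_SYSTEM is a two-entry stand-in for the reverse of the registry):

```python
# Stand-in for the reverse of the FHIR identifier system registry.
OID_FOR_SYSTEM = {
    "http://hl7.org/fhir/sid/us-ssn": "2.16.840.1.113883.4.1",
    "http://hl7.org/fhir/sid/us-npi": "2.16.840.1.113883.4.6",
}

def strip_urn(value):
    """Remove a urn:oid: or urn:uuid: prefix, leaving the bare OID or UUID."""
    for prefix in ("urn:oid:", "urn:uuid:"):
        if value.startswith(prefix):
            return value[len(prefix):]
    return value

def fhir_to_cda(system, value):
    """Map a FHIR (system, value) pair back to CDA (root, extension)."""
    if system == "urn:ietf:rfc:3986":
        # Root-only case: the value is the urn: form of the OID or UUID.
        return (strip_urn(value), None)
    if system.startswith(("urn:oid:", "urn:uuid:")):
        # The system itself is an OID or UUID in urn: form.
        return (strip_urn(system), value)
    # A FHIR-defined system URL: reverse it through the registry.
    return (OID_FOR_SYSTEM[system], value)
```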

So now then, you might ask, how do I represent a FHIR identifier that is NOT one of these puppies in HL7 CDA format.  In other words, I have a native FHIR identifier, and CDA had nothing to do with generating it.  So, there's a system and a value, but no real way to tell CDA how to deal with it.  To do that, we need a common convention or a defined standard.

So, pick an OID to define the convention, and a syntax to use in value to represent system and value when system cannot be mapped to an OID or UUID based on the above convention.  In this manner you can represent a FHIR identifier in CDA without loss of fidelity because CDA does not provide any limits on value.  Oh, and modify the algorithm above to handle that special OID in case four.
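Just to show the shape such a convention might take (everything here is hypothetical: the wrapper OID is a made-up placeholder, and the space-separated syntax is just one possibility, workable because a URI cannot contain an unencoded space):

```python
# Hypothetical: HL7 has not registered this OID or defined this syntax.
WRAPPER_OID = "9.9.9.9.9"  # placeholder for an OID HL7 would register

def wrap_fhir_identifier(system, value):
    """Pack an unmappable FHIR identifier into one CDA id element."""
    return (WRAPPER_OID, system + " " + value)

def unwrap_fhir_identifier(root, extension):
    """The fourth case of the reverse algorithm: recover system and value."""
    if root != WRAPPER_OID:
        raise ValueError("not a wrapped FHIR identifier")
    system, value = extension.split(" ", 1)
    return (system, value)
```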

I'll let HL7 define the standard, select the OID, and specify the syntax.  I have better things to do with $100 than register an OID for this purpose.  But clearly, it could be done.


Tuesday, July 26, 2016

Don't ask them to tell me what I should already know

This particular issue shows up quite a bit in standards based exchange, and frankly drives me a bit crazy.  Somewhere in the message someone asks for several correlated pieces of information to be communicated.  A perfect example of this is in the HITSP C32 Medication template.  We had to provide an RxNORM code for the medication, a code for the class of medication, and a code for the dose form within the template.  We also had to provide places to put the medication brand and generic names. Folks insisted that all of this information was useful, and therefore must be communicated in the exchange.

However, we used RxNORM codes to describe the medication.  Have you ever looked at the content in the RxNORM Vocabulary?  If I gave you the RxCUI (code) of 993837 in a communication, here's what RxNORM will tell you about it.

Within the terminology model, I can give you the precise medication ingredients and doses within each tablet, tell you that it is in tablet form intended for oral use, and identify the brand name and generic form.  Now, what were you going to do with all of that other information you asked me to code?

Having some redundant information is helpful because it helps you spot errors.  If the code is 993837 and the reported medication is something other than Tylenol #3 or Acetaminophen 300 mg / Codeine 30 mg, then there is a problem.  So, it is helpful to have SOME redundancy.  But when all those other codes are also present, the sending system needs as much knowledge as is already in RxNORM to produce that information, and we just lost some (if not most) of the benefits of using a vocabulary in exchanging the information.

There's so much redundancy in the coded and fielded information in the HITSP C32 Medication template as to be ridiculous (and while I argued against it, I did not succeed).  The RxNORM code is all you need, plus, just to be sure that the sender knows what it is talking about, one of either the brand name or the clinical name of the drug.  Everything else after that is redundancy UNLESS you can identify a specific use case for it.
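The useful kind of redundancy check is cheap to implement.  Here's a sketch (the lookup table is a two-entry toy standing in for an RxNORM query, and the display strings are illustrative, not authoritative RxNORM names):

```python
# Toy stand-in for an RxNORM lookup; a real system would query the RxNORM
# API or a local copy of the vocabulary for the names tied to an RxCUI.
RXNORM_NAMES = {
    "993837": {
        "tylenol with codeine no. 3",
        "acetaminophen 300 mg / codeine 30 mg oral tablet",
    },
}

def check_medication(rxcui, reported_name):
    """Flag a likely data error when the reported name disagrees with the code."""
    known = RXNORM_NAMES.get(rxcui)
    if known is None:
        return "unknown code"
    return "ok" if reported_name.lower() in known else "mismatch"
```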

In an information exchange, you should pay attention to exchanges that duplicate already existing knowledge about real things in the world, especially when knowledge bases such as RxNORM exist.  The need to exchange world knowledge between systems exists when the receiver of the communication cannot be expected to be readily aware of all of that world knowledge.  If I ask you to get rid of the dandelions in my yard, it doesn't really help a whole lot to tell you to get rid of the yellow dandelions, unless I have some very specialized varieties of dandelions, or I've been watering them with food dye.

If you are expecting someone to transmit information that can be inferred from world knowledge, ask yourself if that is truly necessary.  You should always include enough redundancy to enable a receiver to ensure that the sender knows what it is talking about, but don't include so much that a receiver would be overwhelmed, or the sender would basically be duplicating the content of a knowledge source.  After all, we have reference works and reference vocabularies so that we can look things up.


Tuesday, July 19, 2016

Do you have the vision to use HealthIT standards?

One of the challenges of meaningful use is in how it has diverted attention from other uses of standards that could have value.  Use of the Unstructured Document template provides one such example.  Unstructured document specifications, either IHE Scanned Documents (XDS-SD) or the C-CDA Unstructured Document, support exchange of electronic text that is accessible to the human reader (CDA Level 1), but not in structured form (CDA Level 3).  A common use case for this kind of text is in dictated notes, something we still see an awful lot of, even in this nearly post-Meaningful Use era.

Some even incorrectly believe that one cannot use these specifications because they are "forbidden" by meaningful use.  While users of these specifications cannot claim that use towards qualification for meaningful use, that program is NOT specifying what they can or cannot do elsewhere to improve care.  And again, while use of these specifications does not count towards Certification of EHR systems under the 2015 Certification Edition criteria, certification is the lower bar.  You can always do more.

There are a number of benefits for using unstructured documents in exchanges where structured detail is not present.  One of these is simply to make the text available to systems that can apply Natural Language Processing technology to the text to produce structured information.  I've worked on two different systems that were capable of doing that with high reliability, and there are commercial systems available to do this today.  This sort of use can be applied to clinical care, or to so-called secondary uses (often related to population-based care or research).

Provider organizations won't ask for this specification by name; rather, they will ask for the capabilities that it supports.  This has always been the problem of standards.  Meaningful Use, MIPS and MACRA eliminate that problem by making systems directly responsible for implementing standards, and making providers care about that by affecting their pocketbooks when they use systems that do not.

The challenge for good systems developers and product managers is in applying standards like these to system design, without having their customers asking for it directly.  That takes vision.  Do you have the vision to use standards that you aren't required to?  

Friday, July 15, 2016

A FHIR-Resistant Gauntlet

Someone asked me for my opinion on adopting STU3 resources in an environment that presently supports STU2.  At first this seems to be quite challenging.  But then, as I thought about it, the following idea occurred to me:

It would be a simple matter of engineering to take an STU3 StructureDefinition, and re-write it as an STU2 StructureDefinition that is a profile on an STU2 Basic resource. Such a structure definition would be ideally suited for transfer to an STU3 environment when it is available, but would work in an existing STU2 environment today.

It eliminates one of my objections to pre-adoption of new resources, uses the Basic resource in a way it is intended to be used (to prototype new stuff), and provides a useful way to test new stuff in existing environments.

I don't have the time personally to write such a tool, but would love to see someone take up this gauntlet I just threw down.

-- Keith

P.S.  Such a tool could also support changed resources, if the tooling was smart enough to understand certain kinds of changes.  It could create extensions for new fields added, restrict fields that are removed, and ignore those for which there were simple name changes (detectable perhaps through a combination of w5, V2 and V3 mappings).
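The core rewriting step might look something like this (grossly simplified: real StructureDefinitions carry types, cardinality, bindings, and much more, and the profile base URL here is a made-up placeholder):

```python
def stu3_to_stu2_profile(stu3_def):
    """Rewrite a (highly simplified) STU3 StructureDefinition as an STU2
    profile on Basic, turning each element into an extension slice.

    stu3_def is a toy dict like {"name": "Task", "elements": ["status"]};
    a production tool would walk the full element tree and map types,
    cardinality, and bindings as well.
    """
    base_url = "http://example.org/fhir/StructureDefinition"  # placeholder
    elements = [{"path": "Basic"}]
    for name in stu3_def["elements"]:
        # Each STU3 element becomes an extension on the Basic resource.
        elements.append({
            "path": "Basic.extension",
            "name": name,
            "type": [{"code": "Extension",
                      "profile": [base_url + "/" + stu3_def["name"] + "-" + name]}],
        })
    return {
        "resourceType": "StructureDefinition",
        "name": stu3_def["name"],
        "constrainedType": "Basic",
        "base": "http://hl7.org/fhir/StructureDefinition/Basic",
        "snapshot": {"element": elements},
    }
```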

Monday, July 11, 2016

Building software that builds software enforces quality

There's an interesting discussion over on the AMIA Implementation list server about software quality. As is often the case in many of my posts, it intersected with something I'm currently working on, which is a code generator to automate FHIR Mapping.  Basically I'm building software that builds software.

You find out a lot of things when you do that.  First of all, it is difficult to do well.  Secondly, it's very hard to create subtle bugs.  When you break something, you break it big time, and it's usually quite obvious.  The most common result is that the produced software doesn't even compile.  The reason for this has to do with the volume of code produced.  A software generator often creates hundreds or thousands of times more software than went into it.  And even though it is difficult to do well, when you do it that way, you can produce 10, 20 or even 100 times the code that you would otherwise, with extremely high quality.

Software generators are like factories, only better, in this way.  If a component on the assembly line is bad (a nut, a bolt, et cetera), that results in a minor defect.  But ALL of the materials produced by a software generator, with very small exception, are created by automation.  And a single failure anywhere in the production line halts assembly.  You get not one failure, but thousands.  The rare one-offs that you do get can almost always be attributed to bad inputs or a rare cosmic radiation event.  Most of the time we are dealing with electrons and copies of electrons; we never have just one bad bolt.  We have thousands of copies of that bad bolt.

Another interesting thing happens when you use code generators, especially if you are as anal retentive as I am about eliminating unnecessary code.  You will often find that the code you are generating is exactly the same with the exception of a few parameters which can often be determined at generation time.  When this happens, what you should do is base the class you are generating on a common base class, and move that code to the base class, with the template parameters that you specify.  This is a great way to reduce code bloat.  Rather than having fifteen methods that all do the same thing, the superclass method can do the common stuff, and the specialized stuff can be moved to specialization methods in the derived class.  Template parameters can help here as well.

In the example I'm working on, my individual resource providers all implement the basic getResourceById() method of a HAPI IResourceProvider in the same way, with specialized bits delegated to the derived class.  That's 35 lines of code that I don't have to duplicate across some 48 different FHIR resources.  I really only have to test it once to verify that it does what it is supposed to do in 48 different places.  If I was writing that same code 48 times, I guarantee that I'd do it differently at least once (if I didn't go crazy first).  No sane engineer would ever write the same code 48 times, so nobody using code generation should make the software do it that way either.
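My actual code is Java against HAPI, but the shape of the pattern is easy to show (illustrative Python, not the HAPI API):

```python
class BaseResourceProvider:
    """The common lookup logic, written and tested once instead of 48 times."""

    # Generated subclasses supply only the resource type name.
    resource_type = None

    def __init__(self, store):
        self.store = store  # any mapping of id -> resource dict

    def get_resource_by_id(self, resource_id):
        resource = self.store.get(resource_id)
        if resource is None or resource.get("resourceType") != self.resource_type:
            raise KeyError(self.resource_type + "/" + resource_id + " not found")
        return resource

# Two of the 48 generated specializations: each is just a type parameter.
class PatientProvider(BaseResourceProvider):
    resource_type = "Patient"

class ObservationProvider(BaseResourceProvider):
    resource_type = "Observation"
```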

I once worked with a developer who generated a half million lines of code in this fashion.  In over two years of testing, implementation and deployment, his code generated a grand total of 5 defects.  For him, we could translate the traditional defects/KLOC metric to defects/MLOC, and he still outperformed the next best developer by an order of magnitude.

That, my friends, is software engineering.


Friday, July 8, 2016

Thoughts on the HIT100

If you've been following Health IT for any length of time on Twitter, you are probably familiar with the now annual HIT100 "competition".  I've done pretty well over the years in being recognized by my fellow tweeters.  In 2011 I came in second, tenth in 2012 and 2013.  In 2014 I never saw the official results, but unofficially I was somewhere in 42nd to 44th place.  In 2015, the HIT99 was announced, and I showed up in 22nd place.  This year I'm probably in the top 50, but haven't really paid much attention to it, except for the nominations in my stream.

I liked the HIT100 when it first came out, because it truly did introduce me to new faces in Health IT. Over the years, it's become less relevant, somewhat steeped in contention between competing parties, and clearly more of a popularity contest at the top levels.  BUT, it's still a good list of people to pay attention to.

What it has stopped being is a place where I can identify interesting people to watch that I'm not already paying attention to.  I know most of the people on these lists.

Michael Planchart (@theEHRGuy) has done us all a great service in starting the HIT100, and regardless of what I've heard about motivations, I also thank John Lynn (@techguy) for ensuring that something like it continued in 2015.  And, I'm glad Michael's back in the saddle for 2016.  I'm staying out of the middle of any debate as to the virtues or vices associated with the 2014 and 2015 contests.

I look forward to seeing the results.  I know some of those who have been trolling for nominations, and like I've previously said, the standings at the top ten don't matter much to me.  I hope the event continues, and with less angst than in prior years.

But I'd also like to see a new event celebrating people we wouldn't otherwise hear about or notice, perhaps because they have something valuable to say, but don't know how to say it so that we can all hear it, or perhaps because they are new in the field and we just haven't noticed them yet.  I'm not going to start it (at least any time soon), and would love to see someone take this idea up and run with it.


P.S. For all of you who nominated me this year, I am extremely grateful.  There have been some lovely GIFs in those nominations, which I share below, along with some truly satisfying feedback.  I'm trying to get back into blogging (it's been months since I posted two days in a row!) as I finish up my degree and get steady in my new role(s).

Thursday, July 7, 2016

The Value of Standards Maintenance

We recently looked at two fairly simple issues on the DSTU Tracker on CCDA 2.1 in Structured Documents.  I thought I'd take the opportunity to break down the cost of this sort of effort.

To establish costs, we first need to establish the cost for one of the people involved.  Some of the effort in maintenance is discovery and diagnosis within an organization of an issue that needs to be raised up to the maintaining organization.  For US developers, I use an estimate of about $75/hr in costs, which averages over a range of salaries from junior to senior developer.  For senior staff (architects, analysts and others) involved in HL7 standards calls, you can double or triple that, because these staff are much more senior.  I'd call it about $200/hr for that group.

On today's call, we addressed two simple issues.  Each of these issues likely involved about 4 hours of development time: 3 to analyze and assess, and 1 to convey the resulting problem to the organization's HL7 representative.  That representative has about two hours of preparation effort associated with each issue: 1 hour to learn and assess the issue with their developer, and another hour of research into the issue, resulting in the development of a change proposal.  Each issue took about 15 minutes to discuss, assess, and agree to a change on the call, with 24 people on the call ($4800/hr!).

4.00 * $75   = $300
2.00 * $200  = $400
0.25 * $4800 = $1,200
Total per issue = $1,900

So, for the two issues we addressed today, the approximate cost was almost $4,000, and these were SIMPLE ones.  More complex ones take much longer, often two or three committee calls, sometimes with the committee spending as much as two hours on the issue before coming to a resolution, and in many cases, between committee reviews, there's a smaller workgroup assessing the issue and proposing solutions (call it five people), which usually spends 2 or more hours on the topic ($2,000 or more).

So, a single issue has a cost borne by industry volunteers ranging from $1,900 to as much as $15,000.

Consolidated CDA has 255 comments against it today, each of which has been addressed at some point by the working group.  The value of this maintenance ranges from $0.5 to $3.8 million!

To come to a more precise estimate, we have to estimate how many of these issues fall into the easy category, how many fall into the hard category, and how many are somewhere in between.  I'd estimate about 10% of issues are in the difficult category, a bit more than half in the easy category, and the rest somewhere in the middle (which I'd estimate costs about $6K).  It works out to about $1.2M for maintenance of CCDA 2.1.
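Working the blend through (using the per-issue figures above):

```python
# Blended estimate of maintenance cost across the 255 tracker comments.
EASY, MEDIUM, HARD = 1_900, 6_000, 15_000  # per-issue costs from above
N_COMMENTS = 255

# Assumed mix: about 10% hard, a bit more than half easy, the rest medium.
per_issue = 0.10 * HARD + 0.55 * EASY + 0.35 * MEDIUM
total = per_issue * N_COMMENTS

print("blended cost per issue: $%.0f" % per_issue)
print("total maintenance value: $%.1fM" % (total / 1e6))
```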