
Thursday, July 19, 2018

Sweat the Small Stuff

Small things can sometimes make a big difference.  The difference between an adequate piece of equipment and an excellent one can be huge.  Sometimes the things that you need to change aren't your equipment, but rather yourself.  That's more than just money, it's time and effort.  It's hard.  It's annoying.

The way that small things can make a big difference is when they provide a small but consistent improvement in what you do.  Keyboards for example.  Today I had to switch back to a crappy backup keyboard because the 5 and 7* keys on my Unicomp keyboard died.  I can already feel the change in my typing speed.  More than half my work is typing, and the better keyboard is the difference between 70 WPM and 75 WPM.  That's a 6.667% difference in speed.  It's small, I can live with it for a short period of time.

What will using the cheaper keyboard cost me?  Well, I don't spend all my typing time at top speed, so really, it only impacts 50% of my work. But, for that 50%, that's the most productive time I have, because the other time is meetings and overhead.  So now I'm losing not just 6.667% of my time, I'm actually missing it out of my most productive activity.

Amortized over a year, that's a whole month of my productive time that I somehow have to make up for.  There goes vacation.  All for lack of a minor improvement.  I'll probably get the Unicomp repaired (it's a great keyboard when it works), but I've got a better one on order with Cherry MX blue switches.  They have a similar feel to the spring-switches in the Unicomp IBM-style switches and are the current "state-of-the-art" for typists as best I can tell.  And if it breaks, I can replace the dang switch, which I cannot do on the Unicomp without about two-three hours of effort.

A colleague and I were talking about how making personal changes can help you in your efforts.  His point was that many cyclists spend hundreds (or even more) to reduce the weight of their bicycles by a few more ounces to get greater hill-climbing speed.  He noted that losing a few pounds of personal weight can have a much greater impact (I'm down nearly 35 lbs since January, but my bike has never had a problem with hill-climbing, so I wouldn't know about that).

Learning to touch type was something I succeeded (if you can call a D success) in doing in high school, but never actually applied (why I got the D) until someone showed me that I was already doing it (but only when I wasn't thinking about it).  After discovering that, over the next six months, I went from being a two finger typist to four, and then to eight, and then to ten.  That simple physical skill has a huge impact on my productivity.

I now make it a point, when I learn a new application, to understand how to operate it completely without removing my fingers from the keyboard.  And I train myself to operate the applications I use most often that way, because it makes a small difference that adds up.  It's an almost meaninglessly small thing that greatly improves my productivity.  Yeah, I ****ing hated it when Microsoft changed the keyboard bindings in Office (and I still remap to some that I have long familiarity with), but I spent the time to learn the new ones.  It ****ed me off for six months, but afterwards it paid off.

Here's where this starts to come into play in Health IT.  We KNOW that there are efficient and inefficient workflows.  We KNOW that changing workflows is really going to yank people's chains.  How do we get people who want to keep doing things the way they always have to make even small changes?  And more importantly, what is going to happen to those non-digital-natives who have to adapt to an increasingly digital world when their up-and-coming colleagues start having more influence?

When we get rushed, we let the small stuff slip.  It's a little bit more time, a little bit more effort.  And the reward is great and immediate, we get more done.  But the small stuff has value.  It's there to keep us from making mistakes.  Check in your code before the end of the day ... but I'll have to take a later train ... and now your hard drive is dead tomorrow, and you have to redo the day's work.  Which would you rather have?

Sweat it.  It's worth the effort.

So, what small thing are you going to change?  And what big difference will it make?


* 7 is pretty darn common in hash tags I use, and in e-mails I write.  That's pretty dang frustrating.

Wednesday, June 20, 2018

Add, Replace or Change

Most of the interoperability problems we have today can be readily solved.  All we have to do is replace the systems we already have in place with newer better technology.  Does that sound like a familiar strategy?

If you've been involved in Meaningful Use or EHR certification activities over the last decade, then it certainly should.  The introduction of CDA into the interoperability portfolio of vendor systems has created a whole host of new capabilities.  We added something new.

HL7 Version 3 attempted to replace HL7 Version 2 and it basically was a flop for most of its use cases, in large part because of the tremendous investments in existing infrastructure that would have to be replaced, and which in large part met some viable percentage of the capabilities that the end users were willing to live with and WEREN'T willing to spend the funds to replace with a more capable (yet complex and expensive) solution.

CCDA was a very effective change to the existing CCD and IHE specifications, and incorporated more or less modest changes.  It may have been more than most wanted, but was at least little enough to retain or enhance existing technology without wholesale replacement.

FHIR is a whole new paradigm.  It adds capabilities we didn't have before.  It replaces things that are more expensive with things that can be implemented much more easily and cheaply.  And it changes some things that can still be adapted.  For example, an XDS Registry and Repository infrastructure can be quickly modified to support FHIR (as I showed a few years back by building an MHD (IHE Mobile Access to Health Documents) bridge to the NIST XDS Registry and Repository reference platform).

The key to all of these improvements is to ensure that, whatever you are adding, replacing or changing, the costs to the customer (or to your own development) are acceptable to them (or you) and to the rest of the stakeholders, and that the change can be adopted in an appropriate time frame.  FHIR has succeeded by taking an incremental approach.

The birth of FHIR was almost seven years ago.  FHIR at 6 and 7/8ths years old (because young things care about halves and quarters and eighths) is doing quite well for itself. In that time, it has come a very long way, and very fast.  Version 3 never had it so good.  The closest standard I can think of that had anything close to this adoption curve was XML, and that took 2 years from initial draft to formal publication (FHIR took 3 to get to its first DSTU), and I expect widespread industry adoption of the final form (Release 4) to be well inside 2 years.  Whereas it took XML at least 3 and some would say more (although its industry base was much larger).

So, as you think about how we should be improving interoperability, are you Adding something new, Changing something that already exists, or Replacing something?  Answer that question, and then answer for yourself the question of how that is going to impact adoption.

Wednesday, June 13, 2018

Why AI should be called Artificial Intuition*

This post got started because of this tweet:
The referenced article really isn't about AI, rather it's about an inexplicable algorithm, but a lot of "AI" fits into that category, and so is an appropriate starting point. Intelligence isn't just about getting the right answer. It's about knowing how we get to that answer, and being able to explain how you got there. If you can come up with the right answer, but cannot explain why, it's not intelligent behavior.  It might be trained behavior, or instinctive or even intuitive behavior, but it's not "intelligent".

What's been done with most "AI" (and I include machine learning in this category) is to develop an algorithm that can make decisions, perhaps (most often in fact) with some level of training and usually a lot of data.  We may even know how the algorithm itself works, but I wouldn't really call it intelligence until the system that implements the algorithm can sufficiently explain how its decision was reached for any given decision instance.  And to say that it reached that decision because these vectors were set to these values (the most common form of training output) isn't a sufficient explanation.  The system HAS to be able to explain the reasoning, and for it to be useful for us, that reasoning has to be something we (humans) can understand.

Otherwise, the results are simple mathematics without explanation.  Let me tell you a story to explain why this is important:

A lifetime ago (at least as my daughter would measure it), the company I worked for at the time obtained a piece of software that was the life's work of a physician and his assistant.  It was basically a black box that had a bunch of data associated with it that supported ICD-9-CM coding of data.  We were never able to successfully build a product from it, even though we WERE able to show that it was as accurate as human coders at the same task.  In part, I believe that it was because it couldn't show coders (or their managers) HOW it came to the coding conclusions that it got to, and because that information wasn't provided, it failed to be able to argue for the correctness of its conclusions (nor could it be easily trained to change its behavior).  It wasn't intelligent at all, it was just a trained robot.

Until systems can explain how they reach a conclusion AND be taught to reach better ones, I find it hard to call them intelligent.  Until then, the best we have is intuitive automata.

For what it's worth, humans operate a lot on gut feel, and I get that, and I also understand that a lot of that is based on experiential learning that we aren't even aware of.  But at the very least, humans can argue for the justification of their decision.  Until you can explain your reasoning to a lesser intelligence (or your manager for that matter), you don't really understand it.  Or as Dick Feynman put it: "I couldn't reduce it to the freshman level. That means we don't really understand it."


P.S. The difference between artificial and human intelligence is that we know how AI works but cannot explain the answers it gets, whereas we don't know how human intelligence works but humans can usually explain how they arrived at their answers.

* Other Proposed Acronyms for AI

  1. Automated Intuition
  2. Algorithms Inexplicable
  3. Add yours below ...

Monday, June 11, 2018

Why we'll never have Interoperability

I don't know how many times I've said this in the past.  Interoperability is NOT a switch, it's a dial.  There are levels and degrees.  We keep hearing that Interoperability doesn't exist in part because every time someone looks at it, where the dial is set doesn't meet provider expectations.

Few people working today remember the early command line interface for accessing mail (that link is for the second generation command line).  Most people today use some form of GUI-based client, many available over the web.

To address this in this post, I'm going to create a classification system for Levels of Interoperability.
Level | Name | Description
0 | Absent | We don't even know what data is needed to exchange to solve the user's problem.  We may know there's a problem, but that's about as far as we can get.
1 | Aspirational | We have an idea about what data is needed, and possibly even a model of how that data would be exchanged.  This is the stage where early interoperability use cases are often proposed.
2 | Defined | We've defined the data exchange to the point that it can be exchanged between two systems.  A specification exists, and we can test conformance of an implementation.  This is the stage that most interoperability in EHR systems achieves after going through some form of certification testing.
3 | Implementable | An instructional implementation guide exists that describes how to do it.  This is more than just a specification.  It tells people not just what should appear where, but also gives some guidance about how to do it, some best practices, some things to consider, et cetera.  This is the stage that is reached when a specification has been widely implemented in the industry and you can find stuff about it on sites like "Stack trace".
4 | Available | This is the stage in which most end-users see it.  Just because some feature has been implemented doesn't mean everyone has it.  We've got self-driving cars.  Is there one in your driveway?  No.  Self-driving cars are not available, even though several companies have "implemented" them.  The same is often true with Interoperability features.  Many vendors have implemented 2015 Certified EHRs, but not all providers have those versions deployed yet.
5 | Useful | This is the stage at which users would rather use the feature than not, and see value in it.  There's a lot of "interoperability" that solves problems that just a few people care about, and creates a lot more work for other people.  If it creates more work, it's likely not reached the useful stage.  Interoperability that eliminates effort is more useful.  There are some automated solutions supporting problem, allergy and medication reconciliation that are starting to reach the "useful" stage.  A good test to see whether an interoperable solution has reached this stage is to determine how much the end-user needs to know about it.  The less they need to know, the more likely it's at this stage.
11 | Delightful | At this stage, interoperability becomes invisible.  It works reliably, end users don't need to know anything special about it, et cetera.  The interesting thing about this stage is that by the time a product has reached it, people will usually be thinking two or three steps beyond it, and will forget what the thing they already have does for them.

The level of interoperability is often measured differently depending on who is looking at it and through what lens.  The CFO looks at costs and cost-savings associated with interoperability.  Is it free? Does it save them money?  If not, they aren't likely to be delighted by it.  The CIO will judge it based on the amount of work it creates or eliminates for their staff as well as the direct and indirect costs it imposes or reduces.  The CMO will be more interested in understanding whether it's allowed them to reach other goals, and will judge by different criteria.  And the end-user will want their job to be easier (at least with regard to the uninteresting parts), and to have more time with patients.

By the time you reach "delightful" (often much before) you get to start all over again with refinement.  Consider the journey we've been on in healthcare with the various versions and flavors of HL7 standards.  HL7 V1 was never more than aspirational.  V2 was certainly defined, though the new features in its various sub-releases also went through their own cycles.  Some features in HL7 V2 even got to the level of delightful for some class of users (lab and ADT interfaces just work, most providers don't even know they are there).  By the time the industry reaches perfection, users and industry are already looking for the next big improvement.

Do we have electronic mail? Yes.  Is it perfect yet? No.  Will it ever be?  Not in my lifetime.  We'll never have perfect interoperability, because as soon as we do, the bar will change.

Friday, June 8, 2018

Resolved: Prelogin error with Microsoft SQLServer and JDBC Connections on Named Instances of SQL Server

Just as I get my head above water, some other problem seems to crop up.  Most recently I encountered a problem connecting to a SQL Server database that I've been using for the past 3 years without difficulty. 

We thought this might be related to a problem with the VM Server, which was itself misbehaving.  In fact, after a restart of the VM Server, I was able to access a different vendor's database product on a different server that had also been causing me some grief, but I still couldn't access SQL Server.

Here were the symptoms:
  • Connections worked locally on the Server.
  • Connections worked inside the Firewall.
  • Connections that were tunneled through some form of VPN weren't working at all (with two different kinds of VPN).
I was pretty well able to diagnose the problem as being firewall related, but there are at least four firewalls between me and that server, and I only have access to two of them, and unfortunately could find no information in the firewall logs to help.  Wireshark might have been helpful except for some reason I couldn't get it to show my network traffic on port 1433, which I knew SQL Server used.

If you google "prelogin error" you'll see a ton of not quite helpful stuff because for the most part, nobody seems to get to the root causes of my particular problem, but I finally managed to do so.

Here's what I discovered:

My firewall was blocking port 1434, which is the port that the SQL Server Browser service uses to enable named instances to find the right SQL Server service to connect to.  But even after opening that port, things were still not working right, the connection was failing with a "prelogin error".

One of the posts on the Interwebs pointed me to a Microsoft diagnostic tool used to verify SQL Server Communications.  The output of that tool contained something like the following:

Sending SQL Server query to UDP port 1434...

Server's response:

ServerName SERVER1
InstanceName SQL_SERVER_1
IsClustered No
tcp 1433

ServerName SERVER1
InstanceName SQL_SERVER_2
IsClustered No
tcp 59999

What this told me was that the server I wanted to access was listening on a port other than 1433.  And of course, that port was blocked (which explains why Wireshark wasn't helping me, because I was looking just at port 1433 traffic).  Setting up a firewall rule to allow access to any port used by the SQL Server service resolved the issue (since I couldn't be sure the dynamically assigned port would be used again the next time the server was restarted).
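As a rough illustration, a listing like the one above can be turned into an instance-to-port lookup.  This is only a sketch: it assumes the tool's output looks like the excerpt shown (two-word key/value lines, with each instance block ending in its tcp line), and the function name parse_instances is made up for this example.

```python
def parse_instances(report: str) -> dict:
    """Map each InstanceName to its TCP port from browser-style output."""
    ports = {}
    instance = None
    for line in report.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines and longer prose lines
        key, value = parts
        if key == "InstanceName":
            instance = value
        elif key == "tcp" and instance is not None:
            ports[instance] = int(value)
            instance = None
    return ports


# Sample text copied from the excerpt above.
SAMPLE = """\
Server's response:

ServerName SERVER1
InstanceName SQL_SERVER_1
IsClustered No
tcp 1433

ServerName SERVER1
InstanceName SQL_SERVER_2
IsClustered No
tcp 59999
"""

ports = parse_instances(SAMPLE)
print(ports)
```

A lookup like this makes it obvious which instances sit on 1433 and which do not, though a dynamically assigned port may change the next time the server restarts.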

I think part of the reason that nobody has been able to clearly state a solution is because if I'd been trying to connect to SQL_SERVER_1, the already existing rule I had for port 1433 would have been JUST fine, and I wouldn't have needed another rule.  And so the published solutions worked for maybe half the users, but not others.  And some published solutions suggested multiple different ways to configure Firewall rules, some of which would have worked some of the time, and others (like mine) would work all of the time.

I realize this has nothing to do with standards, but at least half of those of you who read this blog have had your own run-ins with SQL Server.

Now, you might scratch your head and wonder how this worked before, and what happened to the Firewall rules that enabled it to work.  For that I have a simple answer.  We had recently rebuilt the VM so that it had more resources to do some larger tests, and so the system was redeployed under the same name to a new operating system environment.  And my challenge happened to overlap a) the redeployment, and b) the VM having to have been rebooted.

Root cause analysis is a royal PITA, but having invested quite a few hours with it for this problem, I'll never have it for more than a few minutes again, and now hopefully, you won't either.


Thursday, June 7, 2018

Building Interfaces with FHIR

Almost three years ago I created some tools to extract trigger events from HL7 V3, and then made a list of HL7 Version 2 and 3 trigger events.  I came up with over 1000 of these.  FHIR also supports messaging and has other mechanisms to trigger activity (e.g., Subscription) that can be used to trigger actions.

Use Case

Recently I've been looking at a way to automate data collection for 340B reporting, which requires collection of information about the patient, prescriber and medication from MedicationOrder resources.  This program, like many others, requires a small subset of the FHIR Resource data in order to manage the reporting operations.  Like many other programs, the implementers would very much like it if they don't have to wade through hundreds of lines of JSON or XML in order to get the data.  Both data providers and receivers would rather deal with only the minimum necessary information even when it is perfectly OK to have more (e.g., either because of the protected status of a receiver such as public health, or because the data is for treatment rather than payment and operations).

In looking at how this interface would work, there's basically a 4 step process:

  1. Some trigger event occurs which indicates that there's some data to consume.
  2. One or more queries are executed to obtain the necessary data in a bundle.
  3. The resulting data set is transformed into a format that can be more readily consumed.
  4. The data is sent to the receiver.
FHIR Subscriptions can be used to handle MOST of the different trigger event cases.  Search or Batch transactions can specify the data collection aspect.  Sending the data is straightforward.  What isn't so straightforward is transforming the data into a simpler format for the consumer, but there is also a way in FHIR to handle transformation from one structure to another (see StructureMap), and FHIR also has a mapping language defining the transformation.  The Clinical Quality Language provides another mechanism by which data can be accessed and transformed. XSLT would also work for this if one transformed the bundle as XML content.
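To make the four steps concrete, here is a minimal sketch of the flow in Python.  Everything in it is a stand-in: run_interface, fake_query and order_ids are hypothetical names, and the "server" and "receiver" are plain functions rather than real FHIR endpoints.

```python
from typing import Callable


def run_interface(
    event: dict,
    query: Callable[[dict], dict],
    transform: Callable[[dict], object],
    send: Callable[[object], None],
) -> None:
    """Run steps 2 to 4 once a trigger event (step 1) has been detected."""
    bundle = query(event)        # step 2: gather the data as a Bundle
    payload = transform(bundle)  # step 3: reduce it to what the receiver needs
    send(payload)                # step 4: deliver it


# Toy stand-ins for a real FHIR server, transform and receiver.
def fake_query(event: dict) -> dict:
    return {
        "resourceType": "Bundle",
        "entry": [
            {"resource": {"resourceType": "MedicationOrder", "id": event["id"]}}
        ],
    }


def order_ids(bundle: dict) -> list:
    return [entry["resource"]["id"] for entry in bundle["entry"]]


received = []
run_interface({"id": "order-1"}, fake_query, order_ids, received.append)
print(received)
```

Keeping the query, transform and send steps as separate callables mirrors the point made above: the first two and the last are well covered by existing FHIR mechanisms, and the transform is the piece that needs the most definition.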

Right now, the pieces are nearly all there, almost everything needed already exists in Subscription and StructureMap to put it together.  They just aren't put together in a way that allows the interface to be completely defined in an easy fashion.

Most of the detail needed is in how we define the third step above.  Here's how I'd put this together:
  1. I'd define a Subscription for the interface that would include the search criteria (in Subscription.criteria) that would essentially identify the trigger event.
  2. The same query used to identify the trigger event would ALSO include any other information essential for producing the data needed in the interface (e.g., _include and chained queries).  That means that any included resources would be defined in the query used for Subscription.criteria.
  3. In order to transform the data, we need to specify a mechanism by which the content can be transformed to another format. StructureMap has what I need, but implementing it for the most common use cases feels like overkill, and it's still a work in progress.  I have working code that could do this with CQL today, so that's likely where I would go.  Also, StructureMap is R3/R4 content, and I have a server that's currently deploying DSTU2.  Everything else I need is already in Subscription in DSTU2.
  4. For my purposes, Subscription.channel.type would be set to rest-hook and Subscription.channel.endpoint would be set to the endpoint that should receive the report.  The format of the transformed result would be reported in Subscription.channel.payload.
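Put together, the Subscription described in these steps might look roughly like the following sketch.  The search parameters, the receiver URL and the payload MIME type are illustrative assumptions, and whether a given server honors _include inside Subscription.criteria is implementation dependent.

```python
import json
from urllib.parse import urlencode


def build_subscription(endpoint: str) -> dict:
    """Build a DSTU2-style Subscription as a plain dict (illustrative only)."""
    # Hypothetical criteria: the trigger query plus the includes the report needs.
    params = [
        ("status", "active"),
        ("_include", "MedicationOrder:patient"),
        ("_include", "MedicationOrder:prescriber"),
        ("_include", "MedicationOrder:medication"),
    ]
    criteria = "MedicationOrder?" + urlencode(params, safe=":")
    return {
        "resourceType": "Subscription",
        "status": "requested",
        "reason": "340B reporting",
        "criteria": criteria,
        "channel": {
            "type": "rest-hook",
            "endpoint": endpoint,           # where the report should be sent
            "payload": "application/json",  # assumed format of the transformed result
        },
    }


subscription = build_subscription("https://receiver.example.org/340b")
print(json.dumps(subscription, indent=2))
```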


Subscription doesn't have a way to specify the transform details, so the next step would be to define an extension to support that.  Here's what I'd add to Subscription to support this:

First of all, I'd add an element called transformation to channel which would be an optional backbone element (cardinality [0..1] because there should be at most one for a channel) describing the transformation to apply to the queried content to specify how the channel data (a Bundle) would be transformed before sending to the endpoint.  It would have two fields, method and transform.  Method describes the method by which the transformation occurs, and transform provides a reference to the content defining the transformation. Method would be a code, containing values from an extensible vocabulary including cql, fhirmapper, operation, structuremap and xslt.  These are further described below:

Code | Name | Description
cql | CQL | Transformation is defined in CQL as the tuple returned in the "return" definition, where the payload provides the source material accessed by the CQL rule.
fhirmapper | FHIR Mapper | Transformation is defined in the FHIR Mapping language.  The source is the payload that contains the content associated with the query used in subscription.
operation | Operation | Transformation is defined as the result of a FHIR operation, where the source parameter for the transformation is the payload.
structuremap | StructureMap | Transformation is defined in a StructureMap resource.  Source is the bundle that would result from the payload provided by the subscription.
xslt | XSLT | Transformation is defined in an XSLT transform.  The input is the payload provided by the subscription.  The output of the transform is what would be sent to the receiver.  While input would be in XML, output could be in any format (you can, for example, create JSON and CSV files with XSLT).
The transform field would be a resource reference (StructureDefinition, OperationDefinition) or URL reference to the content of the transform.
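As a sketch of how this proposed element could be carried as a FHIR extension on Subscription.channel, consider the following.  The extension URL and the transform URL are made up for illustration, and the choice of valueCode and valueUri for the two fields is my assumption rather than anything defined by FHIR.

```python
from dataclasses import dataclass

# The proposed starting vocabulary; it is meant to be extensible.
METHODS = {"cql", "fhirmapper", "operation", "structuremap", "xslt"}


@dataclass
class Transformation:
    method: str     # how the transformation is performed
    transform: str  # reference or URL for the content defining it

    def to_extension(self, url: str) -> dict:
        """Render as a nested extension with 'method' and 'transform' parts."""
        if self.method not in METHODS:
            raise ValueError(f"unknown transformation method: {self.method}")
        return {
            "url": url,
            "extension": [
                {"url": "method", "valueCode": self.method},
                {"url": "transform", "valueUri": self.transform},
            ],
        }


# Hypothetical URLs, for illustration only.
EXT_URL = "http://example.org/fhir/StructureDefinition/subscription-transformation"
ext = Transformation("xslt", "http://example.org/transforms/340b.xsl").to_extension(EXT_URL)
print(ext)
```

Since there should be at most one transformation per channel, a server would attach a single extension like this to Subscription.channel.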

XSLT, fhirmapper, structuremap would implement the transformation in an application defined manner that really needs no more specification.  The input and output are fairly well understood for these types, and how the transformer is called is really not that interesting.

For operation and cql, a little more detail would be needed to define how the payload is communicated, and the result of the transformation is extracted from the transformation.  A transformation operation should ideally be idempotent, meaning it causes no changes (with the exception perhaps of audit records) on the server.  


When an operation has only one output parameter named "return", that's what gets returned.  This works well in a transformation scenario.  The only further definition that is needed is how to specify the input that the transformation operates with.  What I'd probably want to implement for operation (because it is simplest) is to define the operation and any of its parameters in a URL, with a POST body of type multipart/form-data where the source parameter defines the input of the transform in the body of the post.  In that way, fixed parameters of the operation transforming the content can be specified in the transform URL, and the body can be the part that is sent to the receiver.  I'd probably add a parameter to the operation called endpoint and another called header to which I'd pass the values of the subscription endpoint and header parameters.  In this way, I could fully define in a Subscription how the server would implement the transform, and offload the work of sending the transformation to the endpoint.

NOTE: Operation can return content OTHER than a FHIR Resource by using a return type of Binary.  In HAPI, when you do this, what happens is that the returned content is simply the content of the binary resource.  I've successfully used this to return CDA and JSON content from operations I've defined.  This eliminates a step for integrators of extracting the content from a FHIR Parameters resource, which is something many would rather NOT have to do.


For CQL, things get a little bit more complicated.  Executing a set of CQL rules produces results as name/value pairs.  The tricky bit is figuring out how to integrate the payload content (the input) into the CQL execution, but that can be implementation dependent.  The simpler part is to figure out how to get the output.  There are two ways I can see to deal with this:
  1. Return all defined values as a JSON Object. This works, but some of the defined values created in a CQL rule might be used internally, whereas others are meant for external consumption.  I have a CQL Operation created that executes a CQL rule, and allows you to specify which parts of the rule to return in a JSON object.  This is OK, but you can get just as much control by saying the result to be returned is in an object with a predefined name.  
  2. Return the defined object using the name "return".  This is my preferred solution, as it allows the rule to have a single result (which can be a tuple), mirrors the OperationDefinition pattern, and allows the transform to be computed and extracted.
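The two options can be compared with a small sketch.  It assumes the CQL engine hands back its results as a dict of definition names to values; select_output and the sample definition names are hypothetical.

```python
def select_output(results: dict, names=None):
    """Pick the transformation output from CQL name/value results."""
    if names is not None:
        # Option 1: the caller lists which definitions to expose.
        return {name: results[name] for name in names}
    # Option 2: a single definition named "return" holds the result.
    return results["return"]


# Made-up results: one internal helper value plus the intended output tuple.
results = {
    "ActiveOrders": ["order-1", "order-2"],
    "return": {"patient": "pat-1", "prescriber": "prac-9", "medication": "med-3"},
}

print(select_output(results, names=["return"]))
print(select_output(results))
```

With the second form, internal definitions such as ActiveOrders never leak into the output, and the caller does not need to know any names in advance.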
What I'd probably do in my own implementation is provide support for OperationDefinition to start with, and build an operation definition framework that would let me create operations to handle cql, fhirmapper, structuremap and xslt methods.  Then, in my subscription handler, I could translate those other forms into an operation call, and let it do the transformation and notification work.

This suggests an additional capability that would be helpful in the Operations framework to FHIR, which is the ability to define where to send the response to the request, and what header parameters to supply for that interaction.  Typical RESTful protocol is in request/response pairs, but there are times when what you really want to do is send the response somewhere else as a notification.

I'd have to think carefully about the security ramifications for this, but my current thinking is that any FHIR HTTP request could have an _endpoint and _header parameter that works in a way similar to the way these parameters are used for a rest-hook subscription.  When these parameters are given, they specify the restful endpoint that would receive the resulting content, and the headers to send with it.  These parameters could then enable queries, operations or other content generating requests to send notifications to a specified URL.  When these parameters are used, the returned result of a query, operation or other FHIR API call could return the following values to indicate success:

Response Code | Meaning
202 Accepted | The request was accepted, is syntactically correct, and the response will be sent to the requested endpoint but may not have been sent yet.  This essentially is a response that says, yes, I got your request and it looks good, I'll send the results along to the designated endpoint, but you have no guarantee that it actually got there.
204 No Content | The request was accepted, is syntactically correct, and the response was sent to the requested endpoint and was accepted by it.  This tells you the response was sent and successfully acknowledged by the requested endpoint.
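A caller could interpret these two codes along the following lines.  This is a sketch of the proposal only; the function name and the labels it returns are invented for the example.

```python
from http import HTTPStatus


def describe_forwarding(status: int) -> str:
    """Classify the reply to a request that used _endpoint and _header."""
    if status == HTTPStatus.ACCEPTED:    # 202: will be sent, delivery not confirmed
        return "queued"
    if status == HTTPStatus.NO_CONTENT:  # 204: sent and accepted by the endpoint
        return "delivered"
    return "not-forwarded"               # anything else is outside the proposal


print(describe_forwarding(202))
print(describe_forwarding(204))
```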


  • Create global parameters for _header and _endpoint and add code to handle them (this is a good place to consider HTTP or HAPI Interceptors).  
  • Add extensions to Subscription.
  • Figure out how to change operation to use PUT instead of POST (for rest-hook implementation).
  • Build a transformer operation.

But Wait!

If I define _header and _endpoint parameters for a FHIR operation, then I don't even need an extension on subscription.  Instead, what my endpoint would look like would be:

[base]/$transform?_header=Header for Eventual Receiver&_endpoint=Endpoint for Eventual Receiver&transform=Transform content to use&type=Transformation Type&_method=PUT
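Building that URL safely means percent-encoding each value, which the template above glosses over.  A minimal sketch, with made-up values for the receiver, header and transform (the parameter names come from the template; everything else is an assumption):

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def transform_url(base: str, header: str, endpoint: str,
                  transform: str, method_type: str) -> str:
    """Assemble the $transform endpoint with URL-encoded parameters."""
    query = urlencode({
        "_header": header,
        "_endpoint": endpoint,
        "transform": transform,
        "type": method_type,
        "_method": "PUT",
    })
    return f"{base}/$transform?{query}"


# Hypothetical values for illustration.
url = transform_url(
    "https://fhir.example.org/base",
    "Authorization: Bearer example-token",
    "https://receiver.example.org/340b",
    "https://example.org/transforms/340b.xsl",
    "xslt",
)
print(url)

# Round-trip check: the receiver endpoint survives encoding and decoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["_endpoint"])
```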

Now, I can define my operation easily, skip the extension stuff, and the only real difference is that I'd need to make the operation support PUT as well as POST, OR, make my subscription implementation smart enough to figure that part out when the server is calling one of its own operations with content.  I can add other parameters to support URLs or content to specify the details for the transformation.

OK, thanks for letting me bounce this off of you.  This was a slam-dunk.

Monday, June 4, 2018

Coloring within the Lines

When you first learned to color as a child, someone handed you a picture (like the following) and a bunch of crayons:
At first, you probably just threw blotches of color on the page, ignoring the lines. Then someone explained to you about keeping color within the lines. You practiced and practiced, and then realized that you could do really cool horse pictures within the lines. Then you turned your horse into a Zebra.
Not content with that, you made it into a Unicorn ... and later a Pegasus.

This is a metaphor for the interoperability journey we started upon with Meaningful Use in 2010 and where we are headed in 2018.

First we had data that we needed to exchange, and a name for it (we called it the common clinical data set, but a horse by any other name is still a horse).  And we watched as everyone struggled to stay within the lines (the CCD specification).  Some even created a color-by-number guide that made it easier to make a pretty CCD.

When creating this horse picture, we stayed inside the lines with HL7 CDA, but to meet certain ONC requirements to make our horse look like a Zebra, we had to change the way we colored a little bit, but still came up with something that fit the lines (it just had a few more lines), and then we had our Zebra.

At times though, there were things that just didn't fit, and so we had to come up with a horn, or wings.  We did our best to create new lines that fit with (but not within) the existing lines, and went beyond the original drawing but still had the same general appearance, producing the C-CDA.

After a while, there's not much more to do with our horse (CDA), so we start with a new drawing (FHIR).  This horse will still get you from point A to point B, but much faster (no wings needed).

And that's progress in interoperability.

Wednesday, May 30, 2018

Are you lost on how GDPR is related to Health IT standards?

Yesterday, after I posted for the first time in a while, Google alerted me that as a publisher of content on THEIR technology platform, using THEIR tools, I have a responsibility to notify you (if you happen to live in the European Economic Area) of how I use your personal data, and if necessary obtain your consent, as a result of the General Data Protection Regulation (GDPR).

Conveniently, I don't use your personal data.  Instead, I let Google do it all for me.  If there are cookies generated by this site, they are generated by Google software.  If there is tracking being done by geography, cookie, topic or other data you supply to access this site, that too is done by Google software.

If you happen to be reading this blog through a syndicated site, well, all bets are off because I don't control how those who syndicate this content use your personal data, but they are subject to the same requirements as I am.

I don't use cookies for Ads (because I don't do AdWords or anything like that here).  I do use Google Analytics with my blogger account.

But apparently, I still have some responsibilities to you, or at least, that is what Google is telling me.

Here's Google's policies and an explanation of how it uses your data, and how you can opt out of certain uses.

Now, here's what is interesting about what Google did.  They transferred their compliance risk to me even though I use their platform and their tools.  This is a classic technique to mitigate risk that we look at all the time in developing IHE profiles and other implementation guides.  It's one of several techniques for risk mitigation.  More often, we apply other technical mitigations (e.g., the use of ATNA or TLS, the requirement for authentication).

If you are reading this site in a European Economic Area, Google tells me that this site will display a notice to you.  I've not seen it yet because even though Google tells me I should be able to by using the or or similar name, my browser conveniently redirects me to the .com site.  So, if you got the notice, please let me know.  Otherwise, I may need to do something more.


P.S. Unlike HIPAA, GDPR doesn't really have an easily acceptable pronunciation, although it does bring to mind several different acronym decodes.  In that, it's somewhat like BPPC.

Tuesday, May 29, 2018

On CMS's Promoting Interoperability Program (the program formerly known as the EHR Incentive Program)

A while back I read through CMS's recent rule changing the EHR Incentive program to the Promoting Interoperability program.  I promised a blog update but for some reason (work), didn't get around to writing it.  I decided to take action and finally sit down and finish it.

The published rule is something like 1800 pages, roughly 3.6 reams of paper when printed double-spaced (as preprints are).  You can find it at the link above.  It goes by the precise but lengthy title of:

Medicare Program; Hospital Inpatient Prospective Payment Systems for Acute Care Hospitals and the Long-Term Care Hospital Prospective Payment System and Proposed Policy Changes and Fiscal Year 2019 Rates; Proposed Quality Reporting Requirements for Specific Providers; Proposed Medicare and Medicaid Electronic Health Record (EHR) Incentive Programs (Promoting Interoperability Programs) Requirements for Eligible Hospitals, Critical Access Hospitals, and Eligible Professionals; Medicare Cost Reporting Requirements; and Physician Certification and Recertification of Claims

I will only cover two parts: Quality Reporting Requirements, and the Promoting Interoperability Programs.  The first part is interesting because everyone who cares about the second part also has to deal with the quality reporting part.  You can find my raw read-through comments on Twitter.

Quality Reporting Requirements

If you haven't been paying attention to CQL, you really need to be.  The first publication of reporting requirements technical specifications will be this spring (coming REAL soon) for use in 2019 (coming sooner than you think).  According to CMS (and me) "We believe that compared to CQL, QDM logic is more complex and difficult to compute".  CMS will be using a sub-regulatory process to make technical corrections to the measure specifications.  This is good because it means that the quality measure specifications will be able to have more quality without going through a heavyweight process to fix mistakes.  But it also means you have to pay attention.

Hospital measure data will join the myriad of other public data out there, as it will .  Soon we'll be able to compare not just meaningful users and their EHR systems, but also the quality results they'll be able to get from the use of them.  That should be an interesting mashup.  Hey ONC! Are you listening?

A lot of measures are going to be eliminated, either because they are topped out (hospitals are doing so well it's not worth measuring any more), duplicative, or cost too much to produce for value received or ...  There are lists of the measures that CMS is proposing to remove in the rule.  While everyone is happy about measures being removed, just remember that also means that there are fewer choices to succeed with...

Promoting Interoperability Programs

Why did they change the name?  Well, they remind us in the rule that the EHR Incentive part of the program is about over for anyone participating (we're now in the penalty stage).  So that part makes sense.  One significant change is that the singular program became the plural programs.

They are planning to require 2015 Certified EHRs for these programs because, in part, ONC has confirmed that at least 66 percent of eligible clinicians and 90 percent of eligible hospitals and CAHs have 2015 Edition available (see the link above at meaningful users and their EHR systems).  Also, the evaluation period is a minimum of any continuous 90-day period within each of the calendar years 2019 and 2020, as you all probably hoped and expected.  The rationale for this change was that health care providers may need extra time to fully implement and test workflows with the 2015 Edition of CEHRT.


Security Risk analysis is simply required; you don't get any extra points for doing what you are already required to do by law and regulation.

For ONC and CMS, it's pretty routine to ensure that whatever the latest and greatest healthcare crisis is, there needs to be something in the regulations about it.  So new measures have been added to address opioid abuse, including Queries of prescription drug monitoring programs (PDMPs), and verification of opioid treatment.

Closing the referral loop also gets some love with a new quality measure supporting that buzz phrase.

A number of exclusions (loopholes) are being removed because, according to CMS, few are using them or they aren't warranted.

Puerto Rico hospitals become eligible for the program, something that wasn't available to them previously.

Some capabilities will no longer be required to be used (though they will still exist in the certification requirement): Secure Messaging, View/Download/Transmit.

So, there you have it, my summary of about 10% of the rule.

For what it's worth

The section on Future Directions is worth reading for those of you who are worried about what is next, but that is merely non-binding self-promotion for the most part.  The really important part related to that is where CMS asks you to tell them where they should go.

Thursday, April 19, 2018

Additional SMART on FHIR Authorizations

I sent something similar to this to the HL7 Security workgroup today:

This strikes me as an interesting use case for SMART or similar OAuth2 based authentication.

The use case is thus:

Through an API, a user wants to include a certain kind of information in the record.  A given example might be an uncoded medication.  The provider organization wants to ensure that any uncoded medications have a SEPARATE provider authorization before entry, specific to the API call that attempted the request.  For example, in a system that I am familiar with, if using the EHR client, when an uncoded medication is attempted to be entered, the provider is requested to verify their intent (via a button click).  In another case, to “sign” a document, it requires them to re-enter their password information.

An authorization protocol that might work with a server and 3rd party applications wishing to adhere to these rules would be one in which:

An API call which failed due to these sorts of rules would generate a user override authorization request token and pass it back to the caller in an extension to the OperationOutcome resource.  Such a token may also need to be communicated in an out-of-band manner to an authorization endpoint (but could also just be a signed JWT).  The code associated with the OperationOutcome might indicate that additional user authorization is required.

The application would call the authorization server with this token to request user authorization to complete the API request via a web-based UI that used redirection much in the same way that SMART on FHIR does today.  It might also include its current user's access token, although that seems unnecessary since the token supplied by the application would seem to be enough validation of the user.

The authorization server would interact with the user to request their authorization to complete the request.  On receipt of the user authorization (either by clicking yes, or reauthenticating, or whatever), it would provide an override token to the calling application.  Such token could then be passed in to a repeat of the API call.  The API call would find the same problem, but having been given the override token, would be able to accept the request.

Such a protocol might also include in the given override request token a signal with regard to the degree of user reauthentication/authorization necessary.  For example, many systems that have a physician “sign” a document, require the physician to reauthenticate to the application, whereas other override responses are simply a button click (e.g., uncoded medication).

An application that is aware of these special cases might also want to have a way to short circuit the API failure part of the request loop, and preemptively obtain provider authorization.
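To make the round trip easier to picture, here is a toy in-memory sketch of the three steps: the failed call that hands back a request token, the authorization exchange, and the repeated call with the override token.  Everything in it is hypothetical: the class and method names are invented, tokens are plain random strings rather than signed JWTs, and making each override token single-use is my own assumption rather than anything settled above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Toy model of the proposed override flow; not SMART on FHIR, just the shape of it.
final class OverrideFlowSketch {

    // Result of an API call: accepted, or rejected with an override request token.
    static final class Outcome {
        final boolean accepted;
        final String overrideRequestToken; // null when the call was accepted
        Outcome(boolean accepted, String overrideRequestToken) {
            this.accepted = accepted;
            this.overrideRequestToken = overrideRequestToken;
        }
    }

    private final Set<String> pendingRequests = new HashSet<>();
    private final Set<String> grantedOverrides = new HashSet<>();

    // Step 1: the API rejects an uncoded medication unless a valid override
    // token accompanies it, and issues a request token on rejection.
    Outcome submitMedication(boolean coded, String overrideToken) {
        if (coded || (overrideToken != null && grantedOverrides.remove(overrideToken))) {
            return new Outcome(true, null);
        }
        String requestToken = UUID.randomUUID().toString();
        pendingRequests.add(requestToken);
        return new Outcome(false, requestToken);
    }

    // Steps 2 and 3: the authorization server trades a known request token
    // for an override token, but only if the user approved.
    String authorize(String requestToken, boolean userApproved) {
        if (!userApproved || !pendingRequests.remove(requestToken)) {
            return null;
        }
        String overrideToken = UUID.randomUUID().toString();
        grantedOverrides.add(overrideToken);
        return overrideToken;
    }
}
```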

Your thoughts?  Is this a reasonable thing to consider?


Monday, April 16, 2018

If I'm baffled by this, how can patients possibly manage?

I have passwords I use about once a quarter for five different health accounts so that I can track what is going on with myself and my family.  Two Patient portals, one mail-order prescription company, my insurer, my HSA and God alone knows what else.  I sit through five different sets of menu options whenever I have to call these people on the phone.

Sometimes I can access my children's data, sometimes I cannot.  Usually the latter unless my children have remembered to allow me access (which they willingly do), because one is an adult and the other is over 13.  I could probably have any person in my house claim to be them when I call one of these organizations AND FINALLY GET to a human, but I always make sure to have them available.

In trying to follow up on a refill where I know the drug, the dose, the quantity, the ordering provider, and even the provider's order number, I cannot get access to whether an order has been placed.  My children's healthcare providers are almost as baffled as I am by the systems they must use.

If trained and aware people are challenged by this system, how could the average patient cope?  How would a disabled patient have any ****ing chance of working through this?  This ISN'T about technology.  Technology could solve these problems.  It's about interpretations of policy that make it difficult for anyone to do anything even remotely difficult.  It's about appropriate use of technology that makes my life better, not yours.  It's NOT about paying attention to your stuff first instead of mine.

Does anyone EVER LIVE test tree menus on automated call systems?  I've had to enter account number and zip code twice, enter my phone number (even though I'm not blocking them from receiving that information), and then when I finally get to a human, repeat the same information, and then be routed to the right person to help me and repeat it AGAIN. Then I have to go through a standard set of three questions they ask on every call that have NOTHING to do with the reason I'm calling.  Yes, this is my address, no it has not changed, and yes you have my current information.  Where is the patient service here?  It's not in my immediate service to make sure you have up to date information, that is NOT my #1 reason for calling you today.  Can we solve my problem first before we make sure that everything else is ok?

My God, today, when I'm home sick and still have to deal with their mess, my patience for how patients are treated is 0, and I just want to )_*(&*(&%*!~ scream.

Payers and providers: here are some things that you can do to fix your systems:

  1. Make your call 911 message as short and direct as possible.  I understand and appreciate why it is there, but I don't call you when I should call 911, and having to wait for you to draw it out drives me crazy.
  2. Let me hit the button I know I need before you finish your first message.  It should be: If you are a current patient/member please press 1.  Get me in the door quickly to solving MY problem.
  3. Solve my problem first, then yours (e.g., has my contact info changed), and NOT the other way around.
  4. If I ask for an operator or hit 0, take me to an operator, not another menu.
  5. Yes, I know you have an online system, IT DIDN'T HELP ME, which is why I'm calling you.  Don't frustrate me further with that information until I'm on the path to the queue I need, or at least let me get there quickly.  Making me wait through that message doesn't solve my problem.
  6. Make it easy for me and my family to allow me (or some other individual) to act on their behalf.
  7. Thank you for trying to protect my privacy.  Stop gloating about the great job you are doing and help me solve my problem.  Another message I want to hear AFTER I'm in the right queue and have to wait for the next available operator.


I swear I want one of those answering machines I can program myself, just to make you feel my pain when you call me.

Thank you for calling the Boone residence.  If you are a family friend you have called the wrong unpublished number.  Please hang up and call the other number we gave you, or text the person you wish to reach.  If you are a member of the police or fire or other public safety department, please press 1.  If you are a vendor with whom we have an account please press 2. If you are a vendor or charitable society that we have not contacted, please hang up and do not call again.  This phone is on a do not call list, and repeated attempts to call this number will result in criminal prosecution. 3. If you are a healthcare provider, a health insurer, a pharmacy or PBM program with whom a member of this household has an affiliation please press 8 now.  *8*  If you need immediate ...

Thank you for contacting the Boone family.  If you are a healthcare provider or pharmacy please press 1.  If you are a payer or PBM please ... *1*

Please enter the extension of the person you wish to reach.  If you would like a dial by name directory, please press 1.  Or press 0 for immediate... *0*

Then we'll ask for your name, phone number and 10 digit NPI number, and if you don't have it immediately, we'll wait a few minutes and then ask you to hang up and call us back when you have it.

How will that work for you?

And to follow up, today, I had to call my homeowner's insurer.  They knew who I was BEFORE I told them.  Their claims specialist already had my policy available because they KNEW who was calling them.  The specialist then contacted a water damage specialist for me, who called me within 15 minutes for a same-day appointment.  Thank you Liberty Mutual for getting it COMPLETELY right.

Sunday, April 8, 2018

In which I finally figure out dates and timestamps in FHIR

The key is in the last sentence of my last post (half the reason I do these things is to think out loud).  Two DIFFERENT things.  A timestamp is a date and a time.  A date is just a date.  When you compare a date to a timestamp, you are comparing ONLY the date aspect.  When comparing times, you are getting into the timestamp aspect.

User expectations met, problem resolved.  I may have to think about saving the date only representation of a timestamp for efficiency reasons.


Right, Wrong and Right again about TimeZones in FHIR

So I am back to dates again.  I forgot to check in the change for setting the server's default time zone, and almost all my unit tests failed on a build machine located outside of my home zone.  Dang it.  I KNEW that would happen.

So I made the fix (following the advice of reasonably smart people), and set the server's default time zone to UTC.  And my date tests are still failing.

Here's the failure:  There's a record that updated on 2017-06-26T17:51:45.233-07:00 (the source of that record is on the left coast, it doesn't matter where the server is).

And I query for it by asking for records updated on that date using _lastUpdated=2017-06-26.

What should happen?

Well, everyone I asked who answered without much thought said that these should match. But they don't.

Because 2017-06-26T17:51:45.233-07:00 is 2017-06-27T00:51:45.233Z in UTC.  And clearly 2017-06-26 is not the same date once you transform the timestamp to UTC, and therefore _lastUpdated=2017-06-26 fails.
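The shift is easy to reproduce with java.time.  This is only a small illustration of the arithmetic, not code from my server, and the class and method names are invented for the example:

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Shows how the date portion of a timestamp changes when normalized to UTC.
final class UtcDateShift {

    // Date portion of the timestamp as it was written, in its own offset.
    static LocalDate dateAsWritten(String timestamp) {
        return OffsetDateTime.parse(timestamp).toLocalDate();
    }

    // Date portion of the same instant after converting it to UTC.
    static LocalDate dateInUtc(String timestamp) {
        return OffsetDateTime.parse(timestamp)
            .withOffsetSameInstant(ZoneOffset.UTC)
            .toLocalDate();
    }
}
```

For 2017-06-26T17:51:45.233-07:00, dateAsWritten gives 2017-06-26 while dateInUtc gives 2017-06-27, which is exactly why the query misses.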

The principle of least surprise should apply, and if we use a fixed time zone for the server, it surely doesn't.  Especially a case where the server time zone isn't one my users care about.

So, the conclusion I've come to is, when comparing two dates:

IF both specify precision, the comparison proceeds based on the time as reflected by each date, as it was specified.  There's no problem here.  If neither specifies a time zone, any time zone will do, as long as it is the same for both.

Here's the tricky bit: if A does not specify a precision but B does, the comparison should be based on a common time frame.  Whether A specified a time zone and B didn't, or A didn't and B did, in the presence of a time zone in A or B but NOT both, they should be compared using the same zone.  Otherwise, someone is going to get an unexpected result.

It's that odd duck case where the user asked for a particular date, the server has a specific date and time zone specified for the date (which may be different from where the server itself is located), and boom.

Otherwise, imagine what could happen.  Any test without a time zone is by necessity, imprecise.  I think it's better to specify what should happen in these cases quite specifically, so that time based comparisons don't surprise anyone.

Date has a legal function, time stamps a technical one.  Comparing a DateTime might be used for either case.  So, in my view, even though it seems weird, one has to consider that a DateTime comparison WILL behave differently when using 2017-06-27T00:00+00:00 as compared to 2017-06-27.  The questions are different.  The first is about time, the second about a date, and the two are DIFFERENT but related things.
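Here is roughly how that rule could look in code.  It is a sketch under my own assumptions: the class name is invented, a ten-character value is taken to be a date-only query, and exact instant equality for full timestamps is a simplification (FHIR search actually treats values as ranges based on their precision).

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;

// Hypothetical matcher for a _lastUpdated style comparison.
final class LastUpdatedMatcher {

    // A date-only query asks about the date, so compare it to the date the
    // record carries in its own offset. A full timestamp asks about time,
    // so compare the two as instants.
    static boolean matches(String query, OffsetDateTime lastUpdated) {
        if (query.length() == 10) { // yyyy-MM-dd
            return LocalDate.parse(query).equals(lastUpdated.toLocalDate());
        }
        return OffsetDateTime.parse(query).toInstant().equals(lastUpdated.toInstant());
    }
}
```

With this rule the record updated at 2017-06-26T17:51:45.233-07:00 matches a query for 2017-06-26 no matter what time zone the server runs in.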

Ah well.  Another day goes by.


P.S. A similar problem will show up when comparing durations. Did it happen in the last two days or in the last 48 hours are two DIFFERENT questions, just as when I speak to my colleagues in India, yesterday, today and tomorrow mean different things to each of us, depending on where we are in the world.  Let's not even talk about business days ...

Friday, April 6, 2018

The FHIRWalker Interface

The DOM has the TreeWalker interface.  I would suggest that FHIR would benefit from a FhirWalker interface which would allow a resource, datatype or component to be walked in a traversal and processed in certain ways.  Definition of such an interface would benefit many.  Already such interfaces abound in implementation code, e.g., in HAPI there are multiple parsers and readers that do the very thing that my FhirWalker interface would support.

Two such walkers would be valuable, one which walked a structure definition for a prospective resource, and the other which walked an actual resource.

Such walkers could be used for reporting, query implementation, et cetera.

I've personally written three, four, (I've now lost count) of these for various purposes.  I got fed up with writing one more and wrote a Walker class for myself.  It looks something like this (though I must admit to some artistic license in the names of things).

import java.util.function.BiConsumer;
import java.util.function.BiFunction;

// Any is a placeholder for whatever base type your FHIR library
// uses for resources, datatypes and primitives.
interface FhirWalker {

   // Step describes the interface of the methods that can be
   // performed with the Victim of the FhirWalk.  It's not really
   // used in any way by walk or walkBackwards, but might find
   // some use in implementations.
   interface Step extends BiFunction<String, Any, Boolean> {}

   // Walk defines the interface for a traversal.
   interface Walk extends BiConsumer<Any, Victim> {}

   interface Victim {
       // tremble is called before walking on any coal.
       // if tremble returns true, the processor chickened out.
       boolean tremble(String path, Any coal) throws FhirWalkerTrippedException;

       // stumble is called on anything that caused a problem.
       // think of it as an exception handler.  If it returns
       // true, the victim recovered.  Otherwise, the victim
       // fell down and help needs to be called (an exception thrown)
       // immediately.
       boolean stumble(String path, Any coal) throws FhirWalkerTrippedException;

       // burned is called after walking on a coal.
       boolean burned(String path, Any coal) throws FhirWalkerTrippedException;
   }

   class FhirWalkerTrippedException extends Exception {
       public FhirWalkerTrippedException(
           String exclamation, Any coalStumbledUpon, Throwable excuse
       ) {
           super(exclamation, excuse);
       }
   }

   // Walk the FHIR tree in order
   void walk(Any startingCoal, Victim victim)
      throws FhirWalkerTrippedException;

   // Walk the FHIR tree in reverse order
   void walkBackwards(Any startingCoal, Victim victim)
      throws FhirWalkerTrippedException;

   void walkCanonically(Any startingCoal, Victim victim)
      throws FhirWalkerTrippedException;
}

The walk() method implements a NLRN traversal, calling tremble before processing a node, then processing the children left to right, then calling burned after processing the children.  The walkBackwards() method implements an NRLN traversal, calling tremble first, and burned last. 

Tremble and burned have the same signature even though the return on burned is ignored (once burned, twice shy?).  The reason for that is so that a function used for burned in one place can be used for tremble in a different traversal.

For what it is worth, something implementing the FhirWalker interface need not traverse an entire tree of a FHIR Resource.  It could implement a traversal of selected subnodes.  For example, a DataTypeWalker might only call tremble and burned on FHIR Data types found, a PrimitiveWalker only on primitives.  A whole host of walkers could be created with great benefit.
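To show the order of the calls, here is a stripped-down sketch of the walk() traversal over a toy tree.  It is not my actual Walker class: Node stands in for a FHIR element, the Visitor interface is a cut-down version of Victim without stumble or exceptions, and treating a true return from tremble as "skip this subtree and don't call burned" is my assumption for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy traversal: tremble before a node, children left to right, burned after.
final class TreeWalkSketch {

    // Stand-in for a FHIR element with named children.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    interface Visitor {
        boolean tremble(String path, Node node); // true means skip this subtree
        void burned(String path, Node node);
    }

    static void walk(String parentPath, Node node, Visitor visitor) {
        String path = parentPath.isEmpty() ? node.name : parentPath + "." + node.name;
        if (visitor.tremble(path, node)) {
            return; // the visitor chickened out
        }
        for (Node child : node.children) {
            walk(path, child, visitor);
        }
        visitor.burned(path, node);
    }

    // Records the sequence of tremble and burned calls for a whole tree.
    static List<String> trace(Node root) {
        List<String> events = new ArrayList<>();
        walk("", root, new Visitor() {
            public boolean tremble(String path, Node node) {
                events.add("tremble " + path);
                return false;
            }
            public void burned(String path, Node node) {
                events.add("burned " + path);
            }
        });
        return events;
    }
}
```

Walking a Patient node with a name child (which has a family child) and a birthDate child produces tremble Patient, tremble Patient.name, tremble Patient.name.family, burned Patient.name.family, burned Patient.name, tremble Patient.birthDate, burned Patient.birthDate, burned Patient.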

Somewhere in the back of my head is a BlindfoldedWalker but I don't know where that's going.

Anyway, enough amusement for the day.  I have my own walker now, time to go for a stroll.


Tuesday, April 3, 2018

FHIR date Equality in a Global Environment

Under what circumstances can someone ask a question that can mean one thing in one part of the country and yet something entirely different in another?  When what they are talking about is time.  Just ask your colleague from the left or right coast what time it is, and you'll see they come up with a different answer than you do.

Here's the challenge:  A FHIR Server hosts a number of resources, and each of them has an associated timestamp in Resource.meta.lastUpdated.  So what should you get when you ask for resources that have been updated recently?  Well, frankly it depends upon how you ask.

If you give a date only with no time zone, then the answer you SHOULD get back is based on the date you supply, as interpreted in the context of the resource that has the timestamp.

Consider: Resource A has timestamp given in terms of Eastern Standard Time.  Resource B has the same timestamp, but using Pacific Standard Time.  Resource C has the same timestamp, but using India Standard Time.  If C was updated today, is it also true that A and B were?  Not for certain, at least according to my thinking.

The answer is, it depends.  First of all, it cannot matter to the server where you are if you don't tell it, so if you give a query based on a date (and thus without a timestamp), it will have to use what it knows, which is whatever timezone it uses for local reference.  Some servers may set local reference time to UTC, while most would use the local time zone, and yet others may have standardized on the local time at wherever headquarters happens to be.  Even so, the variations don't matter; what matters is that if you say nothing, the server has to interpret the value. If you do happen to tell the server where you are (e.g., by giving a timestamp with a timezone -- and thus hours and minutes at the very least), then it should interpret time according to what you told it.  So far, so good.

Next up.  But what about what the resource says?  Why does this matter, you might ask?  Well, the server and the resource may not agree on what should be in the timezone of a timestamp.  In fact, if your server is just a raw FHIR repository of data, with some applications storing data, and others reading it, then it is the application which decides the timezone associated with the timestamp.  So, if you are with me so far:

Resource A was created at T in EST, B at T - 3 hours measured in PST, and C at T + 10:30 hours measured in IST: a time span of 13 and 1/2 hours.  More than half the time of the day, the crossover from one day to the next will be between resource B and Resource C's location.  And Resource A will be on one side or the other (it can get worse: If you have your resources on Baker Island, New Zealand and the Line Islands, where for a couple of hours, each could be stamped with a different day).  So NOW what?
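A few lines of java.time make the point.  This is just an illustration with invented names, using fixed offsets for the three standard times (EST is UTC-5, PST is UTC-8, IST is UTC+5:30); the instant in the usage note is an arbitrary one I picked for the example.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// One instant, three local calendar dates.
final class SameInstantDifferentDay {
    static final ZoneOffset EST = ZoneOffset.ofHours(-5);
    static final ZoneOffset PST = ZoneOffset.ofHours(-8);
    static final ZoneOffset IST = ZoneOffset.ofHoursMinutes(5, 30);

    // The calendar date that an instant falls on at a given offset.
    static LocalDate localDate(Instant instant, ZoneOffset offset) {
        return instant.atOffset(offset).toLocalDate();
    }
}
```

For the instant 2018-04-03T02:00:00Z, localDate returns 2018-04-02 for both EST and PST but 2018-04-03 for IST, so "updated today" is true for C and false for A and B.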

If comparing by date alone, the resource date could actually be the one to rule.  If comparing my day to the resource's day, where we disagree on what the time zone is, how is a body (or a server) to compare the two?  We could agree to use the server's time as the arbiter, and the problem would be solved.

Or would it? How do you commit baseline test results for your server code's unit tests when you have developers in all three zones running them on a local server?

**** if I know anything other than a completely arbitrary answer.  If you have a better one, I'd love to hear it.


Monday, April 2, 2018

:not in the presence of multiple values in FHIR

When a resource can have multiple values of a particular type (e.g., identifier, code, _tag, _security), one of the questions that has come up for me was the interpretation of :not.

This is made more challenging by virtue of the fact that different implementations handle it differently.

From a use case perspective, which is more interesting:

Given that patients often have multiple identifiers, would you rather the search return a patient that did not have any identifier=999999999, or would you rather it returned a patient if any one of their identifiers was not 999999999?  Frankly, I want the former.

Similarly for code, if Condition has an ICD-10 and SNOMED code, would you want a request for Conditions that are not the SNOMED code for Heart Attack to include all conditions that had any other code, or to ensure that NO value of Code was the SNOMED code for heart attack?  I want the latter.
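The two readings are easy to tell apart in code.  A small sketch with invented names, treating the repeating element as a plain list of strings:

```java
import java.util.List;

// Two possible meanings of :not on an element that can repeat.
final class NotModifierSketch {

    // Reading 1 (the one I want): no value equals the excluded one.
    static boolean noneEquals(List<String> values, String excluded) {
        return values.stream().noneMatch(excluded::equals);
    }

    // Reading 2: at least one value differs from the excluded one.
    static boolean anyDiffers(List<String> values, String excluded) {
        return values.stream().anyMatch(v -> !excluded.equals(v));
    }
}
```

A patient with identifiers 123456789 and 999999999 is excluded under the first reading and returned under the second.  Note also that a resource with no values at all passes the first reading but not the second.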

This is the rationale behind a recent clarification in FHIR STU4, coming hot and heavy off the presses (see GET [base]/Composition?section:not=48765-2).

I'm all in favor of this, although I must admit to having had some struggles interpreting it.  That's what ballots are for though.


Wednesday, March 28, 2018

FHIR workflow is much Better now

Working in an environment where you have many communities collaborating on a project, it is sometimes difficult to ensure that there's overall coordination among all of the moving parts.  A particular example that comes to mind in regard to FHIR was the degree of variation among the various workflow resources:

In DSTU 2 we had these key ones (I include Procedure because it is a record of what had been done, which shares many of the same characteristics as the record of what was asked to be done).
  • DiagnosticOrder
  • ProcedureRequest
  • Procedure
  • ReferralRequest

Some of my findings:
The code associated with workflow could be named type or code or item.code.
Date of the order had different names: event-date (with a specific code identifying which date was the order placement date), date, or orderedOn.
The references to the placer of the request and fulfiller of the request were identified in different ways.

These are the who, what, and when of the workflow.  Things which FHIR acknowledged were somewhat arbitrary distinctions (the difference between a referral and a procedure and a diagnostic test) worked differently depending on which one you needed to use, EVEN though you could never actually be sure which was the correct one.

STU3 addressed most of these issues, and the FHIR STANDARD (because that's going to happen soon) will do even more.  The good news is that the right governance was in place to address this issue as we tested this out, and corrections were applied.

As for me, I'm seriously considering how to adopt some search aliases for current STU2 based APIs to ensure that the APIs will do what people meant them to.


Wednesday, February 28, 2018

Hello CQL

So you want to learn CQL.  So do I, so I thought I'd probably write a book about it ;-)

Somewhere in the book will need to be the CQL Hello World program, which I'll repeat below for the uninitiated:

define Result: 'Hello World!'

CQL doesn't have assignment statements.  You define things and having defined them, you can later refer to them.  But that's it.  Values are never changed by the program.

That's an essential feature of declarative programming.

By being side-effect free, CQL programs can be evaluated by an engine in whatever order makes the most sense to optimize performance.  Another commonly used language that works this way is XSLT, which might explain why I like CQL.

CQL has four primitive data types: Boolean, Integer, Decimal and String, along with the not quite primitive DateTime and Time types.  Boolean uses the traditional true and false values.
It also has complex data types including Quantity, Code, Concept, ValueSet and CodeSystem. 
Beyond that, everything else is either a complex class referencing an information model, or is defined in a Tuple.  And then there is null, which isn't a data type.

Strings are sequences of characters wrapped in single quotes.  Special characters are escaped using \ as in the C and Java language families with all the common escapes and Unicode.

Double quotes are reserved for named identifiers associated with complex things (Code, Concept, ValueSet, and CodeSystem).

Math is math.  Logic is three-valued.  Time is complicated, but less so in CQL than anything else.  CQL moves time from being a great big ball of timey-wimey stuff into linear progression that allows non-time lords to express logic within it.

One of the chapters will have to be about the history of CQL.  In "A theory of everything" written in 2013 I quickly listed some of that history.  Later history includes FHIR, QUICK, QICore, and some other bits and bobbles.  The meeting described in that post reads to me much like the begats in the Bible, and CQL may in fact be the messiah for CDS.  But right now it probably still has to spend its 40 days (or is it weeks, hopefully not months or years) in the wilderness.

Five years.  This is probably the second time in my life where I sat down and looked at a piece of health IT history and went oh shit.  Was it really that long ago?

Anyway, I probably am going to write that book, but don't expect it soon. I still have a lot to learn.


Tuesday, February 27, 2018

Logic in the Presence of Unknowns Just Isn't

... logical?  ... executing? ... or as in my case, even vaguely working to my expectations.

CQL today has to work in the presence of unknown values.  We call these nulls.  Null has this weird property of taking over everything in tri-value oriented languages (those where null is expected), and blowing up everything in bi-value oriented languages (those where null is not so much expected).

How can you tell if your language is oriented towards tri-valued logic or bi-valued logic?  Well, the simple answer is to look at what happens when you evaluate null OR true.  If an exception is thrown, you are definitely dealing with bi-valued logic; if you get true, you are dealing with tri-valued logic; and if you get null, someone screwed up.
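The null OR true test can be made concrete in Java.  Plain Java is bi-valued here: unboxing a null Boolean in an expression such as unknown || true throws a NullPointerException.  The sketch below layers three-valued (Kleene) operators on top, using a null Boolean to stand for unknown; the class and method names are my own, not taken from any CQL engine.

```java
class ThreeValued {
    /** Three-valued OR: true wins, then unknown (null), then false. */
    static Boolean or(Boolean a, Boolean b) {
        if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) return Boolean.TRUE;
        if (a == null || b == null) return null;
        return Boolean.FALSE;
    }

    /** Three-valued AND: false wins, then unknown (null), then true. */
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) return Boolean.FALSE;
        if (a == null || b == null) return null;
        return Boolean.TRUE;
    }

    /** Three-valued NOT: unknown stays unknown. */
    static Boolean not(Boolean a) {
        return a == null ? null : Boolean.valueOf(!a);
    }

    public static void main(String[] args) {
        Boolean unknown = null;
        System.out.println(or(unknown, true));   // prints "true"
        System.out.println(or(unknown, false));  // prints "null"
        System.out.println(and(unknown, false)); // prints "false"
    }
}
```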

So what happens when you try to build a language interpreter for a tri-valued logic system (like, say, CQL) in a language that is generally bi-valued (say, Java)?  You get some problems around null values.  In the real world, null is a thing.  It happens.  People don't fill out all the fields in a form, some values are simply unknown, or dependent on workflow that hasn't happened yet.  But we still have to reason with it.

Here are some interesting things you need to think about:
When you sort a list of objects based on a field, where do the objects whose field is null go?  XSLT got this right by making a decision, even if you don't like it.  So the behavior is defined.
"The set of sort key values (after any conversion) is first divided into two categories: empty values, and ordinary values. The empty sort key values represent those items where the sort key value is an empty sequence. These values are considered for sorting purposes to be equal to each other, but less than any other value. The remaining values are classified as ordinary values."
CQL doesn't actually cover this case.  Here's what it has to say about sorting:
"After the iterative clauses are executed for each element of the query source, the sort clause, if present, specifies a sort order for the final output. This step simply involves sorting the output of the iterative steps by the conditions defined in the sort clause. This may involve sorting by a particular element of the result tuples, or it may simply involve sorting the resulting list by the defined comparison for the data type (for example, if the result of the query is simply a list of integers)."
So NOW what?  Well, I think a minor adjustment, much like what XSLT has to say, is in order here.
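For what it's worth, the XSLT rule (empty keys are equal to each other and sort before everything else) is easy to get in Java with Comparator.nullsFirst.  This is only a sketch of that one policy, with a class name of my own choosing; it is not what the CQL specification says, since CQL leaves the case open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NullsFirstSort {
    /** Returns a sorted copy with null keys grouped ahead of all ordinary values. */
    static List<String> sortNullsFirst(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        // nullsFirst treats all nulls as equal to each other and less than any non-null value.
        copy.sort(Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("stroke", null, "cancer", null);
        System.out.println(sortNullsFirst(keys)); // prints [null, null, cancer, stroke]
    }
}
```

To sort objects by a nullable field, the same comparator can be passed as the second argument to Comparator.comparing along with a getter for the field.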

Type conversion is another issue. If you have a defined process for converting from one type to another, then you should also have a defined process for converting null things that might have been a basic data type into other null things that could be a different data type.  For example, the null string should convert to the null date.

Taking type conversion a step further, the string "asdf192832340asdfa8" when converted to a date might in fact return a null value to indicate there is no conversion.  Or it could raise an error.  That's a decision that needs deciding.
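Here is a minimal Java sketch of one of those two choices: null converts to null, and text that cannot be parsed also converts to null instead of raising an error.  The class and method names are hypothetical, and using ISO-8601 dates via LocalDate.parse is my assumption for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

class NullSafeConversion {
    /** null in, null out; unparseable text also yields null rather than an exception. */
    static LocalDate toDate(String text) {
        if (text == null) {
            return null; // the null string converts to the null date
        }
        try {
            return LocalDate.parse(text); // expects ISO-8601 text such as 2018-02-27
        } catch (DateTimeParseException e) {
            return null; // the "no conversion" choice; rethrowing here is the other option
        }
    }

    public static void main(String[] args) {
        System.out.println(toDate(null));                  // prints "null"
        System.out.println(toDate("asdf192832340asdfa8")); // prints "null"
        System.out.println(toDate("2018-02-27"));          // prints "2018-02-27"
    }
}
```

The cost of this choice is that a caller can no longer tell "there was no value" apart from "there was a value, but it was garbage".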

What happens when you union or intersect two lists where the list itself is null?  At the very least the behavior needs to be defined.  To see where the problem lies, consider the following:

List<String> l = null;

Is l an empty list, or simply null?  People who build collections are in the habit of returning an empty collection rather than null, but sometimes the collection builder itself returns null because it perhaps doesn't even understand the type of null at execution time.  That's actually OK, just return Collections.EMPTY_LIST (which happens to be pretty much identical to Collections.EMPTY_SET).
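One possible policy, sketched below, is to treat a null list as an empty list before doing any set operation.  That is an assumption made for illustration, not defined CQL behavior, and the helper names are hypothetical.  I use Collections.emptyList() here because it is the type-safe equivalent of Collections.EMPTY_LIST.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class NullSafeLists {
    /** Policy choice for this sketch: a null list behaves like an empty list. */
    static <T> List<T> orEmpty(List<T> list) {
        return list == null ? Collections.<T>emptyList() : list;
    }

    /** Union without duplicates, keeping first-seen order. */
    static <T> List<T> union(List<T> a, List<T> b) {
        Set<T> merged = new LinkedHashSet<>(orEmpty(a));
        merged.addAll(orEmpty(b));
        return new ArrayList<>(merged);
    }

    /** Intersection, keeping the order of the first list. */
    static <T> List<T> intersect(List<T> a, List<T> b) {
        Set<T> kept = new LinkedHashSet<>(orEmpty(a));
        kept.retainAll(orEmpty(b));
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        List<String> l = null;
        System.out.println(union(l, Arrays.asList("a", "b")));     // prints [a, b]
        System.out.println(intersect(l, Arrays.asList("a", "b"))); // prints []
    }
}
```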

Life gets dicey around nulls.  There are no easy rules, you have to think about it.

BTW: This isn't dissing CQL. I quite like the language.  But then again, I've been known to write tremendous volumes of code in XSL as well, so that isn't necessarily great praise from someone sane ;-).  I'm simply reporting on some of the challenges I'm having in the hopes that they can be fixed, and that others trying to use it can watch out for the hidden (you might even say unknown) pitfalls that are still being worked out.


Friday, February 23, 2018

How workflow can affect data and reasoning

I've been playing around a bit with the Clinical Quality Language (CQL) lately.  One of the interesting challenges I had to solve was applying logic that was written with one particular data representation in mind to data stored in a different representation.

To simplify the problem, I'll look at something that's pretty typical.
Consider the patient history form, a section commonly appearing on the "Clipboard" given to new patients:

Has anyone in your family ever had:
Cancer              [ ?]
Hypertension    [   ]
Stroke               [X]

You'll note here that the patient might not use the form the way it was intended due to uncertainty about one of the answers.  So perhaps it might later be changed to:

Has anyone in your family ever had:
Cancer               Yes   No  Unknown
Hypertension     Yes   No  Unknown
Stroke                Yes   No  Unknown

When encoding this information, there are a number of ways to store it in the EHR system. If using precoordinated terms, you can simply list SNOMED CT expressions for all the positive items. This is one way to encode the information. However, pre-coordination of all possible cases doesn't exist in any single vocabulary.  You cannot say in a single SNOMED CT term that you don't know if the patient has a family history of cancer, but it can be stated in a post-coordinated SNOMED CT expression. So this kind of result is often captured in question/answer form.

There are at least three additional ways to codify this information in question/answer form:
  1. You can codify the overall question, and give a list of codified answers.
    Q1: Family History? A: Stroke
    This tells you nothing about cancer or hypertension.
  2. You can codify each individual question, and list the answers as a yes/no for each checked / unchecked box respectively.
    Q1: Family History of Cancer? A: No
    Q2: Family History of Hypertension? A: No
    Q3: Family History of Stroke? A: Yes
    This doesn't capture the uncertainty about cancer.
  3. You can codify each individual question, and list the answers as a yes/no or unknown.
    Q1: Family History of Cancer? A: Unknown
    Q2: Family History of Hypertension? A: No
    Q3: Family History of Stroke? A: Yes
    This captures the fine detail across the board.

For an application to be able to reason with the data, you have to consider the various ways in which the question could be asked, and how to detect the appropriate response.

The challenge with clinical decision support and quality measurement is then to determine how to map the questions you have answers to into the questions the decision support is asking.  Sometimes, there isn't a clean match (as for cases 1 and possibly 2 above, as well as coding using precoordinated terms).
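To make the mapping problem concrete, here is a rough Java sketch that reads each of the three question/answer formulations above into one yes/no/unknown answer.  Everything in it (the class, the enum, the method names, and the use of plain strings for conditions) is hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.Set;

class FamilyHistoryAnswers {
    enum Answer { YES, NO, UNKNOWN }

    /** Formulation 1: only positives are listed, so absence cannot be read as "no". */
    static Answer fromPositiveList(Set<String> positives, String condition) {
        return positives.contains(condition) ? Answer.YES : Answer.UNKNOWN;
    }

    /** Formulation 2: one checkbox per question; an unchecked box was stored as "no". */
    static Answer fromCheckbox(Map<String, Boolean> boxes, String condition) {
        Boolean checked = boxes.get(condition);
        if (checked == null) {
            return Answer.UNKNOWN; // the question was never asked
        }
        // Caution: a stored "no" may hide a patient who simply was not sure.
        return checked ? Answer.YES : Answer.NO;
    }

    /** Formulation 3: the form already records yes, no, or unknown. */
    static Answer fromTriState(Map<String, Answer> answers, String condition) {
        return answers.getOrDefault(condition, Answer.UNKNOWN);
    }
}
```

With the clipboard form above, asking about cancer gives UNKNOWN under formulations 1 and 3 but NO under formulation 2, which is exactly the kind of mismatch the decision support logic has to be aware of.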

In CQL, you can easily map codes used to answer questions into a particular form, making it rather easy to change the code systems.

Code systems and codes are specified symbolically, as in:

codesystem "SNOMEDCT": ''
code "FH of Cancer": '275937001' from "SNOMEDCT" 

To change from SNOMEDCT to ICD-10, you might use:
codesystem "ICD10": ''
code "FH of Cancer":  'Z80.9' from "ICD10" 

You could also create a single value set containing both codes.  Presently, CQL does not have a way to define a value set, only to reference them (they are defined elsewhere).  
valueset "FH of Cancer": 'some OID or URL reference'

Your logic would then only need to address one thing: "FH of Cancer", either as a singular code, or a value set.

You might cheat here and use a Concept as a way to create a faux valueset across different terminologies, but this is actually discouraged in CQL.  It might be better to create a CQL list of codes and use the contains operator to determine if the code you have found is in the list of codes.
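The list-of-codes idea looks roughly like this when sketched in Java rather than CQL.  The system identifiers are placeholders I made up for the example (they are not real code system URLs), and the class and method names are hypothetical; only the two code values come from the definitions above.

```java
import java.util.Set;

class FamilyHistoryCodes {
    // Placeholder system identifiers: hypothetical stand-ins, not real code system URLs.
    static final String SNOMEDCT = "urn:example:snomedct";
    static final String ICD10 = "urn:example:icd10";

    // Codes from either terminology that mean "family history of cancer".
    static final Set<String> FH_OF_CANCER = Set.of(
        key(SNOMEDCT, "275937001"),
        key(ICD10, "Z80.9"));

    /** A code only has meaning together with its system, so both go into the key. */
    static String key(String system, String code) {
        return system + "|" + code;
    }

    static boolean isFamilyHistoryOfCancer(String system, String code) {
        return FH_OF_CANCER.contains(key(system, code));
    }

    public static void main(String[] args) {
        System.out.println(isFamilyHistoryOfCancer(ICD10, "Z80.9"));     // prints "true"
        System.out.println(isFamilyHistoryOfCancer(ICD10, "275937001")); // prints "false"
    }
}
```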

This doesn't get at negation/unknown logic that might also be needed in formulations 2 and 3 above.  To do that, you can define a function that checks for varying formats, and might also use various value sets for "Yes"/"No"/"Unknown" as possible variations.

When you get right down to it though, the decision that determines how reasoning needs to be done is often taken long before the reasoning is ever implemented.  And you cannot expect that to change quickly, because simply changing the form from the first example to the second can take quite a bit of time in a provider organization.

CQL goes a long way towards making clinical decision and quality measure logic reusable and mappable to provider workflows, but it is still missing a few pieces to make it truly easy to separate logic from data.