Sunday, September 9, 2012

Successes at the HL7 FHIR Connectathon: Playing with FHIR

So, I came to the FHIR Connectathon this week empty-handed and with an open mind.  I hadn't even had time to read the for-comment-only ballot because of Meaningful Use.  But if FHIR is as easy as I said it might be, I thought I might be able to cobble something useful together while I was there.  Just about all of the tools I've written over the years, however, have something to do with CDA, V2 messages, or IHE Profiles.  So if I wanted to take advantage of what I've already done, I needed to do something that used some of those.  I decided this morning that I'd develop a PIX/PDQ to FHIR Gateway for Person resources.

PIX and PDQ messages are very simple.  They express a couple of query parameters in a V2 message, and expect back a V2 message that contains the appropriate patient records.

I already had tools to create test PIX/PDQ query messages, execute the queries against a PIX/PDQ Server, and return the results in a web page.  I also had a quick and direct (and relatively dumb) transformation of V2 to XML (not the standard version, but close enough).  And I dream in XSLT, so it should be pretty easy.

Here was my plan:
1.  Create a listener for PIX/PDQ Messages
2.  Turn them into XML
3.  Extract the query parameters into an appropriate Query Structure
4.  Issue the Query to a FHIR Server (RESTfully)
5.  Translate the XML Response into an HL7 V2 Message (using XSLT)

I could then use my PIX/PDQ Test Harness to create and send the PIX/PDQ messages to my local server, and then gateway that to FHIR, creating a PIX/PDQ Server.

If I had time, I could also go the other way, turning certain FHIR Queries into something that a PIX/PDQ Server could respond to.  Since I'm not sitting on a PIX/PDQ Server yet, I'd just reuse the one I just created.

Here's how it turned out in detail (or you can cut to the chase):

Times below are approximations.  I know exactly when I started and stopped.  Line counts are nearest half-decimal approximations from real counts.  If I say 10 lines, it could have been anywhere between 7 and 12, but I did count them.

8:30 AM.
I spent about an hour reconfiguring my development environment and finding the code resources that I wanted to use.  If I were an average developer, I'd already have these in place.  But I develop in brief spurts with long lapses in between.  And I hadn't needed to run a web service on this laptop since it had been reloaded (shortly after HIMSS).

9:30 AM
I read the three key pages of FHIR Specifications that I needed to.  Wrote about a page of design notes which became the start of this post.  I tweeted, chatted, and handed out more silly ribbons.  That took about 15 minutes.    Knowing what to look for was important, and here, having been through the brief FHIR tutorial at the last working group meeting helped, but wasn't essential.  The entire spec is less than 150 printed pages.  Here are the key pages:
  1. Person Resource
  2. Searching
  3. HumanID (one of the data types)
    9:45 AM
    I built the server listener.  I've written so many socket listeners that it was just copy and paste from another project.  That took 10 minutes to find.  It was so simple, I had to go look it up again to check that I'd done it right.  I had.  This took another 15 minutes, most of it spent verifying that I had the right code.  Here's my basic template for that sort of thing (and I put it here so I can find it again later):

    import java.io.*;
    import java.net.*;

    public class Server extends Thread {
      private Socket clientSocket;
      String server = "localhost";
      static int port = 3600;

      Server(Socket s) { clientSocket = s; }

      // Optional first argument overrides the default port
      public static void main(String args[]) throws IOException
      { if (args.length > 0)
          port = Integer.parseInt(args[0]);
        runServer();
      }

      // Accept connections forever, handing each one to its own thread
      private static void runServer()
      { try {
          ServerSocket serverSocket = new ServerSocket(port);
          while (true)
            new Server(serverSocket.accept()).start();
        } catch (IOException e) {
          e.printStackTrace();
        }
      }

      // Per-connection work; process() is built in the next step
      public void run()
      { try {
          InputStream is = clientSocket.getInputStream();
          OutputStream os = clientSocket.getOutputStream();
          process(is, os);
        }
        catch (Exception ex) { ex.printStackTrace(); }
        finally
        { try { clientSocket.close(); }
          catch (Exception ioex) { ioex.printStackTrace(); }
        }
      }
    }

    The next step was to build the all-important process() function.  It does a few things:
    1. 10:10 AM First check to see that my listener works.  It did on the first try.  Time to chat, and unload and reload on coffee.
    2. 10:25 AM  Transform the HL7 MLLP Message into an XML structure.  That's about one line of code.  Verifying that it worked took a bit of time because I was missing a chunk of output.  It turns out my code to write the XML tree closed the stream it was writing to, which happened to be my console output. So the next output went to a closed stream. I fixed that.  It's a bad idea for a function to close a stream it doesn't open (but a good idea to flush).  What should have been a couple of minutes took 15 to debug. 
    3. 10:40 AM I refactored my little HL7 XML utility package to split up its one former function (reading V2 from a stream and writing to an XML file) into a couple of parts.  So, now it could read from a stream and create a DOM tree, and then write the tree to an XML file. (10 new lines of code/5 minutes).
    4. 10:45 AM I added a function that could grab any field, segment or component in the XML tree by its element name.  (20 new lines of code/10 minutes).
    5. 10:55 AM I used that to extract the three key fields from the message like so.  That, plus chit-chat and a few more tweets took maybe 5 minutes:
      String sourceNamespace = x.getPart("QPD.3.4.2"),
             targetNamespace = x.getPart("QPD.4.4.2"),
             identifier = x.getPart("QPD.3.1");

    6. 11:00 AM Spent another 10 minutes figuring out what URL I needed, refreshing myself on HttpURLConnection and URL encoding, chatting, tweeting, and getting more coffee.
    7. Somewhere in all of this, we found an issue with search parameters for Person.  There were no search parameters for DOB.  Now this isn't an issue for PIX, but it clearly is for PDQ.  I explained my use case for this to Grahame.  We want to match patients with this name, DOB and Gender.  He agreed that it should be in his code, but for some reason, we found that it wasn't in the spec.  So he added it.  Then I complained that I didn't want dob-before and dob-after, simply dob.  He added that too.  That probably took a half hour give or take.  That brings us up to 11:40.  Add another twenty minutes for chit-chat and lunch.
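
The lookup-by-element-name idea from steps 4 and 5 can be sketched roughly as follows.  This is a minimal illustration, not the code from my utility package: the class name PartLookup and the sample XML layout are assumptions made up for the example, and only the getPart() name and the QPD.x.y element names come from the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class PartLookup {
    private final Document doc;

    // Parses the (hypothetical) V2-like XML into a DOM tree.
    PartLookup(String xml) throws Exception {
        doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the text of the first element with the given name, or null if there is none.
    String getPart(String name) {
        NodeList hits = doc.getElementsByTagName(name);
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumed layout: one element per segment, field and component, named by position.
        String xml = "<QPD><QPD.3><QPD.3.1>12345</QPD.3.1>"
                   + "<QPD.3.4><QPD.3.4.2>1.2.3.4</QPD.3.4.2></QPD.3.4></QPD.3></QPD>";
        PartLookup x = new PartLookup(xml);
        System.out.println(x.getPart("QPD.3.1"));   // 12345
        System.out.println(x.getPart("QPD.3.4.2")); // 1.2.3.4
    }
}
```

Returning null for a missing part keeps the caller in charge of deciding whether an absent field is an error.
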
    The next half was actually performing the query on Grahame's server.  I skipped lunch and kept on coding.  I talked to Grahame about searching on identifier, and we found a second problem in the FHIR spec.  The issue is that you can search on identifier using "root" and "extension" (in V3 II terms; in FHIR, these are system and id) or just on extension.  The search parameter is what FHIR calls a qtoken.  A qtoken is a string in the form system:string.  But system is always a URI, so you wind up with something like: oid:, and that's no good, because : is a special character in some URIs such as urn:hl7-org:v3, and so you never know what part is really the system.  So we agreed to use the hash (#) character as a separator.  So now the identifier query parameter looks like this: oid:
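
To make the ambiguity concrete, here is a small sketch.  The class name QTokenSplit and the sample values are made up for illustration, and splitting at the last # is just one plausible reading of what we agreed on, not something taken from the spec.

```java
public class QTokenSplit {
    // Splits "system#id" at the last '#'; a token with no '#' is treated as an id with no system.
    static String[] split(String token) {
        int i = token.lastIndexOf('#');
        return i < 0
                ? new String[] { null, token }
                : new String[] { token.substring(0, i), token.substring(i + 1) };
    }

    public static void main(String[] args) {
        // With ':' as the separator, nothing says where the system stops and the id starts.
        String colonForm = "urn:oid:1.2.3:12345"; // hypothetical system and id
        System.out.println(colonForm.split(":").length); // 4

        // With '#', the boundary is explicit.
        String[] parts = split("urn:oid:1.2.3#12345");
        System.out.println(parts[0] + " | " + parts[1]); // urn:oid:1.2.3 | 12345
    }
}
```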

    Except that now, I'm not able to get the right results back.  Grahame discovers that he hasn't implemented the full identifier search and does so.  I test again, and it doesn't work.  Whoops, now he has to rebuild his index again so that it will.

    And now I try to craft the URL and hit it from a browser, and I run into a URL encoding problem.  You see, : and # are special characters in URLs, so they have to be URL encoded.  But when I type them in, it's not working.  Well, it turns out that's because Chrome (like IE) cleans up malformed URLs that are hand entered, automagically encoding them for us stupid humans.  Except that it didn't do what I wanted it to THIS time, which was to URL encode both the : and the #.  Instead, it encoded the : but treated the # as the fragment identifier delimiter.  And so Grahame got a query he doesn't support (because a query on all system parts of an identifier could be huge).  Once I figured that out, I changed my query parameter to the correctly encoded string: oid%3A1. and it worked.
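
The same effect is easy to reproduce in Java.  The sketch below uses a made-up host and identifier (example.org, oid:1.2.3, 12345) and shows that an unencoded # ends the query string, while encoding the whole value keeps it intact.  FragmentDemo and its two helper methods are names invented for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class FragmentDemo {
    // Returns whatever follows an unencoded '#' in the URL.
    static String fragmentOf(String url) throws URISyntaxException {
        return new URI(url).getFragment();
    }

    // URL encodes a single query parameter value.
    static String encodeValue(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String raw = "http://example.org/person?identifier=oid:1.2.3#12345";
        System.out.println(new URI(raw).getQuery()); // identifier=oid:1.2.3
        System.out.println(fragmentOf(raw));         // 12345 (never reaches the query)

        System.out.println(encodeValue("oid:1.2.3#12345")); // oid%3A1.2.3%2312345
    }
}
```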

    But I was getting back just the HTML in the atom feed entries.  Well, that's no damn good.  So, to the rescue comes Grahame, pointing out that if I add &format=xml (or &format=text/xml) to my query, it will give me what I want.  Great, I hand enter the URL, and it worked.  Now I code it.  I hard code Grahame's service URL, and add "?format=xml&identifier=" + URLEncoder.encode("oid:" + targetNamespace + "#" + identifier) to the end of it.

    I get back a Microsoft SQL Server violation screen, reporting that "and" was unexpected.  So Grahame goes digging and finds a bug in the way he parses the string.  It turns out that putting format at the beginning of the query parameters produces a SQL query that contains two ANDs in a row.  He fixes that.  Meanwhile, I move &format=xml to the end of the string.  And mess up, URL encoding that, which is bad.  So I have to figure that out, and send Grahame up a blind alley.  Then I fix my error.
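
One way to avoid that kind of mistake is to encode each parameter name and value separately and leave the ? and & separators alone.  The following is a sketch of that approach, not the code I wrote that day: QueryBuilder, the example.org base URL and the identifier value are all assumptions for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryBuilder {
    // Appends the parameters in map order, encoding names and values but not the separators.
    static String build(String base, Map<String, String> params)
            throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder(base);
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep)
              .append(URLEncoder.encode(e.getKey(), "UTF-8"))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            sep = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // LinkedHashMap keeps insertion order, so format stays last.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("identifier", "oid:1.2.3#12345"); // hypothetical system and id
        params.put("format", "xml");
        System.out.println(build("http://example.org/person", params));
        // http://example.org/person?identifier=oid%3A1.2.3%2312345&format=xml
    }
}
```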

    By this time, lunch has been over for a while, and people are coming back in the room.  So, call it about 1:30 PM now.  I've managed to add three lines of code to the process() function in 90 minutes.  Not very effective.

    So, I now have two things:
    The inbound HL7 message in XML format, and an XML result containing the data that I need.  I need to take both of these items, and turn them into a single XML document containing the response that is expected by my PIX Client.

    XSLT to the rescue.  But first I need to find a good PIX Response example.  I dig out the ITI TF:2a (where the PIX message is defined), and there are none there.  I look in TF:2x (appendixes), and find none there.  I check the implementation materials on the IHE Web site, and find none there.  I Google PIX IHE Example and win.  I spend the next 10 minutes on the Google Doodle honoring the 46th anniversary of Star Trek.

    Call it 2:00 at this point.  I write an XSLT to convert the inbound PIX message to an outbound response, swapping sender and receiver in the MSH Segment, copying some stuff from QPD to QAK, and supplying parameters for time-stamp and message-id.  That's the top part of the PIX response.
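
As a rough sketch of the header part only, the Java below runs a tiny stylesheet that swaps the sending and receiving application (MSH-3 and MSH-5) and fills in the time-stamp (MSH-7) and message control id (MSH-10) from parameters.  The msg wrapper element, the MSH.n element names, the parameter names and the ResponseHeader class are assumptions for illustration; my real V2-like XML and stylesheet are not shown in this post, and the QPD to QAK copying is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ResponseHeader {
    // Hypothetical stylesheet: builds the response MSH from the inbound MSH.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:param name='messageId'/>"
      + "<xsl:param name='timestamp'/>"
      + "<xsl:template match='/msg'>"
      + "<MSH>"
      + "<MSH.3><xsl:value-of select='MSH/MSH.5'/></MSH.3>"   // receiver becomes sender
      + "<MSH.5><xsl:value-of select='MSH/MSH.3'/></MSH.5>"   // sender becomes receiver
      + "<MSH.7><xsl:value-of select='$timestamp'/></MSH.7>"
      + "<MSH.10><xsl:value-of select='$messageId'/></MSH.10>"
      + "</MSH>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the inbound XML, passing message id and time-stamp as parameters.
    static String transform(String inboundXml, String messageId, String timestamp)
            throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        t.setParameter("messageId", messageId);
        t.setParameter("timestamp", timestamp);
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(inboundXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String inbound = "<msg><MSH><MSH.3>CLIENT</MSH.3><MSH.5>SERVER</MSH.5></MSH></msg>";
        System.out.println(transform(inbound, "42", "20120908140000"));
    }
}
```

Passing the time-stamp and message id in as parameters keeps the stylesheet free of anything that changes from one message to the next.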

    Writing and testing that probably took another 30 minutes, with lots of distractions.  At this point, most folks in the room are talking about what they've done, demoing code, and I spend a lot of time paying attention to two things at once, sometimes three.  Somewhere in all of this, my wife hip dials me from the Back-to-school Jamboree she's running for the district, and I run out to take that call.  Last night there had been a minor event requiring several stitches to my youngest, and all we had was text messaging for communication.  Not the best way to communicate from the ED (she's OK, BTW.  No permanent damage).  So, now I needed to make sure she hadn't meant to call me about some other minor emergency.  Anyway, calling back someone who's hip-dialed you, when they can't hear their phone in a loud place, took a bit.

    So, now I have to take the FHIR Person Resource XML, and turn that into the PID segment.  Mike Henderson and I had some conversation in the earlier part of the day about the fact that the FHIR Server could return several person records that matched my identifier query.  He and I agreed that it was probably the responsibility of the server to ensure that didn't happen, although Grahame later indicated that I was probably passing the buck.  Another tweet.  And cookie break time.

    So, anyway, with all the other distractions going on, it probably took me right up until 4:30 until I had that done.  I listened with half an ear to other folks, and managed to string together the two pieces of XML into one.  At which time it was my turn to show what I'd done.

    I spent maybe 5 minutes showing the testing page, the XML inbound message, my URL Request, and the XML outbound message that would be going back to my page.  I explained that I hadn't been able to finish the part that would convert my V2-like XML into an ER7 message, but that if I had, my test page would have shown the results.

    The Chase

    Basically, in a day, with a lot of distractions, I cobbled together a PIX to FHIR Gateway for the V2 PIX Query message, and was able to test it with one system.  I didn't put in the filtering code that would deal with optional parameters, but I had the basic functionality.  This isn't production code by any stretch.  At the very best, you'd call this proof-of-concept.  The end of research and the beginning of development.  I didn't complete my goals totally, nor did I get to my stretch goals.  But, I have something that I can use at the next one of these events (which is in the planning stages now).

    The FHIR Connectathon was a great success.  Its chief accomplishment was in identifying key issues that the FHIR specification (still in draft form) needs to address.  It was like an IHE Connectathon, and that's fine.  We can grow into it.  We learned quite a bit about FHIR just by having this event, and I hope they become routine for every HL7 specification.

    Is FHIR Living Up to Its Promise?

    Hell yes.  Honestly, I spent more time dinking around with V2 stuff than I did with FHIR.  The question that PIX answers in 50 pages of specification and appendices can easily be resolved with a profile of FHIR in half that space or less.  If I can write it in a day, a fresh-out-of-college CS Major can handle it in a week, and toughen it up for production in a month.  Some of you could probably do it in your sleep.

    What FHIR isn't doing yet is describing how to build higher level services from the resources it is defining.  There will still be a place for a PIX-like service definition.  We'll have to work on what that looks like next.  And that is another reason why I also say that FHIR is living up to its promise, because we can already see where we need to go with it after the resources are defined.


    1. A 7 minute video showing the FHIR connectathon in action, and with interviews with some of the participants can be found here:

    2. Interesting. But you don't need to include the format in the query string. REST purists would rather the caller specify the desired format in the Accept HTTP header of the request.

    3. I agree that Accept Header is best, and in fact, I put it in both places. The need for format in the query string is because NOT all clients give you control over the Accept header. When testing in browser, I need it, and when accessing a RESTful resource via XSLT document() function, I also need to have the query parameter.

    4. For in-browser testing I would highly recommend a lightweight REST client like Postman for Chrome