Saturday, July 13, 2019

Optimizing Inter-Microservice Communications using GZip Compression in HAPI on FHIR

It's widely known that XML and JSON both compress really well.  It's also pretty widely known that one should enable GZip compression on server responses to improve server performance.  Not quite as widely known: you can also compress content being sent to the server (for POST or PUT requests).  Also, most people can tell you that JSON is smaller than XML.

And size matters when communicating over a network.

So it should be obvious that one should always GZip compress data whenever possible when connecting between two servers, right?

Uhmm, not so much, but you could already see that coming, because what would I write a blog post about if it were true?

Here's the thing.  Compression saves time for two reasons:

  1. It takes less time to transmit less data.
  2. There's less packet overhead with less data.
But it also takes CPU time to compress the data.  So long as the CPU time taken to compress the data on one side, and uncompress it on the other side, is LESS than the savings in transmission and packet overhead, it's a net win for performance.

Let's look at data transmission:

A maximum transmission unit (MTU) is about 1400 bytes.   This takes a certain amount of time to transmit over the network.  Here are some values based on different networking speeds:
Bandwidth (Mbps)    Time (ms)
      5               2.24
     10               1.12
     20               0.56
    100               0.112
    200               0.056
    300               0.037
   1000               0.012
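
For reference, each of those values is just the packet size divided by the bandwidth: time (ms) = (1,400 bytes × 8 bits/byte) / (bandwidth in Mbps × 1,000).  So at 5 Mbps, that's 11,200 / 5,000 ≈ 2.24 ms.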

Depending on network speeds, saving a single packet saves anywhere from 12 µs to 2.2 ms.  This isn't very much, but if you have to send more than one packet, then you also have interpacket latency, which is basically dealing with round-trip times from client to server for acknowledgements.  ACKs don't need to be immediate in TCP (a certain number of ACKs can be outstanding at once, so there's no latency introduced on every packet sent), but your network latency (generally measured on the order of 10s of ms) also has an impact on throughput.

I ran an experiment to see which method was fastest when sending data in a POST/PUT request, using GZip or not using GZip, and the results were interesting.  I sent 200 create requests in which I controlled for the size of the resource being sent, in terms of the number of packets required to send it, from 1 to 10 packets of data (where by packet, I mean the size of a single TCP segment transmission, controlled by maximum MTU size).  I sent the requests in two different formats (XML and JSON), over three different networks.

For a control network, I used localhost, which involves no transmission time or effective latency.  I also did the transmission over my local network, so that it actually went from my system, to my router, and then back to my system.  And then finally, I transmitted from my system to my external IP address (so it left the router, went to my cable modem and came back through it).

I burned the first batch of 200 requests to initialize the code through the JIT compiler.

Here's what I found out:

  1. Don't bother compressing on localhost, you are just wasting about 2ms of compute on a fast machine. 
  2. Don't bother compressing within your local network (i.e., to a switch and back).  Again, about 2ms loss in compute on a fast machine.
  3. Going across a network boundary, compress JSON after 3 packets, and XML always*.
  4. Use JSON rather than XML if you are using a HAPI server.  JSON is ALWAYS faster for the same content.  For smaller resources, the savings is about 20%, which is fairly significant.

What does this mean for your microservices running in your cloud cluster?  If they are talking to each other over a fast network in the same cluster (e.g., running on the same VM, or within the same zone with a fast network), compression isn't warranted.  If they are communicating across regions (or perhaps even different zones within the same region), then it might be worth it if your content is > 4.5K, but otherwise not.  A single resource will generally fit within that range, so generally, if what you are compressing is a single resource, you probably don't need to do it.
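
In HAPI on FHIR terms, that decision is mostly about whether you register an outbound compression interceptor on the client.  Here's a minimal sketch of making that call based on payload size; it assumes HAPI's GZipContentInterceptor (check the package name for your HAPI version) and uses the ~4.5K threshold above:

import ca.uhn.fhir.context.FhirContext;
import ca.uhn.fhir.rest.client.api.IGenericClient;
import ca.uhn.fhir.rest.client.interceptor.GZipContentInterceptor;
import org.hl7.fhir.r4.model.Patient;

public class CompressionDecision {
   // Roughly the 4.5K (about 3 packets) threshold from the findings above
   private static final int COMPRESSION_THRESHOLD_BYTES = 4500;

   public static void main(String[] args) {
      FhirContext ctx = FhirContext.forR4();
      IGenericClient client = ctx.newRestfulGenericClient("http://example.org/fhir");

      Patient patient = new Patient();
      patient.addName().setFamily("Example").addGiven("Pat");

      // Only pay the CPU cost of compressing (and decompressing on the other side)
      // when the payload is big enough for compression to win.
      String json = ctx.newJsonParser().encodeResourceToString(patient);
      if (json.length() > COMPRESSION_THRESHOLD_BYTES) {
         client.registerInterceptor(new GZipContentInterceptor());
      }

      client.create().resource(patient).execute();
   }
}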

It won't hurt much: you'll lose a very little bit of performance (less than 5% for a single request if it doesn't take much work), and much less if you do something like a database store [all my work was just dumping the resource to a hash table].

The very limited savings you get from turning on outbound compression in the client when making an interservice request swaps compute time (what you pay for) for network time (which is generally free within your cloud), and saves you precious little in the performance of a single transaction.  So any savings you get actually comes at a financial cost, and provides very little performance benefit.  Should your cloud service be spending money compressing results, or delivering customer functionality?

Remember also when you compress, you pay in compute to compress on one end, and decompress on the other.

    Keith

* My results say the compression is faster, but the difference in results (< 2%) isn't statistically significant for fewer than 2 packets of XML.  I had to bump the number of tests run from 20 to 200 to get consistent results in the comparison, so it probably IS a tiny bit faster; I'd just have to run a lot more iterations to prove it.

Wednesday, July 10, 2019

What's your Point of View when writing code

Apparently I treat code as a collaboration with the system I'm writing the code for.  I was reviewing my comments in some code I'd written, and all the comments were written in the first person plural: we, us, et cetera.  When I review other code or documentation, it's all third person: the system, the component, et cetera.  I'm also writing for my team, and the we/us includes them in the conversation, but the we/us in my head is me and the computer.

Does anyone else have the same feel for this?  Do your comments talk to others, yourself, or you and the computer system that's running it?

   Keith







Wednesday, June 26, 2019

Set Theory Much? Yeah ... me too.


Not is sometimes Knotty, or perhaps nutty.
As I'm building out queries in my FHIR Server, I recall one of the challenges I had in the interpretation of negation in relation to tests the last time I did this.

Here are a couple of queries, see if you can sort them out the same way I did.  For context, assume that you've got some labs, some radiology, and some other stuff, and perhaps the only way you can find the other stuff (e.g., an EKG) is that it isn't actually coded in any way.  That's sometimes the case for the other stuff after all.


DiagnosticReport?patient=99999&category:not=LAB,RAD
DiagnosticReport?patient=99999&category:not=LAB&category:not=RAD

If you recall, DiagnosticReport.category is multi-valued as well, making it even more interesting.

Before diving in, let's talk about some queries and some data.  Perhaps you have some tests that are EKG results (neither labs, nor radiology).

Now, let's look at it the other way first:

DiagnosticReport?patient=99999&category=LAB,RAD
Returns any report where DiagnosticReport.category is coded using LAB, OR is coded using RAD, or is coded both ways.

Since category is a list (effectively a set of codes), the interpretation here is DiagnosticReport.category intersect (LAB, RAD) is non-null.  Another way to say this is |DiagnosticReport.category intersect (LAB, RAD)| > 0 (where |set| is the cardinality or size operator).

DiagnosticReport?patient=99999&category=LAB&category=RAD
Returns any report where DiagnosticReport.category is coded both as LAB, and as RAD.

And the interpretation here is DiagnosticReport.category intersect (LAB) is non null AND DiagnosticReport.category intersect (RAD) is non null.  We could also say DiagnosticReport.category is a superset of (LAB) AND DiagnosticReport.category is a superset of (RAD).  Which allows us to join this second one as DiagnosticReport.category is a superset of (RAD, LAB) or yet another way: |DiagnosticReport.category intersect (LAB, RAD)| = 2.

Now, throw :not at the problem, and it becomes knotty indeed.

DiagnosticReport?patient=99999&category:not=LAB,RAD
The way I want to read this is that DiagnosticReport.category contains neither LAB nor RAD (or: DiagnosticReport.category intersect (LAB, RAD) is null, OR |DiagnosticReport.category intersect (LAB, RAD)| = 0).

But what then is this?
DiagnosticReport?patient=99999&category:not=LAB&category:not=RAD
Well, follow the logic (bomb).  DiagnosticReport.category is NOT a superset of (RAD, LAB), or yet another way |DiagnosticReport.category intersect (LAB, RAD)| != 2.  These are the reports that aren't both.

Did that all make sense to you?  Because I'm still scratching my head.

Oh but wait, there's more:  If DiagnosticReport.category is missing, does this work?  Actually, yes, because it would be returned for both queries using :not, which would be correct.
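
If it helps, here's how I'd state those two readings as plain set tests; this is just a sketch of the interpretation above (category codes as simple strings), not how a server actually implements search:

import java.util.Set;

public class NotQuerySemantics {

   // category:not=LAB,RAD  =>  category intersect (LAB, RAD) is empty
   static boolean notCombined(Set<String> categories) {
      return categories.stream().noneMatch(c -> c.equals("LAB") || c.equals("RAD"));
   }

   // category:not=LAB&category:not=RAD  =>  NOT a superset of (LAB, RAD),
   // i.e. the report isn't coded as BOTH
   static boolean notSeparate(Set<String> categories) {
      return !(categories.contains("LAB") && categories.contains("RAD"));
   }

   public static void main(String[] args) {
      Set<String> ekg = Set.of();                   // the uncoded "other stuff"
      Set<String> labOnly = Set.of("LAB");
      Set<String> labAndRad = Set.of("LAB", "RAD");

      System.out.println(notCombined(ekg));         // true  - returned by both queries
      System.out.println(notSeparate(ekg));         // true
      System.out.println(notCombined(labOnly));     // false - LAB is present
      System.out.println(notSeparate(labOnly));     // true  - but it isn't both
      System.out.println(notSeparate(labAndRad));   // false
   }
}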

But that probably isn't how you thought you'd write those queries in FHIR, is it?  Yeah, me either.  Or is it me neither?  Either way, I think I've got it right now (and written, too).



Monday, June 17, 2019

Telling Time the HL7 Way

If you've never been to an HL7 Working group meeting, you'll run into some shorthand that long-time HL7'ers know that you'll have to catch up on.  The first is how we split up the day.

Officially, the day has 4 quarters, with breakfast, lunch and two breaks:


  • Breakfast starts 8-ish, and goes until 9:00.
  • Q1 goes from 9-10:30am
  • Morning Break is 10:30-11am.
  • Q2 is from 11am - 12:30pm
  • Lunch goes from 12:30-1:45pm.  There's plenty of time to eat, call home, and take a short meeting.
  • Q3 is 1:45-3pm and is a "short" quarter by 15 minutes.
  • Cookie break is 3-3:30pm.
  • Q4 goes from 3:30-5pm.


A "Q0" meeting (not part of the official nomenclature, but still well-understood) is before breakfast, usually 7-ish, but could also be "overlapping" with breakfast.

"Q5" and "Q6" are generally "after 5" till about 6:30-ish, and after "Q5" till whenever...  This is often where some good work happens (some would even say "the real work").

If you are doing HL7 meeting stuff from Q0 to Q6, you still have 12 hours for your day job and sleep.  Your mileage may vary.

Monday after 5pm is the cochair's dinner.  If you want to hang with a cochair, they are likely busy from 5-7:30 or so Monday night.

Wednesday starting around 5:30 is the HL7 Reception.  This goes until about 7:30.

The first half of Monday in September is the Plenary session.

Monday and Tuesday at the WGM in January are the two Payer Summit days.

Connectathons are Saturday and Sunday before the Working group meeting.  Quarters?  Yeah, kinda.  We have them, food shows up at the right times.  But it's a Connectathon, software is ready when it's ready.  Some have been known to work until Q8 or 9, and maybe even start at Q -1.

I wanna say board meetings happen somewhere in Q3 and 4 on Tuesdays, but it's really up to the chair.

Technical Steering Committee (a governance committee) meets Saturday and Sunday.
International Council is Sunday and Thursday Afternoon.
Education Facilitators Lunch is Monday most meetings.

   Keith



Friday, June 7, 2019

What's your Field of View?

When you look at something under a microscope, what you see varies based on the level of magnification.  How much you can see and distinguish fine detail depends essentially upon your field of view.

One of the things that I've been looking at recently is personal health data stored in consumer apps and wearable devices.  Most of the details here amount to a FHIR Observation of some sort, with a code to describe the data element (and a value as a code, or quantity, or perhaps even a waveform).  We know that codes are computer friendly, but they aren't people friendly (and software developers ARE people, regardless of what others might tell you).

So, when everything is an observation, it gets messy for software developers who want nice, easy to remember mnemonics and JSON stuff that is focused right where they are focused.  Things that FHIR can capture and store, but maybe FHIR isn't actually the right place for those working in this space.

PCHA and Continua have some specifications in this space too, but again, NOT easy for developers to use, because once again, too much focus on the terminology, and not on what the developer is trying to do.

We need to find a way to move terminology out of the way.  Open mHealth looks like it's at a better place for this space, but folks who've invested heavily in FHIR and other standards don't agree.  But wait, what if those developers aren't my audience?  What then?

It all depends on your field of view.  And mine, as usual, is many and varied.

   -- Keith



Wednesday, June 5, 2019

Best practices for Logging and Reporting errors in FHIR

Over the years I've developed a number of micro-services implementing and using FHIR APIs.  I've developed a number of best practices for logging and reporting on errors that occur.  Some of these follow.


Logging


  1. If a call to your API is not validly formed, log this as a warning in your service's log.  You detected an error in user input, and handled it properly.  This is NOT an error in your application, it is an error in the calling application.  You DO want to WARN someone that the calling application isn't calling your application correctly.  You don't want to alarm them that your application isn't working right, because in fact, it is working just fine.
  2. If something happened in a downstream API call that prevents the proper functioning of your application (e.g., a database read error), this is improper operation of the system, and is an ERROR preventing your service from operating (even though there's nothing wrong in the service itself), and should be logged as such.  
  3. IF you implement retry logic (a sketch follows this list), then:
    1. Log as warnings any operation that failed but finally succeeded through retry logic.
    2. Log as errors any operation that failed even after retrying.
  4. If an exception was the cause of an error, consider:
    1. If you KNOW the root cause (a value is malformed), say so in the log message, but don't report the stack trace. This will cut unneeded information from your logs, which you will be thankful for later. For example:
      try {
         int value = Integer.parseInt(fooQueryParameter.getValue());
      } catch (NumberFormatException nfex) {
         LOGGER.warn(
            "Foo query parameter ({}) must be a number.",
            fooQueryParameter.getValue());
      }
    2. If you don't know why the error occurred (there could be multiple reasons), do report the stack trace in the log:
      try (PreparedStatement st = con.prepareStatement(query)) {
         ResultSet result = st.executeQuery();
      } catch (SQLException jex) {
         LOGGER.warn("Unexpected SQL Exception executing {}",
            query, jex);
         throw new InternalErrorException(...);
      }
    3. Consider pruning the stack trace at the top or bottom.  From the bottom, because you know your entry points, and the infrastructure before that probably isn't that useful to you (e.g., Tomcat, WildFly).  From the top, because details after your code made the call that threw the exception aren't necessarily something you can deal with.
  5. DO report the query used (and where possible, parameter values in the query) in the log. Consider also reporting the database name when using multiple databases. I have often seen database exceptions like "parameter 1 has invalid type" with no query included, and no values.
  6. Consider how you might implement retry logic in cases of certain kinds of exceptions (e.g., database connection errors).
  7. Use delimiters in your logging output format to make it easier to read them in other tools (e.g., spreadsheets).  I often use tab delimiters between the different items in my logging configuration: e.g.,
    %d{yyyy-MM-dd HH:mm:ss.SSS}\t[%thread]\t%-5level\t%logger{36}\t- %msg%n
  8. Consider reporting times in the log in a timezone that makes sense for your implementation (and more importantly, to your customer).  When your customer reports they had a problem at 9:33am, you want to be able to find issues at that time in the logs without having to compute offsets (e.g., from GMT ... do you know yours).
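
For item 3, the shape of the retry wrapper I have in mind looks roughly like this; a minimal sketch (SLF4J logger, plain java.util.function.Supplier, at least one attempt assumed), not production code:

import java.util.function.Supplier;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class Retry {
   private static final Logger LOGGER = LoggerFactory.getLogger(Retry.class);

   // Warn when an operation succeeds only after retrying (3.1); log an error,
   // with the last failure's stack trace, when it fails even after retrying (3.2).
   static <T> T withRetry(String description, int maxAttempts, Supplier<T> operation) {
      RuntimeException lastFailure = null;
      for (int attempt = 1; attempt <= maxAttempts; attempt++) {
         try {
            T result = operation.get();
            if (attempt > 1) {
               LOGGER.warn("{} succeeded on attempt {} of {}", description, attempt, maxAttempts);
            }
            return result;
         } catch (RuntimeException ex) {
            lastFailure = ex;
         }
      }
      LOGGER.error("{} failed after {} attempts", description, maxAttempts, lastFailure);
      throw lastFailure;
   }
}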

Reporting Errors in OperationOutcome

  1. Use 400 series errors like 400 or 422 when the fault is on the part of the client (e.g., invalid operation syntax, or unsupported query combination).  NOTE: HAPI on FHIR will report unsupported query combinations as a 500 series error (which I fix using an interceptor or filter).
  2. Use 500 series errors like 500, 503 or 507 when the fault is on the part of your service.
  3. DO tell the calling user what the problem is in easy to understand language, and if possible, include corrective action they can perform to address the issue.
  4. DO NOT include content from Exception.getMessage().
    I sometimes see this:
    catch (Exception e) {
      outcome.addIssue()
        .setSeverity(IssueSeverity.ERROR)
        .setDiagnostics(e.getMessage());
      throw ...
    }
    This is not good behavior.  You often have no clue what is in e.getMessage(), and often no control.  It can leak information about your technology implementation back to the API user, which can expose vulnerabilities (see below).
  5. DO NOT include the stack trace in the OperationOutcome.  This belongs in your logs, but not in the user response.  See the OWASP Error Handling cheat sheet.
  6. For database errors, you might want to report the kind of database (e.g., patient chart, provider list), but not the exact name of the database.  Again, you want to be clear, but avoid leaking implementation details.

Use Error Codes

Finally, consider creating error codes (which can be reported to the user).  Report the error code WITH the human readable message.  The value of unique error codes is that:
  1. The error code does tell you where in your code the error occurred, but doesn't expose implementation details.
  2. Error codes can be associated with messages in ways that enable translation to multiple languages
  3. Error codes can also be associated with actions that users can take to correct the error (if it is on their part), or which your operations staff can take to either further diagnose OR correct the error.  For example:  DB001: Cannot access Provider database.
    Then, in your operations guide, you can say things like: "DB001: This message indicates a failure to connect to the Provider database.  Verify that the database services are up and running for the provider database for the customer site ..."
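
As a sketch of what that can look like in a HAPI on FHIR server (R4 model classes here; exception constructors vary a bit across HAPI versions), something along these lines:

import ca.uhn.fhir.rest.server.exceptions.InternalErrorException;
import org.hl7.fhir.r4.model.OperationOutcome;
import org.hl7.fhir.r4.model.OperationOutcome.IssueSeverity;
import org.hl7.fhir.r4.model.OperationOutcome.IssueType;

public class ErrorCodes {

   // Build an OperationOutcome that carries the error code plus a human readable
   // message, and nothing that leaks implementation details (no stack trace,
   // no exception message, no real database name).
   static OperationOutcome providerDbUnavailable() {
      OperationOutcome outcome = new OperationOutcome();
      outcome.addIssue()
         .setSeverity(IssueSeverity.ERROR)
         .setCode(IssueType.EXCEPTION)
         .setDiagnostics("DB001: Cannot access Provider database.");
      return outcome;
   }

   // The fault is on the service side, so report it as a 500-series error.
   static void reportDownstreamFailure() {
      throw new InternalErrorException("DB001: Cannot access Provider database.",
         providerDbUnavailable());
   }
}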


Friday, May 31, 2019

Semantic Interoperability: What has FHIR Taught Us?

I've been working on expressing various queries using HL7 Version 2, Version 3, and FHIR.  What I've encountered is probably not shocking to some of you.

5 lines of HL7 Version 2 query encoded in HL7 ER7 format (pipes and hats), translates into about
35 lines in HL7 Version 2 XML, which translates into about
92 lines of HL7 Version 3 XML (about triple that of Version 2 XML), which translates into about
95 characters in one line of FHIR.

When expanded into version 3, a system that understands the principles of HL7 Version 3 can completely "understand" the semantics (but someone still needs to write the software to execute them).  In fact, I can turn an HL7 Version 3 XML representation into meaningful English, because the semantics of the message have been so well captured in it.

But, people don't talk the way that Version 3 does, nor do computers.  We both use three additional sources of information:

Context: Why are we talking? What's our purpose in having this discussion?  Who are we talking to? What are we talking about?
World Knowledge: This includes models of the world that describe the things that we are talking about.  What do we know?  What does it look like?
Inference: Given what we are told, and context and world knowledge, what can be inferred from the communication.  It is because of inference that Postel's law can be applied.  DWIM (Do What I Mean), that CPU operation we all wish was in our computers, can be applied when you know what it is the person could possibly have meant from all available choices.

HL7 Version 2 put the model in the message specification, but didn't tell you what it was talking about at a fine-grained enough level to make it possible for a human to understand the message.  It might be a language, but it's almost completely positional, with no human readable mnemonics to aid in understanding.
HL7 Version 3 put the detailed model in the message, rather than the message specification, and used a singular, 6 part model of the world (Act, Participation, Entity, Role, Act Relationship and Role Relationship) to describe everything.  V3 is clearly a language for computers to speak in, but not one for humans really. The message is completely detailed, but for a developer, there's so much repeated model to wade through that it's hard to find the point.  It's a language that talks about itself as much as it communicates.

FHIR builds on those detailed HL7 Version 3 models, but keeps that information as "World Knowledge" in the FHIR Specification (much the same way that Version 2 did).  What FHIR also did was provide a representation that makes it easy to find the point.  There's just enough granularity in the message to find the code, the identifier, et cetera, for the thing you are looking for.

FHIR goes a step further in adoption of RESTful protocols, because the most common computer operation is to have it answer a question given certain inputs, and get back the answers.  And there's a protocol for how to do that that just about everyone using the web already understands.  It doesn't need OIDs (or anything with dots and numbers in it) to give it meaning.  We automatically know what a query parameter is.  And FHIR said "these are the kinds of things we need to be able to represent" when we ask questions.

Semantic Interoperability?  Pah.  I don't need semantic interoperability.  I need the damn thing to do what I mean, or better yet, do what I'm thinking.  FHIR, at least and at last, is something I can think in.

     Keith



Friday, May 24, 2019

Who created this UI? It sucks!

As someone who writes regularly, I am often just as frustrated with Microsoft Word (or any other word processor I've ever used) as others report themselves to be with the user interfaces of EHR systems. Even Apple hasn't solved the problems I need solved.

How many clicks does it take to insert a figure reference to the figure below or above?  How much work is it to create a citation for the link I just inserted into the document?  These should be one button clicks, not the multi-step process they are today.

Why has this crime against writers continued to persist over decades? Nay, centuries... millennia even.

Word processor designers, here's a very clear specification for what I want:

Cross References

Given I have turned the option on, when I type the words "the figure|table|section below" or "the figure|table|section above" and there is a figure or table citation within the current section, insert a reference to it, or if a section, provide me with a list of sections to choose from that I can ignore if I want (so that if I continue typing, it just disappears).  And if I hit undo, treat the automatic insertion as the operation I want undone.

Hyperlinked Bibliography

Given I have turned the option on, when I insert a hyperlink, add a new citation source for the link, or reuse an existing one if it already exists.  Find the individual author and creation date in the page data or metadata, or use a corporate author for the web site.  Include the URL in the citation.  If the URL includes a fragment identifier, find the text where that identifier appears and add it to the title of the reference (e.g., "Hyperlinked Bibliography" in "Who Created this UI? It sucks!").  If the link is to a page in a PDF (e.g., using #page=9 in PDF links) or other media format, treat it as a "document from a website", otherwise use "website" as the reference style.  Use the page title from the <title> tag in the page header.  Prompt me for missing information, but again, let this prompt dialog NOT interfere with my current work, and go away if I continue to type.  Same deal on undo here.  If I say undo, first undo the automatic insertion.

Finally: Stop turning on display formatting marks when I want to insert an index reference term.

   Keith


Friday, May 17, 2019

CDMA or GSM? V3 or FHIR? Floor wax or dessert topping?

One of the issues being raised about TEFCA is related to which standards should be used for record location services.  I have to admit, this is very much a question where you can identify the sides by how much you've invested in a particular infrastructure that is already working, rather than a question of which technology we'd all like to have. It's very much like the debate around CDMA and GSM.

If you ask me where I want to be, I can tell you right now, it's on FHIR.  If you ask me what's the most cost effective solution for all involved, I'm going to tell you that the HL7 V3 transactions used by IHE are probably more cost effective and quicker to implement for all involved overall, because it's going to take time to make the switch to FHIR, and more networks are using V3 (or even V2) transactions.  And even though more cost effective for the country, it's surely going to hurt some larger exchanges that don't use it today.  CommonWell uses HL7 V2 and FHIR for patient identity queries if I can remember correctly, while Carequality, SureScripts and others use the HL7 V3 based IHE XCPD transactions ... which are actually designed to support federated record location.  As best I know, more state and regional health information exchanges support the IHE XCPD transactions than those exchanging data using V2 or FHIR.

Whatever gets chosen, it's gonna put some hurt on one group or another.  My gut instinct is that choosing FHIR is going to hurt a lot more exchanges than choosing XCPD at this time.

And this is where the debate about V3 and FHIR differs from the CDMA and GSM debate, because FHIR is closer to 4G or 5G in the whole discussion.  Some parts of FHIR, such as querying for Patient identity are generally widely available.  But complexity comes in when you get into using these transactions in a record location service, as I've described previously, and the necessary capabilities to support "record location services" in FHIR haven't been formalized by anyone ... yet.  This is where FHIR is more like 5G.

Just like 5G, this will happen eventually.  But do we really want to focus all of our attention on this, or do we want to get things up and running and give organizations the time they need to make the switch.  I think the best answer in this case is to make a very clear statement: This is where we are today (V3), and this is where we will be going in 2-3 years (FHIR), and make it stick.  And as I've said in the past, don't make it so hard for organizations to pre-adopt new standards.

Policy doesn't always work that way ... just look at what happened with ICD-10, or maybe even Claims Attachments.  But I think where we are at today is a little bit different, which is that the industry really wants to move forward, but would also like to have some room to breathe in order to move forward without stumbling along the way.  Do we really want a repeat of Meaningful Use?

We've seen how too much pressure can cause stumbles, and I think trying to use FHIR for record location services is just moving a little too fast.  I'll be happy to be proven wrong, and eat the floor wax, but frankly, right now, I just don't see it.

   Keith





Monday, May 13, 2019

Terminology Drift in Standards Development Organizations

I used to work for a company that published dictionaries, and one of my colleagues was a dictionary editor.  As he related to me, the definition of a term doesn't come from a dictionary, but rather from use.  A dictionary editor's job is to keep faithful track of that use and report it effectively.  By documenting the use, one can hope to ensure consistent future use, but languages evolve, and the English language evolves more than many.  I've talked about this many times on this blog.

It also happens to be the common language of most standards development organizations in Health IT (of course, I, as an English speaker, would say that, but the research also reflects that fact).

The evolution of special terms and phrases in standards is a particular challenge not only to standards developers, but especially to standards implementers.  As I look through IHE profiles (with a deep understanding of IHE History), I think on phrases such as "Health Information Exchange", "XDS Affinity Domain", and "Community", which in IHE parlance, all mean essentially the same thing at the conceptual level that most implementers operate at.

This is an artifact of Rishel's law: "When you change the consensus community, you change the consensus" (I first heard it quoted here, and haven't been able to find any earlier source, so I named it after Wes).

As time changes, our understanding of things change, and that change affects the consensus, even if the people in the consensus group aren't changed, their understanding is, and so the definition has changed.

We started with "Health Information Exchange", which is a general term we all understood (oh so long ago).  But then, we had this concept of a thing that was the exchange that had to be configured, and that configuration needed to be associated with XDS.  Branding might have been some part of the consideration, but I don't think it was the primary concern, I think the need to include XDS in the name of the configuration simply came out of the fact that XDS was what we were working on.  So we came up with the noun phrase "XDS Affinity Domain Configuration", which as a noun phrase parses into a "thing's" configuration, and which led to the creation of the noun phrase "XDS Affinity Domain" (or perhaps we went the other way and started with that phrase and tacked configuration onto it).  I can't recall. I'll claim it was Charles' fault, and I'm probably not misremembering that part.  Charles does branding automatically without necessarily thinking about it.  I just manage to do it accidentally.

In any case, we have this term XDS Affinity Domain Configuration, which generally means the configuration associated with an XDS Affinity Domain, which generally means some part of the governance associated with a Health Information Exchange using XDS as a backbone.

And then we created XCA later, and had to explain things in terms of communities, because XCA was named Cross Community Access rather than Cross Domain Access.  And so now Affinity Domain became equivilated (yeah, that's a word) with Community.

And now, in the US, we have a formal definition for health information network as the noun to use in favor of how we were using health information exchange more than a decade and a half ago (yes, it was really that long).

So, how's a guy to explain all this means the same thing (generally) to someone who is new to all this stuff, and hasn't lived through the history, and without delving into the specialized details of where it came from and why?  I'm going to have to figure this out.  This particular problem is specific to IHE but I could point to other examples in HL7, ISO, ASTM and OpenEHR.

The solution, it would seem, would be to hire a dictionary editor.  Not having a grounding in our terminology would be a plus, but the problem there is that we'd need a new one periodically as they learned too much and became less useful.



Thursday, May 9, 2019

It's that time again...

The next person I'm going to be talking about is responsible for open source software that has impacted the lives of tens of millions of patients (arguably even hundreds of millions), tens of thousands (perhaps even hundreds of thousands) of healthcare providers, and certainly thousands of developers around the world.

The sheer volume of commits in the projects he's led well exceeds 50 million lines of code.  He's been working in the open source space for nearly a decade and a half, most of which has been supporting the work of the university hospital that employed him.

It's kind of difficult to tell a back story about him that doesn't give it completely away (and many who've used the work he's been driving already know who I'm talking about).  I'm told he's an accomplished guitar player, and I also hear that his latest album of spoken word and beat poetry will be coming out soon.

I can honestly say I've used much of the open source code he's been driving forward at four different positions for three different employers, through at least eight different releases, and I swear by the quality of the work that goes into it.  I'm not alone, the work has been downloaded or forked by several thousand developers all over the world.

I know that he sort of fell into this open source space a bit by accident, when the person who had been driving one of the HL7 open source projects moved on to greener pastures and he took up the reins.  Since then, he took the simplicity and usability of that open source project into a second one that has driven HL7 FHIR on towards greater heights.  If it weren't for some of the work he's done, I can honestly state the FHIR community would have been much poorer.

Without further ado:


This certifies that
James Agnew of
Simpatico Intelligent Systems, Inc.



has hereby been recognized for keeping smiles on the faces of HL7 integrators for the better part of two decades.

HAPI on FHIR is perhaps the most widely known Java FHIR Server implementation available, HAPI HL7 V2 has been used in numerous projects to parse and integrate with HL7 Version 2 messages, and is included in one of the most widely used open source V2 integration engines (formerly known as Mirth Connect, now NextGen Connect).  James has also contributed to other open source efforts supporting HL7 FHIR and HL7 Version 2 messaging.

Thursday, April 25, 2019

Record Location Services at a National Scale using IHE XCPD

One of the recent discussions coming up around the most recent TEFCA related specifications has to do with how one might implement record location services for patients at a national scale.  The basis for this is the IHE Cross Community Patient Discovery Profile (XCPD).

Here's the problem in a nutshell.  Assume you are a healthcare provider seeing a patient for the first time, and you want to find out who else might have information about this patient.  How can you do so?

The first step obviously is to ask the patient who their prior doctor was, and here's where the first fundamental challenge appears.  Sometimes the patient is unable to answer that question, either at all, or at least completely.  So, then, how do you get a complete list?  What you don't want to do is ask everyone who might ever have seen the patient anywhere in the country, because that is not going to scale.

I think that about sums it up.

The IHE XCPD profile is designed to address this.

If the patient is only able to give a partial response, then you know where to start looking.  Here's the key point, once you know where to start looking, the organizations and networks who can answer the question can also point you to others who've seen the patient, and that can get you a more complete list, which eventually will lead to closure.

But wait! How do these organizations know who else has seen the patient?  It's really pretty simple.  Somebody asked them, and in the process of asking them, also told them that they would be seeing the patient, and so the original provider gains the information about the new provider seeing them, which makes them able to answer the question accurately for the next new provider.  And so the well known provider becomes more authoritative, while the new provider is able to provide equally authoritative data.

If the patient is unable to answer that question at all, then you have to figure out who else you might be able to ask that question of.  If the patient is local, you could ask others in the area who might know the patient.  If the patient isn't local (e.g., just visiting), you might try asking others near to where patient resides, which hopefully you can determine.  Since TEFCA is about a network of networks, it's reasonable to assume that there are some regional networks of whom you might ask about a given patient, and they might be able to ask other, smaller regional networks they know about (this could become turtles all the way down, but at some point, you'd expect to stop).

There are some other issues to address.  Just because we got the new provider and the old provider synchronized doesn't mean everyone else is.  Who has that responsibility?  That's an implementation decision.  It could be the original provider, or it could be the new provider.  Since the new provider is gaining the benefit, one could argue it's their responsibility to notify other networks that have already seen the patient that they are now seeing the patient.  That would be the way I'd implement it.

Note: This doesn't have to be perfect.  It has to be good enough to work.  Perfecting the algorithm for record location to ensure the right balance of performance and accuracy in the RLS is going to take time.  But we can certainly build something that gets the right networks talking to each other.




Saturday, April 20, 2019

Why Software Engineering is still an art

Software engineering isn't yet a science.  In science, you have a bunch of experimental procedures that one can describe, and processes that one can follow, and hopefully two people can reproduce the same result (unless of course we are talking about medical research experiments ;-( ).

Today, I wanted to add some processes to my build.  I'm using Maven (3.6.0), with Open JDK 11.0.2.  I wanted to run some tests over my code to evaluate quality.  Three hours later, and I'm still dealing with all the weirdness.


  1. rest-assured (a testing framework) uses an older version of JAXB because it doesn't want to force people to move to JDK 8 or later.
  2.  JAXB 2.22 isn't compatible with some of the tools I'm using (AOP and related) in Spring-Boot and elsewhere.
  3. I have an extra spring-boot starter dependency I can get rid of because I don't need it, and won't ever use it.  It got there because I was following someone else's template (it's gone now).
  4. FindBugs was replaced with SpotBugs (gotta check the dates on my references), so I wasted an hour on a tool that's no longer supported.
  5. To generate my code quality reports, I have to go clean up some javadoc in code I'm still refactoring.  I could probably just figure out how to run the quality reports in standalone, but I actually want the whole reporting pipeline to work in CI/CD (which BTW, is Linux based, even though I develop on Windoze).
  6. The maven javadoc plugin with JDK 11 doesn't work on some versions, but if I upgrade to the latest, maybe it will work, because a bug fix was backported to JDK 11.0.3
  7. And even then, the modules change still needs a couple of workarounds.

In the summers during college, I worked in construction with my father.  Imagine, if in building the forms for the fountain in the center of the lobby (pictured to the right), I could only get rebar from one particular supplier that would work with the holes in the forms.  And to drill the holes, I had to go to the hardware store to purchase a special brand of drill.  Which I would then buy an adapter for, and take part of it apart in a way that was documented by one guy somebody on the job-site knew, so that I could install the adapter to use the special drill bit.  And then we had to order our concrete in a special mix from someone who had lime that was recently mined from one particular site, because the previous batch had some weird contaminants that would only affect our job site.

Yeah, that's not what I had to do, and it came out great.

Yet, that's basically exactly what I feel like I'm doing some days when I'm NOT writing code.  We've got tools to run tools to build tools to build components to build systems that can be combined in ways that can do astonishing stuff.  But, building it isn't yet a science.

Why is this so hard?  Why can't we apply the same techniques that were used in manufacturing (Toyota was cited)?  As a friend of mine once said: in software, there's simply more moving parts (more than a billion).  That's about a handful of magnitudes more.


   Keith

Tuesday, April 16, 2019

Juggling FHIR Versions in HAPI

It happens every time.  You target one version of FHIR, and it turns out that someone needs to work with a newer or older (but definitely different) version.  It's only about 35 changes that you have to make, but through thousands of lines of code.  What if you could automate this?

Well, I've actually done something like that using some Java static analysis tools, but I have a quicker way to handle that for now.

Here's what I did instead:

I'm using the Spring Boot launcher with some customizations.  I added three filter beans to my launcher.  Let's just assume that my server handles the path /fhir/* (it's actually configurable).

  1. A filter registration bean which registers a filter for /fhir/dstu2/* and effectively forwards content from it converted from DSTU2 (HL7) to the server's version, and converts the server's response back to DSTU2.
  2. Another filter registration bean which registers a filter for /fhir/stu3/* and effectively forwards content from it converted from STU3 to the server's version, and converts the server's response back to STU3.
  3. Another filter registration bean which registers a filter for /fhir/r4/* and effectively forwards content from it converted from R4 to the server's version, and converts the server's response back to R4.
These are J2EE Servlet Filters rather than HAPI FHIR Interceptors, b/c they really need to be right now. HAPI servers aren't really all that happy about being multi-version compliant, although I'd kinda prefer it if I could get HAPI to let me intercept a bit better so that I could convert them in Java rather than pay the serialization costs in and out.
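
For what it's worth, the registration side of one of those beans looks roughly like this; Dstu2ToR4Filter here is a hypothetical stand-in for the filter class that actually does the parse/convert/forward work:

import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FhirVersionFilterConfig {

   // Dstu2ToR4Filter is a hypothetical javax.servlet.Filter that converts the DSTU2
   // request body to the server's native version (R4 here) on the way in, and
   // converts the response back to DSTU2 on the way out.
   @Bean
   public FilterRegistrationBean<Dstu2ToR4Filter> dstu2Filter() {
      FilterRegistrationBean<Dstu2ToR4Filter> registration =
         new FilterRegistrationBean<>(new Dstu2ToR4Filter());
      registration.addUrlPatterns("/fhir/dstu2/*");
      return registration;
   }
}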

In addition to converting content, the filters also handle certain HttpServlet APIs a little bit differently.  There are two key places where you need to adjust:

  1. When Content-Type is read from the request or set on the response, you have to translate fhir+xml or fhir+json to xml+fhir or json+fhir and vice versa for certain version pairs (a small helper sketch follows this list).  DSTU2 used the "broken" xml+fhir and json+fhir mime types, and this was fixed in STU3 and later.
  2. You need to turn off gzip compression performed by HAPI, unless you are happy writing a GZip decoder for the output stream (it's simple enough, but more work than you want to take on at first).
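
The Content-Type piece is small; a sketch of the kind of helper I mean (the method name is made up):

final class FhirMimeTypes {
   // Rewrite the corrected STU3+ FHIR mime types back to the DSTU2-era forms when
   // forwarding to the /fhir/dstu2/* path; apply the reverse mapping on the way out.
   static String toDstu2MimeType(String contentType) {
      if (contentType == null) {
         return null;
      }
      return contentType
         .replace("application/fhir+xml", "application/xml+fhir")
         .replace("application/fhir+json", "application/json+fhir");
   }
}
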
Your input stream converter should probably be smart and not try to read on HEAD, GET, OPTIONS or DELETE methods (because they have no body), and there won't be anything to translate.  However, for PUT, POST, and PATCH, it should.

Binary could be a bit weird.  I don't have anything that handles creates on Binary resources, and they WOULD almost certainly require special handling; I simply don't know if HAPI has that special handling built in.  It certainly does for output, which has made my life a lot easier for some custom APIs (I simply return a parameter of type Binary, with mimetype of application/json, to get an arbitrary non-FHIR formatted API output), but as I said, I've not looked into the input side.

This is going to make my HL7 V2 Converter FHIR Connectathon testing a lot easier in a couple of weeks, because O&O (and I) are eventually targeting R4, but when I first started on this project, R4 wasn't yet available, so I started in DSTU2, and like I said, it might be 35 changes, but against thousands of lines of code?  I'm not ready for that all-nighter at the moment.

It's cheap but not free.  These filters cost in serialization time in and out (adding about 300ms just for the conformance resource), but it is surely a much quicker way towards handling a new (or old) version of FHIR for which there are already HAPI FHIR converters, and it at least gets you to a point where you can do integration tests with code that needs it while you make the conversion.  This took about a day and a half to code up and test.  I'd probably still be at a DSTU2 to R4 conversion for the rest of the week on the 5K lines or so that I need to handle V2 to FHIR conversion.

   Keith



Friday, April 12, 2019

Multiplatform Builds

I'm writing this down so I won't ever forget again: when using a Dev/Build environment pair that is Windows/Unix, remember to plan for the following:

  1. Unix likes \n, Windows \r\n for line endings.  Any file comparisons should ignore differences in line endings.
  2. Unix cares about case in filenames, Windows not so much.  Use lowercase filenames for everything if you can.
  3. Also, if you are generating timestamps and not using UTC when you output them, be sure that your development code runs tests in the same time zone as your build machine.
I'm sure there's more, but these are the key ones to remember.
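
For the first item, the simplest fix I know is to normalize before comparing; a tiny helper like this (used in whatever assertion code you share between the Windows dev box and the Linux build machine) does the job:

final class LineEndings {
   // Treat CRLF (Windows) and LF (Unix) as equivalent when comparing expected
   // and actual test output.
   static String normalize(String text) {
      return text == null ? null : text.replace("\r\n", "\n");
   }
}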

   Keith

P.S.  I think this is a start of a new series, on Duh moments.


Wednesday, April 10, 2019

V2-to-FHIR on GitHub

The tooling subgroup of the V2 to FHIR project met a minor milestone today, creating the code repository for V2 to FHIR tools in HL7's Github.  If you want to become a contributor to this team, let me know, either here or via e-mail (see the send-me an email link to the right).

Our first call for contributions is for sample messages, including ADT, MDM, ORU, SIU and VXU messages from any HL7 V2 version in either ER7 (pipes and hats) or XML formats.  We'll be using these samples to test various tools that run the V2 to FHIR conversion processes in the May V2-to-FHIR tooling track.  There will be more information about that track provided on the O&O V2-to-FHIR tooling call on Wednesday, April 24th at 3pm EDT.

We are looking for real world testing data, rather than simple sample messages, with the kind of variation we'd expect to see in the wild.  If you have messages that you've used for testing your Version 2 interfaces, test messages for validating interfaces, et cetera, and want to contribute, we'd appreciate your sending them along.  You can either become a contributor, or send me a link to your zip file, or send me an e-mail with your sample messages, and I'll work on getting them into the repo.

No PHI please.  Yes, we are looking for real world data, but no, we don't want real world patient identities in here.  I know you know the reasons why, but I probably should say it anyway.

In contributing this data, you will be granting HL7 the rights to use this data for the V2 to FHIR project, just as you would with any other contribution you make to an HL7 project.


EHRs are ACID, HIEs are BASE

[Image: Phenolphthalein in flask]

I was talking about clinical data repositories, HIEs and EHRs with a colleague this morning.  One of the observations that I made was that in the EHR world, and some of the CDR world, folks are still operating in a transactional model, whereas most HIEs use data in an analytic fashion (although still often with transactional requirements).  There are differences in the way you manage and use transactional and analytical data.


Think about this.  When you ask an HIE for a document for a patient, are you trying to make a business (in this case, a health related care) decision?  Yep.  Is your use of this information part of the day-to-day operations where you need transactional controls?  Probably not, though you might think you want up to the minute data.

Arguably, HIEs aren't in the business of providing "up to the minute data".  Instead, they are in the business of providing "most recent" data within a certain reasonable time frame.  So, if the data is basically available, and eventually consistent within say, an hour of being changed, that's probably sufficient.  This is BASE: Basically Available, Soft (possibly changing) state, with Eventual consistency.

On the other hand, when you use an EHR, do you need transactional controls?  Probably, because at the very least you want two people who are looking at the same record in a care setting to be aware of the most current state of the data.  In this case, you need Atomic, Consistent, Isolated, and Durably persisted changes.  This is ACID.

BASE scales well in the cloud with NoSQL architectures. ACID not so much.  There are a lot of good articles on the web describing the differences between ACID and BASE (this is a pretty basic one), but you can find many more.  If you haven't spent any time in this space, it's worth digging around.

   Keith



Friday, April 5, 2019

Find your Informatics mentor at IHE or HL7

I was interviewed yesterday by a college student as part of one of her student projects.  One of the questions I was asked was: What would be your one piece of advice for a graduating student entering your field?

I told her that it would depend (isn't that always the answer?), and that my answer for her would be different than my general answer (because she's already doing what I would have advised others).

My general answer is to find a group external to school or work related to her profession to volunteer in, either a professional association or a body like IHE or HL7.  I explained that these organizations already attract the best talent from industry (because companies usually send their top tier people to these organizations).  So, by spending time with them, she'll get insight from the best people in the industry.

Organizations like this also have another characteristic, which is that they are already geared up to adopt and mentor new members.  I think this mostly might be a result of the fact that they already have more work than they can reasonably accomplish, and having a new victim member to help them is something that they are naturally supportive of, and as a result, also naturally supportive of the new member.  It's an environment that's just set up to provide mentoring.

There are days when I'm actually quite jealous of people who get to do this earlier in their career than I did.  Participating in IHE and HL7 has given me, and many others, quite a boost in our careers, and the earlier that acceleration kicks in, the longer it has to affect your career velocity.  In her case, I'm especially jealous, as she's been working in this space since middle school!

In any case, if you are a "newly" minted informaticist, health IT software engineer, or just a late starter like me, and want to give your career a boost, you can't go wrong by participating in organizations like IHE, HL7, AMIA or other professional society or organization.

   Keith


Tuesday, April 2, 2019

How does Interfacing Work

This post is part of an ongoing series of posts dealing with V2 (and other models) to FHIR (and other models) which I'm labeling V2toFHIR (because that's what I'm using these ideas for right now).

I've had to think a lot about how interfaces are created and developed as I work on HL7 Version 2 to FHIR Conversion.  The process of creating an interface that builds on the existence of one or more existing interfaces is a mapping process.

What you are mapping are concepts in one space (the source interface or interfaces) to concepts in a second space, the target interface.  Each of these spaces represents an information model.  The concepts in these interface models are described in some syntax that has meaning in the respective model space.  For example, one could use XPath syntax to represent concepts in an XML model, FHIRPath in FHIR models, and for V2, something like HAPI V2's Terser location spec.

Declaratively, types in the source model map to one or more types in the destination model, and the way that they map depends in part on context.  Sure, ST in V2 maps to String in FHIR, but so does ID in certain cases, except when it actually maps to Code.

So, if I've already narrowed my problem down to how to map from CWE to Coding, I really don't need to worry much about those cases where I'd want to map an ST to some sort of FHIR id type, because, well, it's just not part of my current scope or context.

Thinking about mapping this way makes it easier to make declarative mappings, which is extremely valuable.  Declarations are the data that you need to get something done, rather than the process by which you actually do it, which means that you can have multiple implementation mechanisms.  Want to translate your mapping into FHIR Mapping Language?  The declarations enable you to do that.

But first you have to have a model to operationalize the mappings.  Here's the model I'm working with right now:

Prerequisites


  • An object to transform (e.g., a message or document instance, or a portion thereof).
  • A source model for that object that has a location syntax that can unique identify elements in the model, from any type in that model (in other words, some form of relative location syntax or path).
  • A target model that you want to transform to.
  • A location syntax for the target model.
  • A set of mappings M (source -> target) which may for each mapping have dependencies (preconditions which must be true), and additional products of the mapping (other things that it can produce but which aren't the primary concepts of interest).


Dependencies

Dependencies let you do things like make a mapping conditional on some structure or substructure in the source model.  For example, a problem I commonly encounter is that OBR is often missing relevant dates associated with a report (even though they should be present in an ORU_R01, the reality is that they often are not).  My choices are to not map that message, or to come up somehow with a value that is close enough to reality.  So, when OBR-7 or OBR-8 is missing, my go-to field is often MSH-7.  So, how would I express this mapping?

What I'd say in this case is that MSH-7 maps to DiagnosticReport.issued, when OBR-7 is missing and OBR-8 is missing.  So, this mapping is dependent on values of OBR-7 and/or OBR-8.

Products

Products let you add information to the mapping that is either based on knowledge or configuration.  HL7 V2 messages use a lot of tables, but the system URLs required by FHIR aren't contained anywhere at all in the message (even though they are going to be known beforehand).  So, when I want to map OBR-24 (Diagnostic Service Section Identifier) to DiagnosticReport.category, I can supply the mapping by saying OBR-24 -> DiagnosticReport.category.coding.code and that it also produces DiagnosticReport.category.coding.system with a value of http://hl7.org/fhir/v2/0074.
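
To make that concrete, here's roughly how I think of a single declarative mapping, with the two examples above filled in; the Java types here are illustrative, not my actual implementation:

import java.util.List;
import java.util.Map;

// A declarative mapping: a source location, a target location, optional
// dependencies (preconditions on the source), and optional products (extra
// target values supplied from knowledge or configuration rather than the message).
final class MappingRule {
   final String sourcePath;
   final String targetPath;
   final List<String> dependencies;
   final Map<String, String> products;

   MappingRule(String sourcePath, String targetPath,
               List<String> dependencies, Map<String, String> products) {
      this.sourcePath = sourcePath;
      this.targetPath = targetPath;
      this.dependencies = dependencies;
      this.products = products;
   }

   // OBR-24 -> DiagnosticReport.category.coding.code, which also produces the
   // coding.system for HL7 table 0074.
   static final MappingRule OBR24_TO_CATEGORY = new MappingRule(
      "OBR-24", "DiagnosticReport.category.coding.code",
      List.of(),
      Map.of("DiagnosticReport.category.coding.system", "http://hl7.org/fhir/v2/0074"));

   // MSH-7 -> DiagnosticReport.issued, but only when OBR-7 and OBR-8 are missing.
   static final MappingRule MSH7_TO_ISSUED = new MappingRule(
      "MSH-7", "DiagnosticReport.issued",
      List.of("OBR-7 is missing", "OBR-8 is missing"),
      Map.of());
}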


Mapping Process Model

So now that you understand those little details, how does mapping actually work?  Well, you navigate through the source model by traversing the hierarchy in order.  At each hierarchical level, you express your current location as a hierarchical path.  Then you look at the path and see if you have any mapping rules that match it, starting first with the whole path, and then on to right subpaths.

ALL matching rules are fired (unlike XSLT, which prioritizes matches via conflict resolution rules).  I haven't found a case where I need to address conflict resolution yet, and if I do, I'd rather that the resolution be explicit (in fact, you can already do explicit resolution using dependencies).

If there's a match, then the right hand side of the rule says what concept should be produced.  There can only be a match when the position in the source model matches the concept that you are mapping from, and there exists an equivalent target concept in the model that you are mapping to.  In my particular example: Presuming that DiagnosticReport was already in context (possibly because I said to create it in the current context on seeing an ORU_R01 message type), then DiagnosticReport.category would be created.

At some point, you reach an atomic level with some very basic data types (string, date, and number) in both the source and target models.  For this, there are some built-in rules that handle copying values.

Let's look at our OBR-24 example a bit deeper.  OBR-24 is basically the ID type.  So, moving down the hierarchy, you'll reach ID.  In my own mappings, I have another mapping that says ID -> Coding.code.value.  This rule would get triggered a lot, except that for it to be triggered, there needs to be Coding.code already in my mapping context.  In this particular case, there is, because it was just created previously in the rule that handled OBR-24.  But if there wasn't, this mapping rule wouldn't be triggered.

When I've finished traversing OBR-24 and move on to OBR-25, I "pop" context; that Coding is no longer relevant, and I can start dealing with DiagnosticReport.status.

The basic representation of the mappings is FHIR ConceptMaps (as I've mentioned in previous posts in this series).
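For instance, the OBR-24 product example might be captured in a ConceptMap along these lines (a sketch, assuming an R4-style ConceptMap; the codes and property values are illustrative of the approach rather than my exact maps, and dependencies ride along in dependsOn the same way products ride along in product):

   {
     "resourceType": "ConceptMap",
     "group": [ {
       "element": [ {
         "code": "OBR-24",
         "target": [ {
           "code": "DiagnosticReport.category.coding.code",
           "equivalence": "equivalent",
           "product": [ {
             "property": "DiagnosticReport.category.coding.system",
             "value": "http://hl7.org/fhir/v2/0074"
           } ]
         } ]
       } ]
     } ]
   }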



Clarified bullet point one above thanks to Ed VanBaak.

Thursday, March 28, 2019

Back to the Baselines

I've been working quite a bit on mapping V2 messages to FHIR lately.  One of the telling points in V2 conversion is ensuring you run tests against a LOT of data with a lot of variation, especially in the V2 interfacing world.

If you don't test with a lot of data, how can you tell that a fix in one place didn't break the great output you had somewhere else, especially given all the possible different ways to configure a V2 interface?

To do this, you have to establish baselines, and compare your test outputs against your baseline results on a regular basis.  Then, after seeing if the differences matter, you can promote your now "better" outputs as your new baselines.

Automating this process in code makes your life a lot easier.

I like to build frameworks so that I can do something once and then reuse it over and over.  For baseline testing, I decided that I wanted each test case I implemented to be able to store its outputs in folders identifying the test case in the form: testClass/testMethod/testInstance.  Those output folders would live under the target/test-output folder.

And baselines would be stored in the src/test/baseline folder, organized in the same way.
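Promoting the new baselines is then just a matter of copying one tree over the other once you've reviewed the differences.  With Commons IO that can be as simple as this (a sketch, using the folder names above):

   // Promote reviewed test outputs to become the new baselines.
   FileUtils.copyDirectory(new File("target/test-output"), new File("src/test/baseline"));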

Then I wrote a rather small method in the base class of my testing framework that did the following (FileUtils from Apache Commons IO is great for reading and writing the content):

1. Automated the generation of FHIR Resource output as json and xml files in the folder structure.
Here's some sample code using HAPI on FHIR to do that:

   xmlOutput = context.newXmlParser().setPrettyPrint(true).encodeResourceToString(b);
   FileUtils.writeStringToFile(new File(fileName + ".xml"), xmlOutput, StandardCharsets.UTF_8);
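The JSON file is written the same way, just swapping in the JSON parser; something like:

   jsonOutput = context.newJsonParser().setPrettyPrint(true).encodeResourceToString(b);
   FileUtils.writeStringToFile(new File(fileName + ".json"), jsonOutput, StandardCharsets.UTF_8);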


2. Compared the generated outputs to baselines.
   jsonBaseline = FileUtils.readFileToString(new File(baselineFile + ".json"), StandardCharsets.UTF_8);
   assertEquals(jsonBaseline, jsonOutput);

And finally, because HAPI on FHIR uses Logback, and Logback provides the SiftingAppender, I was also able to structure my logback.xml to contain a sifting appender that stores separate log files for each test result!  The value of this is huge.  Logging is part of your application's contract (at the very least with your service team), so if your log messages change, the application contract has changed.  And if changing a mapping changes the logging output, that should also be comparable and baselined.

The sifting appender depends on keys in the MDC (the Mapped Diagnostic Context, basically a thread-specific map of keys to values).  This is where we store the final location of the test log output when the test starts.  My code to start and end a test looks a bit like this:
try {
   start(messageName);
   ... // do the test 
} finally {
   end(messageName);
}

Start is a method that gets the test class and test name from the stack trace as follows:
Throwable t = new Throwable();
StackTraceElement e = t.getStackTrace()[1];
String fileName =
  String.format("%s/%s/%s", 
    e.getClassName(), e.getMethodName(), testName);

This is a useful cheat to partition output files by test class, test method, and the specific test instance being tested by that method (I use a list of files to read; any time I want a new test case, I just drop the file into a test folder).
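The rest of start() mostly just wires that file name into the logging context.  A sketch of it (the MDC keys here are assumptions, chosen to line up with the discriminator key in the appender configuration below and the compareLogs check in end()):

   MDC.put("testfile", fileName);    // the sifting appender keys off "testfile" (see below)
   MDC.put("compareLogs", "true");   // tells end() to compare logs against the baseline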

End is a little bit more complex, because it has to wrap some things up, including log comparisons after everything else is done.  I'll touch on that later.

It's important in log baselining to keep any notion of time or date out of your logging, so set your logging patterns accordingly.  I use this:
[%-5level] [%t] %c{1} - %msg%n%xEx

While my normal pattern contains:
[%-5level] %d{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} [%t] %c{1} - %msg%n%xEx

My appender configuration looks something like this:

<Appender name="testing" class="ch.qos.logback.classic.sift.SiftingAppender">
    <discriminator>
      <key>testfile</key>
      <defaultValue>unknown</defaultValue>
    </discriminator>
    <sift>
      <appender name="FILE-${testfile}" class="ch.qos.logback.core.FileAppender">
        <file>./target/test-output/${testfile}.log</file>
        <append>false</append>
        <layout class="ch.qos.logback.classic.PatternLayout">
          <pattern>${timelessPattern}</pattern>
        </layout>
      </appender>
    </sift>
  </Appender>
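The ${timelessPattern} reference is a Logback variable; a property declaration near the top of logback.xml along these lines would supply it (a sketch, using the time-free pattern shown above):

  <property name="timelessPattern" value="[%-5level] [%t] %c{1} - %msg%n%xEx" />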

The details of log file comparison are a bit finicky, because you don't want to actually perform the comparison until the end of the test, and you want to make sure the logger has finished up with the file before you compare things.  After some code inspection, I have determined that logback presumes that it can dispose of the log after 10 seconds.

So, end looks something like this:
protected void end(String testName) {
    boolean compare = "true".equals(MDC.get("compareLogs"));
    LOGGER.info(FINALIZE_SESSION_MARKER, "Test completed");
    MDC.put("testfile", "unknown");

    if (compare) {
        try {
            // Wait for log to be finalized.
            Thread.sleep(10 * 1000 + 100);
        } catch (InterruptedException e) {
        }
        // Find and compare the files and assert if they don't match.
    }
}
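That last comment hides the actual comparison, which looks roughly like this (a sketch; baselineLog and outputLog are hypothetical names for the two file paths, and line endings are normalized as discussed below):

   String expected = FileUtils.readFileToString(new File(baselineLog), StandardCharsets.UTF_8)
       .replace("\r\n", "\n");
   String actual = FileUtils.readFileToString(new File(outputLog), StandardCharsets.UTF_8)
       .replace("\r\n", "\n");
   // Fail the test if the logging output has drifted from the baseline.
   assertEquals(expected, actual);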

One other thing that I had to worry about was the fact that I use UUID.randomUUID().toString() in various places in my code to generate UUIDs for things that were being created.  I just replaced those calls with access to a Supplier<String> that was part of the conversion context, so that I could swap in something with known behavior for testing.
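A minimal sketch of that idea (illustrative names): the conversion context owns a Supplier<String> that defaults to random UUIDs, and tests swap in a deterministic one so baselines compare cleanly.

   // In the conversion context:
   private Supplier<String> idGenerator = () -> UUID.randomUUID().toString();
   public String nextId() { return idGenerator.get(); }
   public void setIdGenerator(Supplier<String> generator) { idGenerator = generator; }

   // In test setup, make the IDs predictable:
   AtomicInteger counter = new AtomicInteger();
   context.setIdGenerator(() -> String.format("test-uuid-%04d", counter.incrementAndGet()));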

One last thing: if you build on both Windows and Unix, be sure that your file comparisons aren't sensitive to line-ending format.  One way to address that is to replace \r\n with \n throughout after reading the strings from a file.  You might also find that UTF-8 / Windows Latin-1 characters are problematic, depending on the character set your logging code assumes.  I generally stick with UTF-8 for all my work, but you never know about software you don't control.

   Keith

P.S. Yes, I do sing bass.

Experts don't always make the best teachers

To be an expert is different from being a teacher.  To be an expert one must amass a great deal of experience in a field.  This allows you to solve complex problems ... standards-based interoperability for example.

To be a teacher is a different mind-set.  Not only must you remember all the amassed experience, but you must also forget it ... or at least remember what it was like when you didn't know the answers, and if you are really good, the moment at which you finally got it, and then be able to convey that to others.

It's taken me ten years and more to become an expert at interoperability, and while I can claim some skill at teaching, I'm far from expert at it.  As I age, it becomes more difficult for me to remember what it was like to not know something.

Experts are often called upon to train others.  We must remember that what is simple for us is not so simple for others without our experience.  And that is the critical piece of self-awareness we have to learn to develop ... recognizing that there's a certain skill we had to develop, or a piece of knowledge we had to slot into place in our minds, before we could accomplish the "simple" task.

   Keith

Tuesday, March 19, 2019

When the NoBlocking regulation is more complex than software

... it's time to apply software tooling.

So I went through various definitions in the Information Blocking rule and made a UML diagram.  The value of this became immediately apparent to me when I was able to see, for example, that the definitions of Interoperability Element, Health IT Module, and API Technology were somewhat broken.  API Technology is certainly a Health IT Module, and should be defined in terms of that definition.

It also shows the various relationships associated with actors.  As I go through the rule, I imagine there will be other relationships that I can infer from the regulatory text (e.g., fees charged to actors by other actors).

You can see the results below, and more importantly, you can get the source.


Entities (people, organizations, and things) are classes.  Things that can be done (verbs) are represented as interfaces.  The SVG representation links back to the regulatory text, and has mouse-overs citing the source of the link or artifact.

   Keith

Tuesday, March 12, 2019

How to File a HIPAA Privacy Complaint

I've been seeing a lot of tweets recently complaining about misuse of HIPAA (about a half-dozen).  Mostly from people who know better than doctors what the regulations and legislation actually say.
I tweet back, sometimes cc: @HHSOCR.  The volume's grown enough that I thought it worthwhile to write a post about it.

If your health care provider or insurer refuses to e-mail you your data, refuses to talk with you over the phone about your health data, or makes it difficult for you, there's someone who will listen to your complaint and maybe even take action.  The HHS Office for Civil Rights is responsible for investigating complaints about violations of HIPAA.  They don't make the form easy to find (because, frankly, they have limited resources and need to filter out stuff that they cannot address), but they do support online complaint filing, and you can get to it here (I've shortcut some of the filtration steps for you; if you've found this blog post, you probably meet the filter criteria).

Another way to complain is to write a letter.  I know it's old fashioned, but you can do it.  My 8-year-old daughter once wrote a letter to a HIPAA privacy officer.  You don't need to know their name, just the address of the facility, and address it to the HIPAA Privacy Officer.  It'll definitely get someone's attention.  And who knows, you just might change the behavior of the practice (my daughter's letter got the practice to change a form used to report on a visit so that it would be clearer for patients).

I've mentioned before that under the HIPAA Omnibus regulations, in combination with recent certification requirements, providers shouldn't be able to give the excuse that they are not allowed (under HIPAA) to e-mail, or haven't set up the capability to e-mail you your health data.  Those two statements are likely to be false ... but most providers don't know that (if you are reading this blog, you are probably among the exceptions).

I'd love it if HHS OCR provided a simple service that made it possible for patients to report HIPAA nuisance behavior, and that would: a) send the provider a nasty-gram addressed to the HIPAA Privacy Officer at the institution, with an official HHS logo on the front cover; b) track the number of these sent to providers based on patient reports; c) publicly report the number of nasty-grams served to institutions when it reached a certain limit within a year; d) do a more formal investigation when the number gets over a threshold; and e) tell them all of that in short declarative statements:

e.g.,


To whom it may concern,

On (date) a patient reported that (name) or one of their staff incorrectly informed them about HIPAA limitations.

The patient was informed that:
[ ] Healthcare data cannot be e-mailed to them.
[ ] Healthcare data cannot be faxed to them.
[ ] Healthcare data cannot be sent to a third party they designate.
... (a bunch of check boxes)

Please see HHS Circular (number) regarding your responsibilities with respect to patient privacy rights.

Things you are allowed to do:
... (another laundry list).

This is the (number)th complaint this office has received about your organization this year.  After (x) complaints in a year, your organization will be reported on http://www.hhs.gov/List-Of-Privacy-Nuisance-Violators.html.  After (y) complaints total, your organization will be investigated and audited.

Sincerely,


Somebody with an Ominous Sounding Title (e.g., Chief Investigator)
/s/




I'd also love it if HHS would require the contact information for the privacy officer be placed on every stupid HIPAA acknowledgement form I've been "required" to sign (acknowledging I've been given the HIPAA notice ... which inevitably I refuse to sign until I get it), and on every HIPAA notice form I'm given.  Because I'd fricken use it. 

I could go on for quite some time about the pharmacy that couldn't find their HIPAA notice for ten minutes and refused to give me my prescription because I refused to sign the signature pad until they did.  They finally discovered that, if they'd just given me the prescription, I would have seen the notice printed on the back of the information form they hand out with every medication ... but they didn't have a clue until someone made a phone call.  And of course they claimed I had to sign because "HIPAA" (which says no such thing).

I'd also love it if HHS authorized some sort of "secret healthcare shopper" that registered for random healthcare visits and audited the HIPAA components of a provider's intake processes for improvements (e.g., the HIPAA form in 6-point type at an eye doctor's office is one of my favorite stories; that's a potential violation of both HIPAA and disability regulations).  What the hell, make the payers the ones actually responsible for doing it with some percentage of their contracted provider organizations, and have them report the results to HHS on a periodic basis.

I think this would allow us (patients) to fight back with nuisances of our own, which could eventually have teeth if made widely available and known to patients.  I'm sorry I didn't think to put this in with my recent HIPAA RFI comments.  Oh well, perhaps another day; in fact, since there was an RFI, there will be an NPRM, so these comments could be made there, and who knows, perhaps someone will even act on them.  I've had some success with regulatory comments before.

   Keith


Monday, March 11, 2019

The Phases of Standards Adoption

I was conversing with my prof about standards on FB the other day, and made an offhand remark about him demonstrating that FHIR is at level 4 in my seven levels of standards adoption.  It was an off-the-cuff remark based on certain intuitions I've developed over the years regarding standards.  So I thought it worthwhile to specify what the levels are, and what they mean.

Before I go there, I want to mention a few other related metrics as they apply to standards.  These include the Gartner Hype Cycle (with its Innovation Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, and Plateau of Productivity), Grahame Grieve's 3 Legs of Health Information Standards, and my own 11 Levels of Interoperability (which is really only 7).  There's a rough correspondence here, as shown below.

Phase by phase (with the Hype Cycle stage, Grahame's 3-Legs, the 11 Levels of Interoperability, and a rough time in years for each):

Phase -1, Struggling: At this stage, not only does a standard not exist, but even awareness that there is a problem that it might solve is lacking.  (11 Levels: 0 Absent)

Phase 0, Aspiring: We've identified a problem that standards might help solve and are working to solve it.  (Hype Cycle: Trigger; 3-Legs: 1; 11 Levels: 1 Aspirational; Time: 1-4 y)

Phase 1, Testing: The specifications exist, and are being tested.  (Hype Cycle: Peak; 3-Legs: 1 & 2; 11 Levels: 2 Defined; Time: ½-1 y)

Phase 2, Implementing: Working prototypes have been tested and commercial implementations are being developed.  (3-Legs: 2 & 3; 11 Levels: 3 Implementable; Time: ½-1½ y)

Phase 3, Deploying: Implementations are commercially available and can be used by end users.  (Hype Cycle: Trough; 3-Legs: 2 & 3; 11 Levels: 4 Available; Time: 1 y)

Phase 4, Using: Commercially available implementations are being used by real people in the real world.  (Hype Cycle: Slope; 3-Legs: 3; 11 Levels: 5 Useful; Time: 2-3 y)

Phase 5, Refining: The standard, its implementations, and its deployments are being refined.  (Hype Cycle: Plateau; 3-Legs: 3; 11 Levels: 6-10 (not named); Time: 2-4 y)

And beyond that: People are happy with the implementations, and should the question arise about what standard to use, the answer is obvious.  (11 Levels: 11 Delightful; Time: ?)


How are my seven levels of standards adoption any different from the 11 levels of interoperability?  Not by much, really.  What's different here is that I've given phases instead of milestones.

This is important because each phase occurs over time, is entered into by different kinds of stakeholders according to a technology adoption lifecycle, and can have innovators, early adopters, majority adopters, and laggards within it.

Time is interesting to consider here, because standards and technology have sort of a quantum nature.  A standard can exist in several of the phases described above at once, with different degrees of progress in each phase; the only real stipulation is that you cannot be further along in a later phase than you are in an earlier one.

If entry to and exit from each phase were gated on completion of the phase before, the timeline for reaching the refining stage would be about 5 years, but generally one can reach the starting point of the next phase by starting 3 to 6 months after the start of the previous phase.  You may have more work to do to hit a moving target, but you'll wind up with a much faster time to market.

As Grahame points out, getting to the end of the cycle requires much more time in the market driving stage of his three-legged race than it does in the initial parts of it. 

Anytime I've done serious work on interoperability programs, I'm always working on 2-3 related projects in a complete program, because that's the only way to win the race.  You've got to have at least one leg in each place of Grahame's journey.  Otherwise, you'll reach a point of being done, simply expecting someone else to grab the flag and continue on without you.