Wednesday, June 26, 2019

Set Theory Much? Yeah ... me too.


Not is sometimes Knotty, or perhaps nutty.
As I'm building out queries in my FHIR Server, I recall one of the challenges I had the last time I did this: interpreting negation in queries for tests.

Here are a couple of queries, see if you can sort them out the same way I did.  For context, assume that you've got some labs, some radiology, and some other stuff, and perhaps the only way you can find the other stuff (e.g., an EKG) is that it isn't actually coded in any way.  That's sometimes the case for the other stuff after all.


DiagnosticReport?patient=99999&category:not=LAB,RAD
DiagnosticReport?patient=99999&category:not=LAB&category:not=RAD

If you recall, DiagnosticReport.category is multi-valued as well, making it even more interesting.

Before diving in, let's talk about some queries and some data.  Perhaps you have some tests that are EKG results (neither labs, nor radiology).

Now, let's look at it the other way first:

DiagnosticReport?patient=99999&category=LAB,RAD
Returns any report where DiagnosticReport.category is coded using LAB, OR is coded using RAD, or is coded both ways.

Since category is a list (effectively a set of codes), the interpretation here is DiagnosticReport.category intersect (LAB, RAD) is non-null.  Another way to say this is |DiagnosticReport.category intersect (LAB, RAD)| > 0 (where |set| is the cardinality or size operator).

DiagnosticReport?patient=99999&category=LAB&category=RAD
Returns any report where DiagnosticReport.category is coded both as LAB, and as RAD.

And the interpretation here is DiagnosticReport.category intersect (LAB) is non-null AND DiagnosticReport.category intersect (RAD) is non-null.  We could also say DiagnosticReport.category is a superset of (LAB) AND DiagnosticReport.category is a superset of (RAD).  That allows us to join the two conditions as DiagnosticReport.category is a superset of (RAD, LAB), or yet another way: |DiagnosticReport.category intersect (LAB, RAD)| = 2.

Now, throw :not at the problem, and it becomes knotty indeed.

DiagnosticReport?patient=99999&category:not=LAB,RAD
The way I want to read this is that DiagnosticReport.category contains neither LAB nor RAD (or DiagnosticReport.category intersect (LAB, RAD) is null, or |DiagnosticReport.category intersect (LAB, RAD)| = 0).

But what then is this?
DiagnosticReport?patient=99999&category:not=LAB&category:not=RAD
Well, follow the logic (bomb).  DiagnosticReport.category is NOT a superset of (RAD, LAB), or yet another way |DiagnosticReport.category intersect (LAB, RAD)| != 2.  These are the reports that aren't both.

Did that all make sense to you?  Because I'm still scratching my head.

Oh but wait, there's more:  If DiagnosticReport.category is missing, does this work?  Actually, yes, because it would be returned for both queries using :not, which would be correct.
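
If it helps to see the four readings side by side, here is a minimal Java sketch that treats DiagnosticReport.category as a plain set of code strings.  The class and method names (CategoryMatch, anyOf, allOf, noneOf, notAllOf) are mine, made up for illustration; this is not how any particular FHIR server implements search, just the set logic as I read it above.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: the four readings of category and category:not,
// expressed as set operations. All names here are hypothetical.
final class CategoryMatch {

    // |categories intersect codes|
    static int overlap(Set<String> categories, Set<String> codes) {
        Set<String> common = new HashSet<>(categories);
        common.retainAll(codes);
        return common.size();
    }

    // category=LAB,RAD : at least one of the codes is present.
    static boolean anyOf(Set<String> categories, Set<String> codes) {
        return overlap(categories, codes) > 0;
    }

    // category=LAB&category=RAD : every one of the codes is present.
    static boolean allOf(Set<String> categories, Set<String> codes) {
        return overlap(categories, codes) == codes.size();
    }

    // category:not=LAB,RAD : none of the codes is present.
    static boolean noneOf(Set<String> categories, Set<String> codes) {
        return overlap(categories, codes) == 0;
    }

    // category:not=LAB&category:not=RAD : the reports that "aren't both".
    static boolean notAllOf(Set<String> categories, Set<String> codes) {
        return !allOf(categories, codes);
    }

    public static void main(String[] args) {
        Set<String> codes = Set.of("LAB", "RAD");
        // An uncoded EKG (category missing), a lab, and a report coded both ways.
        List<Set<String>> reports =
            List.of(Set.<String>of(), Set.of("LAB"), Set.of("LAB", "RAD"));
        for (Set<String> categories : reports) {
            System.out.printf("%s anyOf=%b allOf=%b noneOf=%b notAllOf=%b%n",
                categories,
                anyOf(categories, codes), allOf(categories, codes),
                noneOf(categories, codes), notAllOf(categories, codes));
        }
    }
}
```

The uncoded EKG (empty set) passes both noneOf and notAllOf, which is the "missing category" case.  The lab-only report passes notAllOf but fails noneOf, and that is exactly the difference between the two :not queries.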

But that probably isn't how you thought you'd write those queries in FHIR, is it?  Yeah, me either.  Or is it me neither?  Either way, I think I've got it right now (and written too).



Monday, June 17, 2019

Telling Time the HL7 Way

If you've never been to an HL7 Working Group Meeting, you'll run into some shorthand at your first one that long-time HL7'ers know and that you'll have to catch up on.  The first is how we split up the day.

Officially, the day has 4 quarters, with breakfast, lunch and two breaks:


  • Breakfast starts 8-ish, and goes until 9:00.
  • Q1 goes from 9-10:30am
  • Morning Break is 10:30-11am.
  • Q2 is from 11am - 12:30pm
  • Lunch goes from 12:30-1:45pm.  There's plenty of time to eat, call home, and take a short meeting.
  • Q3 is 1:45-3pm and is a "short" quarter by 15 minutes.
  • Cookie break is 3-3:30pm.
  • Q4 is 3:30-5pm.


A "Q0" meeting (not part of the official nomenclature, but still well-understood) is before breakfast, usually 7-ish, but could also be "overlapping" with breakfast.

"Q5" and "Q6" are generally "after 5" till about 6:30-ish, and after "Q5" till whenever...  This is often where some good work happens (some would even say "the real work").

If you are doing HL7 meeting stuff from Q0 to Q6, you still have 12 hours for your day job and sleep.  Your mileage may vary.

Monday after 5pm is the cochairs' dinner.  If you want to hang with a cochair, they are likely busy from 5-7:30 or so Monday night.

Wednesday starting around 5:30 is the HL7 Reception.  This goes until about 7:30.

The first half of Monday in September is the Plenary session.

Monday and Tuesday at the WGM in January are the two Payer Summit days.

Connectathons are Saturday and Sunday before the Working group meeting.  Quarters?  Yeah, kinda.  We have them, food shows up at the right times.  But it's a Connectathon, software is ready when it's ready.  Some have been known to work until Q8 or 9, and maybe even start at Q -1.

I wanna say board meetings happen somewhere in Q3 and 4 on Tuesdays, but it's really up to the chair.

Technical Steering Committee (a governance committee) meets Saturday and Sunday.
International Council is Sunday and Thursday Afternoon.
Education Facilitators Lunch is Monday most meetings.

   Keith



Friday, June 7, 2019

What's your Field of View?

When you look at something under a microscope, what you see varies based on the level of magnification.  How much you can see and distinguish fine detail depends essentially upon your field of view.

One of the things that I've been looking at recently is personal health data stored in consumer apps and wearable devices.  Most of the details here amount to a FHIR Observation of some sort, with a code to describe the data element (and a value as a code, or quantity, or perhaps even a waveform).  We know that codes are computer friendly, but they aren't people friendly (and software developers ARE people, regardless of what others might tell you).

So, when everything is an observation, it gets messy for software developers who want nice, easy to remember mnemonics and JSON stuff that is focused right where they are focused.  Things that FHIR can capture and store, but maybe FHIR isn't actually the right place for those working in this space.

PCHA and Continua have some specifications in this space too, but again, NOT easy for developers to use, because once again, too much focus on the terminology, and not on what the developer is trying to do.

We need to find a way to move terminology out of the way.  Open mHealth looks like it's at a better place for this space, but folks who've invested heavily in FHIR and other standards don't agree.  But wait, what if those developers aren't my audience?  What then?

It all depends on your field of view.  And mine, as usual, is many and varied.

   -- Keith



Wednesday, June 5, 2019

Best practices for Logging and Reporting errors in FHIR

Over the years I've developed a number of micro-services implementing and using FHIR APIs.  I've developed a number of best practices for logging and reporting on errors that occur.  Some of these follow.


Logging


  1. If a call to your API is not validly formed, log this as a warning in your service's log.  You detected an error in user input, and handled it properly.  This is NOT an error in your application, it is an error in the calling application.  You DO want to WARN someone that the calling application isn't calling your application correctly.  You don't want to alarm them that your application isn't working right, because in fact, it is working just fine.
  2. If something happened in a downstream API call that prevents the proper functioning of your application (e.g., a database read error), this is improper operation of the system, and is an ERROR preventing your service from operating (even though there's nothing wrong in the service itself), and should be logged as such.  
  3. IF you implement retry logic, then:
    1. Log as warnings any operation that failed but finally succeeded through retry logic.
    2. Log as errors any operation that failed even after retrying.
  4. If an exception was the cause of an error, consider:
    1. If you KNOW the root cause (a value is malformed), say so in the log message, but don't report the stack trace. This will cut unneeded information from your logs, which you will be thankful for later. For example:
      try {
         int value = Integer.parseInt(fooQueryParameter.getValue());
      } catch (NumberFormatException nfex) {
         LOGGER.warn(
            "Foo query parameter ({}) must be a number.",
            fooQueryParameter.getValue());
      }
    2. If you don't know why the error occurred (there could be multiple reasons), do report the stack trace in the log:
      try (PreparedStatement st = con.prepareStatement(query)) {
         ResultSet result = st.executeQuery();
      } catch (SQLException jex) {
         LOGGER.error("Unexpected SQL Exception executing {}",
            query, jex);
         throw new InternalErrorException(...);
      }
    3. Consider pruning the stack trace at the top or bottom.  From the bottom because you know your entry points, and the infrastructure before that probably isn't that useful to you (e.g., tomcat, wildfly).  From the top because details after your code made the call that threw the exception aren't necessarily something you can deal with.
  5. DO report the query used (and where possible, parameter values in the query) in the log. Consider also reporting the database name when using multiple databases. I have often seen database exceptions like "parameter 1 has invalid type" with no query included, and no values.
  6. Consider how you might implement retry logic in cases of certain kinds of exceptions (e.g., database connection errors).
  7. Use delimiters in your logging output format to make it easier to read them in other tools (e.g., spreadsheets).  I often use tab delimiters between the different items in my logging configuration: e.g.,
    %d{yyyy-MM-dd HH:mm:ss.SSS}\t[%thread]\t%-5level\t%logger{36}\t- %msg%n
  8. Consider reporting times in the log in a timezone that makes sense for your implementation (and more importantly, to your customer).  When your customer reports they had a problem at 9:33am, you want to be able to find issues at that time in the logs without having to compute offsets (e.g., from GMT ... do you know yours?).
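
To make the retry guidance in items 3 and 6 a bit more concrete, here is a rough sketch of a retry wrapper that logs a warning when an operation only succeeded after retrying, and an error when it failed on every attempt.  The Retry class and withRetry method are hypothetical names of my own, and I've used java.util.logging (WARNING and SEVERE standing in for warn and error) only so the example runs without extra dependencies.  A real version would likely retry only on exceptions you know to be transient and wait between attempts.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: Retry and withRetry are hypothetical names.
final class Retry {
    private static final Logger LOGGER = Logger.getLogger(Retry.class.getName());

    // Runs the operation up to maxAttempts times and returns its result.
    // This sketch retries on any Exception and does not back off between
    // attempts; both are simplifications.
    static <T> T withRetry(String name, int maxAttempts, Callable<T> operation)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                T result = operation.call();
                if (attempt > 1) {
                    // Failed at first, but finally succeeded: a warning.
                    LOGGER.log(Level.WARNING, "{0} succeeded on attempt {1} of {2}",
                        new Object[] {name, attempt, maxAttempts});
                }
                return result;
            } catch (Exception ex) {
                last = ex;
            }
        }
        // Failed even after retrying: an error, with the last stack trace.
        LOGGER.log(Level.SEVERE, name + " failed after " + maxAttempts + " attempts", last);
        throw last;
    }
}
```

A call might look like Retry.withRetry("provider lookup", 3, () -> lookupProvider(id)), where lookupProvider is whatever downstream call you are protecting.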

Reporting Errors in OperationOutcome

  1. Use 400 series errors like 400 or 422 when the fault is on the part of the client (e.g., invalid operation syntax, or unsupported query combination).  NOTE: HAPI on FHIR will report unsupported query combinations as a 500 series error (which I fix using an interceptor or filter).
  2. Use 500 series errors like 500, 503 or 507 when the fault is on the part of your service.
  3. DO tell the calling user what the problem is in easy to understand language, and if possible, include corrective action they can perform to address the issue.
  4. DO NOT include content from Exception.getMessage().
    I sometimes see this:
    catch (Exception e) {
      outcome.addIssue()
        .setSeverity(IssueSeverity.ERROR)
        .setDiagnostics(e.getMessage());
      throw ...
    }
    This is not good behavior.  You often have no clue what is in e.getMessage(), and often no control.  It can leak information about your technology implementation back to the API user, which can expose vulnerabilities (see below).
  5. DO NOT include the stack trace in the OperationOutcome.  This belongs in your logs, but not in the user response.  See the OWASP Error Handling cheat sheet.
  6. For database errors, you might want to report the kind of database (e.g., patient chart, provider list), but not the exact name of the database.  Again, you want to be clear, but avoid leaking implementation details.
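
Here's one way the rules above might fit together in code.  This is only a sketch: Issue and ClientRequestException are hypothetical stand-ins I made up so the example is self-contained, not HAPI classes, and the message text is just an example.  The idea is that client faults return text you wrote for the caller with a 400, while anything unexpected is logged in full and answered with fixed text and a 500.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: Issue and ClientRequestException are hypothetical
// stand-ins, not HAPI FHIR classes.
final class SafeOutcome {
    private static final Logger LOGGER = Logger.getLogger(SafeOutcome.class.getName());

    // Minimal stand-in for an HTTP status plus one OperationOutcome issue.
    static final class Issue {
        final int httpStatus;
        final String diagnostics;

        Issue(int httpStatus, String diagnostics) {
            this.httpStatus = httpStatus;
            this.diagnostics = diagnostics;
        }
    }

    // Thrown by our own validation code; userMessage is text we wrote
    // ourselves for the caller, so it is safe to send back.
    static final class ClientRequestException extends RuntimeException {
        final String userMessage;

        ClientRequestException(String userMessage) {
            super(userMessage);
            this.userMessage = userMessage;
        }
    }

    static Issue toIssue(Exception ex) {
        if (ex instanceof ClientRequestException) {
            String text = ((ClientRequestException) ex).userMessage;
            // The caller's fault: warn in the log, no stack trace needed.
            LOGGER.log(Level.WARNING, "Rejected request: {0}", text);
            return new Issue(400, text);
        }
        // Our fault (or a downstream one): the stack trace goes to the log,
        // and the caller gets fixed text rather than ex.getMessage().
        LOGGER.log(Level.SEVERE, "Unexpected failure handling request", ex);
        return new Issue(500,
            "The server could not complete the request. Please try again later.");
    }
}
```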

Use Error Codes

Finally, consider creating error codes (which can be reported to the user).  Report the error code WITH the human readable message.  The value of unique error codes is that:
  1. The error code does tell you where in your code the error occurred, but doesn't expose implementation details.
  2. Error codes can be associated with messages in ways that enable translation to multiple languages.
  3. Error codes can also be associated with actions that users can take to correct the error (if it is on their part), or which your operations staff can take to either further diagnose OR correct the error.  For example:  DB001: Cannot access Provider database.
    Then, in your operations guide, you can say things like: "DB001: This message indicates a failure to connect to the Provider database.  Verify that the database services are up and running for the provider database for the customer site ..."
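
A simple way to keep codes, messages and actions together is an enum.  The sketch below is one possible shape, not a prescription: the ErrorCode type and its methods are hypothetical, DB001 comes from the example above, and REQ001 is a second code I invented purely for illustration.  In a real service the message text would more likely live in a resource bundle keyed by the code, which is what makes translation practical.

```java
// Illustrative only: ErrorCode and its methods are hypothetical names.
enum ErrorCode {
    DB001("Cannot access Provider database.",
          "Verify that the database services are up and running for the provider database."),
    // Invented second code, included only to show a client-side error.
    REQ001("A query parameter is not valid.",
           "Correct the parameter value and resend the request.");

    private final String message;
    private final String action;

    ErrorCode(String message, String action) {
        this.message = message;
        this.action = action;
    }

    // Text that can be reported to the user: the code WITH the message.
    String userText() {
        return name() + ": " + message;
    }

    // Corrective action for the user or for operations staff.
    String action() {
        return action;
    }
}
```

ErrorCode.DB001.userText() gives "DB001: Cannot access Provider database.", which is something you can put in the OperationOutcome and in the log, and look up in the operations guide later.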