Tuesday, November 24, 2020

A Moment of Silence

There's no award I could give that would fully recognize the contribution of Bill Majurski to the Cross Enterprise Document Sharing community.  That won't stop me from offering what surely is not enough, but my own recognition nonetheless.

Bill's favorite expression at IHE meetings discussing Cross Enterprise Document Sharing (XDS) was "I'm with the government, and I'm here to help you."  And he was, and he did.

XDS would not be the foundation of our US national infrastructure today if it were not for Bill.  But even more so, it would not be the foundation of national and regional infrastructures around the world.  Bill and his team built the first XDS registry/repository system used for testing XDS in the first year, and the first open source reference implementation, and for the next 15 years, Bill and his tools and teams were there supporting implementers at every IHE Connectathon to follow.

Sadly, as most of you already know, Bill won't be attending any more Connectathons except via TCP/IP over heavenly ether.  I've been hearing Bill's voice in my head as I have code exchanges using XDS and related specifications ... "That's not going to work...", "What I think we need to do is ...", and "Did you check to see if you ...".  I'll still be hearing it for the next 15 years.

I used to joke about the father and mother of XDS, and Bill was always "the mailman" (i.e., the true father).  But that joke doesn't seem right somehow.  Bill was more like the adoptive father, who nurtured, fertilized, pruned and cared for that initial seed, and let it grow into the mighty tree that it is, and from which has now grown a tremendous forest enabling interoperability through the original specification, and many revisions and derivative works thereafter.

And forest is an apt description, as Bill was an avid fan of nature, of biking (the pedal kind), of hiking, and camping, and sharing that experience with the scouts he supported.

EVERY, and I mean EVERY interoperable exchange of a CDA document using nationally recognized standards in the US, and in dozens of other countries works in part because of something Bill did, or continued to do for 15 years afterwards.  If there were a national medal of honor for interoperability, it should go to Bill.  If there were an international recognition of the same esteem, it should go to Bill.

I don't have that capacity; I can only make him a Lord of the Ad Hoc Harley, but he always was one, and all I'm doing now is documenting it.

He is hereby inducted into the 2003 Class of the Lords and Ladies of the Ad Hoc Harley, the year that the XDS Registry was invented by Bill.

Bill Majurski

Semper et semper ascendens deinceps
(ever and ever riding forward)

Monday, November 2, 2020

Data Blocking and Public Health Access through FHIR APIs

 A lot of EHR systems to date have established mechanisms to enable access to patient data via APIs, but there's one challenge that comes to the fore in working through these APIs in the context of accessing data for Public Health.

Many public health operations work with populations ... patients having a reportable condition, patients with COVID, patients with fill in the blank.

And many FHIR connectivity solutions presume that the "App" is going to be integrated via a Provider or Patient directed solution, launched by the EHR or as a standalone application, and will access data ONE patient at a time, and that patient will be known.

This is not so for some public health uses.  For those uses, the app is likely a "Back End Service", the user is likely a system, not a person, and there may not be a provider or patient context to identify what information the system should have access to.  FHIR Bulk data is a pretty good example for this kind of use, where a payer wants to collect information in bulk about patients it is concerned with (their panel of covered lives).  Public health also has such a panel, but not necessarily unique identifiers associated with the populations they are concerned with.  It may instead be a set of characteristics: Patients who are living in, or have been treated in, their jurisdiction.

And so, we see a conflict.  The current APIs attempt to ensure that patient data is not intermingled during API access (hey, this is not just a commonly done thing, it's also a common sense rule to ensure patient safety).  BUT, there needs to be a way to identify and collect data via APIs that allows public health to perform its duties.

Most of the Information Blocking provisions are written with HIPAA as the context.  HIPAA identifies a few significant actors: Covered Entities, Individuals (patients), Public Health, the courts and State/Federal government, to grant them specific rights or responsibilities.  Public Health generally isn't a Covered Entity (in most cases; we'll leave the specific case of local public health providing treatment services -- which they often do -- out of this discussion).

And so where does Public Health fit with regard to Information Blocking?  Can a Provider organization, Health Information Network, or certified developer of Health IT claim under the exceptions listed in 45 CFR 171 that there's a legitimate patient safety, privacy, security, feasibility or performance need to prevent public health from asking the question in simplest FHIR form: Show me all observations where code is in the list of COVID-19 testing results AND date is today, and result is positive?
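That simplest-form question could be expressed as a single FHIR search.  A sketch (the endpoint and value set URL here are hypothetical stand-ins, and the SNOMED CT "Positive" code is just one way to express a positive result):

```python
from datetime import date

# Hypothetical endpoint and value set URL -- stand-ins, not real services.
BASE = "https://ehr.example.org/fhir"
COVID_TEST_VS = "http://example.org/ValueSet/covid19-test-results"

def covid_positive_search(today):
    """Build the 'simplest FHIR form' of the question: all Observations
    whose code is in the COVID-19 testing value set, dated today, with a
    positive result.  (Parameters are left unencoded for readability; a
    real client would percent-encode them.)"""
    query = "&".join([
        "code:in=" + COVID_TEST_VS,     # ':in' matches any code in the value set
        "date=eq" + today.isoformat(),  # 'eq' is the FHIR equality prefix
        "value-concept=http://snomed.info/sct|10828004",  # SNOMED CT 'Positive'
    ])
    return BASE + "/Observation?" + query

url = covid_positive_search(date(2020, 11, 2))
```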

I think this question is quite debatable, and I've seen both sides of this coin.  I KNOW what my answer as an EHR Vendor was, is, and ever would be should I find myself in that role again, and that is NO.  EHR systems CAN provide that access, some vendors have done so, and I would do so again.  BUT, if an EHR vendor doesn't plan for, design for, and account for this need, public health will have to sit in its usual place at the back of the line before it can get access to critical data to do its job.


For what it's worth, I know every EHR has a way to provide access to population data from outside of FHIR APIs.  Arguably, if there is a way to do this safely, feasibly, and securely outside of FHIR, then it can be done using FHIR as well.

Programming Models, Validation and "Continuable" Errors

In writing software, there are two models for validating inputs.

  • Fail on First Error
  • Report All Errors
The first model assumes that an invalid input should halt processing, and be handled by some sort of error handling routine.  The second model assumes that it's better to continue, finding and reporting as many errors as possible to enable correction of all errors.
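A minimal sketch of the two models, using a made-up record shape:

```python
# A made-up record shape, just to contrast the two validation models.
def fail_fast_validate(record):
    """Model 1: halt processing at the first invalid input."""
    if "id" not in record:
        raise ValueError("missing id")
    if "code" not in record:
        raise ValueError("missing code")

def report_all_validate(record):
    """Model 2: continue, collecting every error for one combined report
    (the shape an OperationOutcome's issue list takes in FHIR)."""
    errors = []
    if "id" not in record:
        errors.append("missing id")
    if "code" not in record:
        errors.append("missing code")
    return errors
```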

The first model is completely applicable in production environments, and is supported in programming language constructs by throwing and catching exceptions.  

The second model is applicable when performing validation testing, and enables reporting of not just the first, but all applicable errors found.  The FHIR OperationOutcome resource supports this model of error reporting.

Ancient software source code compilers USED to work the first way, but modern ones report as many errors as they can to enable software developers to correct as many of these errors as they can before trying again.

It makes me wonder if there shouldn't be a programming language construct to support the queuing of exceptions in some way.  

If I go back to modern compilers, and think about how they are able to continue, there's an error handling component that 
a) flags that an error has occurred to ensure that executable code isn't generated (at least at the location of the error ... the Java debugger can still execute classes and methods that contain compile errors), and 
b) "corrects" the input making reasonable assumptions to enable continuation of compilation.

Some things to consider in the construction of a construct to handle continuable errors:

  1. What are the boundaries for queuing and reporting of such a "continuable" error and how would these be set in a software application?
  2. What should happen if some OTHER error was detected that may have been a result of a continuable error?
  3. How would you ensure that software could continue to run after it detected one of these "continuable" errors?
I don't have an idea about how this would look; I'm just thinking about how it would make some jobs easier.
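One rough shape such a construct might take, sketched as a Python context manager (the names and semantics here are my own invention, not an existing language feature):

```python
class ContinuableErrors:
    """A queue for 'continuable' errors.  The 'with' block answers
    question 1 (it is the queuing/reporting boundary); 'caused_by'
    answers question 2 (secondary errors record their likely cause);
    and because report() does not unwind the stack, processing
    continues (question 3) until the boundary is reached."""
    def __init__(self):
        self.errors = []

    def report(self, message, caused_by=None):
        self.errors.append((message, caused_by))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if self.errors:
            raise ValueError("%d error(s): %s" % (
                len(self.errors), "; ".join(m for m, _ in self.errors)))
        return False

# Usage: both errors are queued; one combined failure surfaces at the boundary.
collected = None
try:
    with ContinuableErrors() as errs:
        errs.report("bad token at line 3")
        errs.report("undeclared name 'x'", caused_by="bad token at line 3")
except ValueError as boundary:
    collected = str(boundary)
```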


Tuesday, October 27, 2020

Models of Clinical Decision Support

This is mostly a thinking out loud piece for me to wrap my head around some thoughts related to the work that I've done on Clinical Decision Support, and how that relates to work done recently for the SANER Project, and how to connect that to ECR, and other public health reporting efforts. My first step in this journey is to review what I've already written to see how my approach to CDS, and that of standards has evolved over time.

Some of the more interesting articles I've written on this topic include:

Most relevant to this discussion is the three legged stool of instance data, world knowledge, and computational algorithms from my first article.
The biggest difference in most implementations of clinical decision support is in where the algorithm gets executed, and quite a bit of effort has been expended in this arena.  I originally described this by referencing the "curly brace" problem of Arden Syntax, which describes the challenge of integrating the algorithm for computing a response with a way of accessing instance data.

Here are the key principles:
  1. Separate data collection from computation. (Instance Data from Algorithms)
  2. Use declarative forms that can be turned into efficient computation (Algorithms).
  3. Separate inputs from outputs (Instance Data, Actions and Alerts).
The tricky bit, for which I don't HAVE a principle is in how to identify the essential instance data, which honestly is largely driven by domain knowledge, and this is where MUCH of the nuance about implementing clinical decision support comes into play.  

There are two main approaches to clinical decision support: Integrating it "inside" an information system that has access to the essential data, or moving the data to an information system that can efficiently compute a result.

The former operates on the assumption that if you have efficient access to data, then compute locally (where the data resides), and you can thus skip the need to separate instance data from algorithms that implement knowledge.  The latter requires the separation of instance data from the algorithm to facilitate data movement.

A large distinction between what SANER and Clinical Quality Measurement do and the rest of Clinical Decision Support is based on the difference between systems supporting decision support over population data (data in bulk), and systems making decisions at the level of an individual.

It largely boils down to a question of how to access data efficiently. Different approaches to clinical decision support each approach this in a slightly different way.
  • Quality Reporting Data Architecture (QRDA) defines a format to move data needed for quality measurement to a service that can evaluate measures.
  • Query Health used Health Quality Measure Format (HQMF) to move a query described in declarative form to a data source for local execution, and then move results back to a service that can aggregate them across multiple sources.
  • HQMF itself has evolved from an HL7 Version 3 declarative form to one that is now largely based on the Clinical Quality Language (CQL) which is also a declarative language (and a lot easier to read).
  • Electronic Case Reporting (eCR) uses a trigger condition defined using the Reportable Condition Mapping Table (RCMT) value set to move a defined collection of data (as described in eICR) from a data source to the Reportable Condition Knowledge Management Service (RCKMS) which can provide a reportability response including alerts, actions and information.  RCKMS is a clinical decision support service.
  • CDS Hooks defines hooks that can be triggered by an EHR to move prefetch data to a decision support service using SMART on FHIR, which can then report back alerts, actions and other information as FHIR Resources.
  • SANER defines an example measure in a form which is represented by an initial query, and then filtering of that, using FHIRPath, which may result in subsequent queries and filtering.
One of the patterns that appears in many CDS specifications is about optimization of data flow.  There's an initial signal executed locally, which is used to selectively identify the need for CDS computation.  That signal is represented by a trigger event or condition, driven by either workflow, or a combination of workflow and instance data.  One example of a trigger event is the creation of a resource (row, database record, chart entry, FHIR resource, et cetera) matching a coded criterion (e.g., as in RCMT used with RCKMS for ECR). 
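A sketch of that pattern (the codes, and the collect/compute stand-ins for eICR assembly and RCKMS, are illustrative, not actual RCMT content):

```python
# Illustrative trigger codes -- e.g., SNOMED CT COVID-19 and pertussis.
# The real trigger criterion would come from RCMT value sets.
REPORTABLE_CODES = {"840539006", "27836007"}

def on_create(resource):
    """The cheap local signal: a creation event whose coded value
    matches the trigger criterion."""
    return resource.get("code") in REPORTABLE_CODES

def trigger_collect_compute(resource, collect, compute):
    """Only when the local signal fires do we gather supporting data
    and invoke the (typically remote) decision support service."""
    if not on_create(resource):
        return None  # no trigger, no data movement
    return compute(collect(resource))

# Made-up collect/compute standing in for eICR assembly and RCKMS evaluation.
response = trigger_collect_compute(
    {"code": "840539006"},
    collect=lambda r: {"case": r},
    compute=lambda data: "reportable",
)
```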

The trigger/collect/compute pattern is pervasive not just in clinical decision support, but in other complex (non-clinical) decision support problems that deal with complex domain knowledge.  It has uses in natural language processing software, where it has been used for grammar correction, e.g., to detect a linguistic pattern, evaluate it against domain (and language) specific rules, and then suggest alternatives (or verify correctness).  The goal of this approach is multi-fold: optimization of integration and data flow, and separation of CDS logic (and management thereof) from system implementation.

Population based clinical decision support is often expensive because it may require evaluation of thousands (or hundreds of thousands) of data records, and the more that can be done to reduce the number of records that need to be moved, the more efficiently and quickly such evaluations can be performed.  FHIR Bulk Data Access (a.k.a., Flat FHIR) is an approach to moving large quantities of data to support population health management activities.  It further accentuates the need for optimization of data movement to support population management.

As I think again through all of what has gone before, one of the things missing from my three legged model is the notion of "triggers", and I think these deserve further exploration.  What is a trigger event?  In standards this is nominally a workflow state.  From a CDS perspective, it's the combination of a workflow state associated with a resource matching specific criteria.  The criteria are generally pretty straightforward: this kind of thing, with that kind of value, having a measurement in this range, in this time frame.  And in fact, the workflow state is almost irrelevant -- but is usually essential for determining the best time to evaluate a trigger event.  Consider eCR, for example: you probably don't want to trigger a reportability request until after the clinician has entered all the essential data that you might want to compute with; at the same time, you don't want to wait until after the visit is over to provide a response.  Commonly this sort of thing might be triggered "prior to signing the chart", given that you want to make sure that the data is complete.  However, given that the results may influence the course of treatment or management, a more ideal time might be just before creation of the plan of care for the patient.

A few years back I worked on a project demonstrating the use of "Public Health Alerts" using the Infobutton profile and a web service created by Johns Hopkins APL that integrated with an EHR system developed by my then employer.  We actually used two different trigger events, the first one being after "Reason for Visit" was known, and the second one just before the physical exam, after all symptoms and vital signs had been recorded (if I remember correctly).  This was helpful, b/c the first query was relatively thin on data, but could guide data collection efforts if there was a positive hit, and the second one could pick up with a better data set to capture anything that the first might have missed.

I'm not done thinking all this through, but at least I've got a first start, I'm sure to write more on this later.

Monday, October 5, 2020

HL7 FHIR SANER Ballot Signup Closing October 19

I sent the following e-mail out to a subset of the SANER IG distribution list we maintain internally for folks who have been involved in development of The SANER Project.  I didn't bother to send it to those who work for organizations that had already signed up to participate in the ballot.  For those of you who have been following from afar, this is an opportunity for you to look more closely at what we've been doing for the past 8 months, and contribute your input!


As someone who has expressed interest in, or participated in, the development of the HL7 FHIR Situation Awareness for Novel Epidemic Response Implementation Guide, we are letting you know that this document will soon be published for ballot. 

You will need to sign up BEFORE October 19th, 2020 to be included in the voting pool should you have interest in voting on this implementation guide in the next ballot cycle.

To sign up to participate, go to http://www.hl7.org/ctl.cfm?action=ballots.home.  If you are an HL7 Voting Member for your organization, you will need to log in to see the ballots that you can vote on.  

If you are not a member, you can participate in an HL7 ballot pool by creating an HL7 Profile and paying applicable administration fees (See http://www.hl7.org/documentcenter/public/ballots/2021JAN/Announcements/NonMember%20Participation%20in%20HL7%20Ballots%20Instructions.pdf for details).

Thank you all for your contributions; we have accomplished a tremendous amount of work over the last 8 months, and we hope to see your comments on this implementation guide.  Feel free to pass this information along to others you think should participate in voting on this implementation guide.

Keith W. Boone

Tuesday, September 29, 2020

The SecretLab Titan Chair (Review)

At one point in time in my life I used to manage the service department of a Computerland, and our particular store was chosen by the state to deliver custom software and accommodations enabling individuals with disabilities to use computers.  As part of this, I wound up spending a good bit of time investigating computer ergonomics, and have for some time paid significant attention to the ergonomics of my home workspace, especially after aging a bit necessitated a change in my setup.

I invested in a very good office chair, but that poor chair is not quite a decade old, and the foam on the seat has degraded to the point that I've had to put at least two additional cushions underneath.  It was time for a new chair.  I'm in this thing like eight hours on a regular day, and even more on some of my irregular days, and so money was less important to me than comfort (having thrown my back out twice in the last two quarters).  I spent the time investigating, and finally decided on getting myself the Titan chair from SecretLab (I get nothing if you click that link).

I wasn't impressed with stock on hand when I did my initial search (it was the smaller unit), and I didn't want to wait for a month or more for a new chair (I'm thinking that Amazon has spoiled me).  Fortunately, over the course of a week of research, they actually wound up with both the unit and colors I wanted for immediate shipping, which meant I'd only need to wait a week.  Your mileage may vary, but I can tell you, if I could have seen and tested the chair in a showroom, I would have waited a month, but it's difficult to wait a month for something that you aren't quite sure will do the job when you needed it last week.

The chair arrived today (one day early). UPS dropped it in the middle of my long driveway instead of bringing it to the door.  At about 70 lbs, I can understand why they didn't want to walk it another 50ft to my doorway.  The humor of me having to haul a 70lb box into my house by myself to help my back wasn't lost on me (my old chair only weighed about 45 lbs or so).

The unpack itself confirmed to me that I'd made the right choice in chair, and the build was also good.  The box was huge, well packed and padded, and came with high quality assembly tools (a Phillips screwdriver that I'll likely be using for a while around the house, and a steel hex wrench), and the four steel bolts you need for the bottom, all packed neatly in a foam insert instead of shrink wrapped plastic.  The chair itself went together in less than fifteen minutes, and I spent about a third of that time just moving it from the living room through the narrow hallway into my office (I probably should have done final assembly in my office).

  • This thing is a monster.  The construction is much more impressive than my older office chair, but I would expect that having paid 2.5x as much for it.
  • It's the first chair I've ever owned where, when the hydraulics are all the way up, my feet DON'T lie flat on the floor, which is thrilling.
  • The placement of the included neck pillow, again, not a feature I thought I'd care about, but leaning back to think about something, that's perfectly placed and sized (and completely adjustable)
  • It has a reclining feature which I NEVER would have thought to include as a requirement for an office chair, but having tried it out, I can tell you that yes, I COULD fall asleep in this thing.
  • The armrests (also adjustable up, down, sideways and rotating) were perfectly placed as shipped.  I tried other orientations to be sure.  And again, I have room up, down, sideways or otherwise.
  • It rocks (and that's adjustable).  I twitch and bounce while in deep thought, and there's something comforting about a chair that responds instead of basically being a dead weight.  My old chair did too, but that was more wobble than rock.  And I can adjust from locked in place to free rocking with a simple click.
  • For a simple tweet, my 3 year warranty just became a 5 year warranty.
Frankly, so far, there are only three cons to this thing:
  • I had to wait longer than I wanted for it to arrive.
  • It's expensive [about $450 depending on the model] (but worth it).
  • It's heavy, so if you've done something stupid to throw out your back, have someone to help you move it around.
As with any piece of furniture, the true test will be time.

My youngest is laughing at me, b/c dad has a massive gamer setup for his "work". A powerful laptop, three displays (44" ceiling mounted, 37" desk mounted swivel side monitor, and my laptop screen with a 16" 4K display), a gamer keyboard with blinking LEDs (that I turn off), and to my side, my XBox with noise cancelling headset, and flashy rainbow light wired extra-button controller for when I want to kick back, and now a fancy gamer racing chair.

I could have gone with the D.Va model, but I don't play that character so well, and the pink would clash with my office décor.


P.S. A couple of you asked for a review, so here you have it.

Wednesday, September 23, 2020

Why computers should manage combinatorial explosion in test cases

 In A Test Case Generator for FHIR and SUSHI (and SANER) I wrote about how I'm working on generating test cases, and a little language for test case generation.

Here's one thing (among many) that I encountered.  Data for individual test cases should be dealt with independent from other test cases, so that tests don't interfere with each other.  That's why unit tests have setup and teardown.  But the test case generator is creating data for test cases that will be stored in a FHIR Server, and the FHIR Server cannot necessarily do set up and tear down between each test.

So, I cannot use the same patient for each test case, but rather, each test case must refer to patients created specifically for it, so that all the test data can be loaded into a FHIR Server for use at the same time.

It's a lovely little nuance about integration testing that you don't really have to deal with for unit testing.  I've accounted for it in data production for test cases, but it's made for some pretty interesting challenges, as I now have about four phases for parsing and generating the data.

Parsing of the test model happens in the first phase. I have this working.
The generation step has at least three phases:
  1. Generating the essential resources and their variants.  I should probably talk about test cases and variants, and so will in more detail below.  This step has to be done in a particular order, because encounter cannot talk about patient or location until these two are defined (on purpose, so that I make the test case author deal with ordering, and I don't have to deal with forward references to stuff that doesn't exist).  I have this working.
  2. Generating Sushi code for each variant needed.  I have this working.
  3. Packaging a set of resources into a bundle for each variant of the test case.  I'm working on this now.

Test Cases and Variants

For my purposes, a test case is a package of data needed to test a measure group: A bundle of resources.  I have a test case with an encounter, patient and location with the following linkages:

Patient patient (stands alone)
Location location (stands alone)
Encounter.patient  references Patient/patient
Encounter.location references Location/location

I have multiple measure groups, and I want to test the different facets of inclusion/exclusion criteria for the group.  So, an encounter in a test case might be "in-progress" or "entered-in-error".  These are two variations for the test case for one of the encounter measure groups.  If the test case is TestCase1, these variations will be labeled something like TestCase1a and TestCase1b, to distinguish them.  The bundles will be different for each variation.

Also, while each bundle might contain multiple linked resources (e.g., patient, location, and encounter),
the patient, location and encounter in each bundle must be distinct from the patient, location and encounter resources in other variants.  

So, with three possible variations on location, two on encounter, and one on patient, we'll see six (3 x 2 x 1) different cases for Encounter in the bundles.

Bundle1: TestCase1a
Encounter11.patient references Patient/patient11
Encounter11.location references Location/location11

Bundle2: TestCase1b
Encounter12.patient references Patient/patient12
Encounter12.location references Location/location12

Bundle3: TestCase1c
Encounter13.patient references Patient/patient13
Encounter13.location references Location/location13

Bundle4: TestCase1d
Encounter21.patient references Patient/patient21
Encounter21.location references Location/location21

Bundle5: TestCase1e
Encounter22.patient references Patient/patient22
Encounter22.location references Location/location22

Bundle6: TestCase1f
Encounter23.patient references Patient/patient23
Encounter23.location references Location/location23
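The enumeration above can be generated mechanically.  A sketch, with illustrative variant values standing in for the real inclusion/exclusion criteria:

```python
from itertools import product

# Illustrative variant values; the structure (3 x 2 x 1) matches the
# example above.
location_variants = ["active", "suspended", "inactive"]   # 3 variations on location
encounter_variants = ["in-progress", "entered-in-error"]  # 2 on encounter
patient_variants = ["default"]                            # 1 on patient

# Each bundle would get its own distinct patient/location/encounter ids
# (patient11, location11, ...) so variants can coexist on one FHIR server.
bundles = [
    {"label": "TestCase1" + chr(ord("a") + i),  # TestCase1a .. TestCase1f
     "location": loc, "encounter": enc, "patient": pat}
    for i, (loc, enc, pat) in enumerate(
        product(location_variants, encounter_variants, patient_variants))
]
```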

Tuesday, September 22, 2020

A Test Case Generator for FHIR and SUSHI (and SANER)

I've often heard the complaint of combinatorial explosion with respect to creating test cases to fully test a system.  The problem is acute.  One part of the solution is good analysis, but the other part of it is automation.

It must be my week for mini-languages, because here is another example of a mini-language, this time used for test case generation.  I think I might have caught the language virus.

TestCase Case1:
    Patient patient X 30 with (
        identifier.value = Identifier,
        identifier.system = "http://sanerproject.org/testdata/patients",
        name.given in "firstnames",
        name.family in "lastnames",
        gender in "genders",
        birthDate within '@1930-09-09' to '@2020-09-09'
    // This is a set of common last names, it is purposefully of prime length
    "lastnames": {
    // This is a set of first names that are gender free, also of prime length
    // and mutually prime with the set of last names.
    "firstnames": {
    "genders": {

This example says: Generate a test case (in a bundle) containing the resource "Patient" with identifier "patient" and do it 30 times.  Take the identifier.value from an autoincrementing counter.  Set the identifier.system to a fixed value.  Pull given and family names from predefined list of values, iterating over them until done.  Take gender from another list with only two codes.  Generate birth dates from a range of values.
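The prime-length comments in the example hint at why those list sizes matter: cycling two lists of mutually prime lengths in lockstep visits every pairing before any pair repeats.  A small sketch with illustrative names:

```python
from itertools import cycle, islice

# Cycling two lists of coprime lengths m and n in lockstep produces all
# m * n pairings before any pair repeats, since the combined cycle
# length is lcm(m, n) = m * n.
firstnames = ["Avery", "Jordan", "Parker"]                # length 3 (illustrative)
lastnames = ["Garcia", "Jones", "Smith", "Brown", "Lee"]  # length 5

pairs = list(islice(zip(cycle(firstnames), cycle(lastnames)), 3 * 5))
```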

Now, if that was all my language did, you'd not be terribly impressed (or at least I wouldn't be).

But, what if you could generate multiple resources, and link them correctly by identifiers?  Now we are starting to get somewhere, but it still isn't all that much better than what one can already do with an Excel spreadsheet (as we did manually for the first set of test data for SANER automation).

But I also need test cases where I have encounters with and without reasonReference as variations, with and without reasonCode values matching a certain value set, and observation and condition resources that match or don't match selection criteria.

So, what if I could specify variation within a field like this:
Patient patient1 
    /* as before */

Condition condition1 with (
   code in COVID19Diagnosis OR in NotACovid19Diagnosis,
   patient = patient1

Encounter encounter1 with (
   reasonReference = condition1 OR missing,
   reasonCode in COVID19Diagnoses OR missing,
   subject = patient1

And what if the test case generator spit out six different bundles, where each bundle contained a patient1, condition1 and encounter1 meeting all the appropriate mixes of criteria?

  1. condition1 with COVID19Diagnosis, encounter1 with reasonReference to condition1 and reasonCode in COVID19Diagnosis
  2. condition1 with COVID19Diagnosis, encounter1 with reasonReference to condition1 and reasonCode missing
  3. condition1 with COVID19Diagnosis, encounter1 with reasonReference missing and reasonCode in COVID19Diagnosis
  4. condition1 with NotACovid19Diagnosis, encounter1 with reasonReference to condition1 and reasonCode in COVID19Diagnosis
  5. condition1 with NotACovid19Diagnosis, encounter1 with reasonReference to condition1 and reasonCode missing
  6. condition1 with NotACovid19Diagnosis, encounter1 with reasonReference missing and reasonCode in COVID19Diagnosis
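My reading of those six bundles is the full product of the varied fields, minus the combinations where both reasonReference and reasonCode are missing (which would exercise nothing new).  A sketch of that enumeration:

```python
from itertools import product

condition_codes = ["COVID19Diagnosis", "NotACovid19Diagnosis"]
reason_refs = ["condition1", "missing"]
reason_codes = ["COVID19Diagnosis", "missing"]

# Full product is 2 x 2 x 2 = 8; dropping the two cases where the
# encounter carries neither link yields the six bundles above.
variants = [
    (cond, ref, code)
    for cond, ref, code in product(condition_codes, reason_refs, reason_codes)
    if not (ref == "missing" and code == "missing")
]
```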

OK, now we are talking about something useful.
And if patient1 can vary (non-essentially) across these encounters, and so can encounter location, we've gone a step better.

Yes, this could produce a muck-ton of data.  But, the computer did it, you didn't have to.  All you had to do was give it the correct instructions to do something useful, and it produced something that you can use.

Anyway, more on this later as I continue my experiments.   So far, this was about two days of work, and what I have to show for it is this sample output:

Instance: patient3
InstanceOf: Patient
Description: "Generate sample patients with random characteristics"
* birthDate = "1964-06-09"
* extension[0].extension[0].url = "ombCategory"
* extension[0].extension[0].valueCoding = urn:oid:2.16.840.1.113883.6.238#2054-5 "Black or African American"
* extension[0].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
* extension[1].extension[0].url = "ombCategory"
* extension[1].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-ethnicity"
* gender = #male
* identifier.system = "http://sanerproject.org/testdata/patients"
* identifier.value = "3"
* name.family = "Williams"
* name.given = "Drew"
* name.given[1] = "Taylor"
Instance: patient4
InstanceOf: Patient
Description: "Generate sample patients with random characteristics"
* birthDate = "1975-09-09"
* extension[0].extension[0].url = "ombCategory"
* extension[0].extension[0].valueCoding = urn:oid:2.16.840.1.113883.6.238#2076-8 "Native Hawaiian or Other Pacific Islander"
* extension[0].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
* extension[1].extension[0].url = "ombCategory"
* extension[1].extension[0].valueCoding = urn:oid:2.16.840.1.113883.6.238#2135-2 "Hispanic or Latino"
* extension[1].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-ethnicity"
* gender = #female
* identifier.system = "http://sanerproject.org/testdata/patients"
* identifier.value = "4"
* name.family = "Brown"
* name.given = "Kennedy"
* name.given[1] = "Jordan"
Instance: patient5
InstanceOf: Patient
Description: "Generate sample patients with random characteristics"
* birthDate = "1986-12-09"
* extension[0].extension[0].url = "ombCategory"
* extension[0].extension[0].valueCoding = urn:oid:2.16.840.1.113883.6.238#2106-3 "White"
* extension[0].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
* extension[1].extension[0].url = "ombCategory"
* extension[1].extension[0].valueCoding = urn:oid:2.16.840.1.113883.6.238#2186-5 "Non Hispanic or Latino"
* extension[1].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-ethnicity"
* gender = #male
* identifier.system = "http://sanerproject.org/testdata/patients"
* identifier.value = "5"
* name.family = "Jones"
* name.given = "Parker"
* name.given[1] = "Avery"
Instance: patient6
InstanceOf: Patient
Description: "Generate sample patients with random characteristics"
* birthDate = "1998-03-10"
* extension[0].extension[0].url = "ombCategory"
* extension[0].extension[0].valueCoding = http://terminology.hl7.org/CodeSystem/v3-NullFlavor#UNK "Unknown"
* extension[0].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
* extension[1].extension[0].url = "ombCategory"
* extension[1].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-ethnicity"
* gender = #female
* identifier.system = "http://sanerproject.org/testdata/patients"
* identifier.value = "6"
* name.family = "Garcia"
* name.given = "Ryan"
* name.given[1] = "Brooklyn"
Instance: patient7
InstanceOf: Patient
Description: "Generate sample patients with random characteristics"
* birthDate = "2009-06-09"
* extension[0].extension[0].url = "ombCategory"
* extension[0].extension[0].valueCoding = http://terminology.hl7.org/CodeSystem/v3-NullFlavor#ASKU "Asked but no answer"
* extension[0].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
* extension[1].extension[0].url = "ombCategory"
* extension[1].extension[0].valueCoding = urn:oid:2.16.840.1.113883.6.238#2135-2 "Hispanic or Latino"
* extension[1].url = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-ethnicity"
* gender = #male
* identifier.system = "http://sanerproject.org/testdata/patients"
* identifier.value = "7"
* name.family = "Miller"
* name.given = "Cameron"
* name.given[1] = "Logan"

Monday, September 21, 2020

FluentQuery for FHIRPath

The SANER Project has been using FHIRPath to define situational awareness measures.  It needs to query at least one FHIR server to evaluate data for the measure, and the server to query is something the implementer can define.  Although perhaps I should say "servers" to query, because it's possible that data may be available from more than one location, as a recent commenter on our measures suggested.

For example, in some places, laboratory data may be available from multiple sources: the hospital lab and a health information exchange, for example.  The existence of any lab data from either would be sufficient to rule a patient into (or out of) a measurement.  We are using the resolve() method to resolve the content of these queries (and yes, I have taken into account pagination in my own implementation of resolve), which leads to measures that are somewhat unreadable for those not deeply versed in FHIR search.
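As an aside, handling pagination in resolve() boils down to walking the Bundle's next links until there aren't any.  A sketch, with the HTTP fetch stubbed out as a function parameter (Page and the fetch signature are my stand-ins, not a real API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch of pagination-aware resolution: follow the Bundle's "next" link
// until it is exhausted, accumulating entries.  Page stands in for a parsed
// Bundle, and fetch stands in for an HTTP client that GETs and parses one page.
public class Paginator {
    public record Page(List<String> entries, String nextUrl) {}

    public static List<String> resolveAll(String url, Function<String, Page> fetch) {
        List<String> all = new ArrayList<>();
        while (url != null) {
            Page page = fetch.apply(url);   // in reality: GET url, parse the Bundle
            all.addAll(page.entries());
            url = page.nextUrl();           // Bundle.link.where(relation = 'next').url
        }
        return all;
    }
}
```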

Here are a couple of examples:

( %Base + 'Encounter?' +
  '_include=Encounter:subject&_include=Encounter:condition&' +
  '_include=Encounter:reasonReference' +
  '&status=in-progress,finished' +
  '&date=ge' + %ReportingPeriod.start.toString() +
  '&date=lt' + %ReportingPeriod.end.toString()
).resolve()

( %Base + 'Observation?_count=1' +
  '&status=registered,preliminary,final,amended,corrected' +
  '&patient=' + $this.id +
  '&date=gt' + (%ReportingPeriod.start - 14 'days').toString() +
  '&code:in=' + %Covid19Labs.url +
  '&value-concept:in=' + %PositiveResults.url
).resolve().select(entry.resource as Observation)

Sure, you can read these.  Of course you can, but what about your staff, leadership, or customers?  You'd probably have to explain that the first looks for Encounters, and their referenced subject, condition and reasonReference resources, where the Encounter is either in-progress or finished and the date is within the reporting period.  The second is an existence test that succeeds if there is at least one Observation resource for a patient, dated no earlier than two weeks before the start of the reporting period, showing a positive result on a COVID lab test.

But wouldn't something like below be MUCH easier for not just you, but also for your analysts, leadership and customers to read?


    whereExists('Observation',
        for('patient', $this),
        with('status').equalTo('registered' | 'preliminary' | 'final'
            | 'amended' | 'corrected'),
        with('date').greaterThan(%ReportingPeriod.start - 14 'days'),
        with('code').in(%Covid19Labs.url),
        with('value-concept').in(%PositiveResults.url)
    ).onServers(%Base)
Of course it probably would.  The Reference Implementation of FHIRPath that Grahame developed provides support for custom functions, but unfortunately, custom functions don't have access (yet) to the left-hand side (the focus) of the expression.  I'm in the process of fixing that on my fork of his FHIRPathEngine code.

Once they do, here's how I see this working.  A query builder expression returns a URL that is a query being built, taking as input any part of the previously constructed query.  For the most part, these are simply specialized concatenations.

There are two functions, findAll() and whereExists(), that start a query builder expression.  These functions return a string containing the currently built URL.  findAll('resourceType') would simply return the string 'resourceType?', while whereExists('resourceType') would return 'resourceType?_count=1' (supporting an existence test).
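In the lightweight interpretation, both starters are just string concatenation.  A sketch (the function names come from the text above; the implementation is mine):

```java
// Minimal query-builder sketch: findAll and whereExists start a search URL
// string, and each subsequent query function contributes one parameter.
public class FluentQueryStarters {
    public static String findAll(String resourceType, String... params) {
        return append(resourceType + "?", params);
    }

    public static String whereExists(String resourceType, String... params) {
        // _count=1 makes this usable as an existence test.
        return append(resourceType + "?_count=1", params);
    }

    private static String append(String base, String[] params) {
        if (params.length == 0) return base;
        return base + (base.endsWith("?") ? "" : "&") + String.join("&", params);
    }
}
```

So findAll("Encounter", "status=finished") yields "Encounter?status=finished", and whereExists("Observation") yields "Observation?_count=1".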

The returned URL is evaluated by an onServers() function which is like resolve() except that onServers takes a list of base urls and the search is executed on each combination of search url and server (with pagination resolved).  In the case of "whereExists()", resolution is allowed to stop after finding the first matching resource on any server (although the queries might be executed in parallel).

onServers() is effectively equivalent to resolve().select(entry.resource), but there's one little bit of extra juice, because it can do this for more than one server at a time.  onServers() is the principal reason that my function set needs access to the focus, so that it can produce the cartesian product of the URLs and the server base addresses.  And if there are multiple queries, onServers could be smart enough to send a batch query or use FHIR bulk data to get its results.

The work above is really out of scope for SANER, although we might consider including it as an appendix for others to consider (that's probably the easiest way to address a couple of issues in the ballot with respect to scope, making sure we've got it documented, but not requiring it to be used for successful implementation).

One of the values of FluentQuery is that it provides a means for someone to write a query without it being completely dependent on how the query URL is constructed.  The lightweight implementation can construct a URL as it goes (and that will be the first implementation that I write).  But other implementations could do some smart things like:

  1. Limiting queries to what a FHIR Server supports, and handling some of the filter parameters differently.  
    1. :in and :not-in
      Not every server supports code:in queries on Observation for example.  But it's a really valuable way to simplify the writing of the query.  
    2. _has
      Rewriting has queries for servers that don't support them.
Resource1?_has:Resource2:name=value can be handled by first querying for Resource2?name=value&_elements=name with a post filter that collects all the references, and then performing Resource1?_id=Reference1,Reference2, et cetera.
  2. Return lists that defer pagination of results until needed.
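To make item 1.2 concrete, here's a sketch of the two-pass _has rewrite.  The parsing is deliberately naive, and I've used the standard three-part _has:Type:refField:name=value shape rather than the abbreviated form above; it's meant only to show the shape of the fallback:

```java
import java.util.List;

// Sketch of rewriting Resource1?_has:Resource2:refField:name=value for
// servers that don't support _has: first query Resource2 directly, collect
// its references back to Resource1, then fetch those by _id.
public class HasRewriter {
    // Pass 1: the query that finds the referencing resources.
    // refField is the element to pull back so references can be collected.
    public static String firstPass(String hasQuery, String refField) {
        String[] parts = hasQuery.split("\\?_has:", 2);     // ["Patient", "Observation:patient:code=1234-5"]
        String[] rest = parts[1].split(":", 3);             // ["Observation", "patient", "code=1234-5"]
        return rest[0] + "?" + rest[2] + "&_elements=" + refField;
    }

    // Pass 2: fetch the referenced resources collected from pass 1.
    public static String secondPass(String hasQuery, List<String> ids) {
        String targetType = hasQuery.split("\\?", 2)[0];
        return targetType + "?_id=" + String.join(",", ids);
    }
}
```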

Search Functions

findAll(ResourceType [, QueryFunctionExpression]*)

The findAll function constructs a relative search URL for the specified resource type, appending the queries specified by any of the QueryFunctionExpression values separated by an &, and returns this in a string.  When executed, this query will return all results including those returned via pagination.

whereExists(ResourceType [, QueryFunctionExpression]*)

The whereExists function constructs a relative search URL for the specified resource type, adds _count=1 to it, and appends the queries specified by any of the QueryFunctionExpression values separated by an &, and returns this in a string.  This query is used for existence tests.

Query Execution Functions

onServers([reliability, ] Servers*)

The onServers function executes a search on the specified servers.  Servers is a list of fully qualified base URLs.  findAll queries resolve the data on all servers (including all pages from each server), and then select entry.resource from all returned Bundle resources.  A whereExists query stops after finding at least one match on any server, returning the first matching resource (and any included resources associated with it).  Implementations are free to execute searches serially or in parallel.

The optional reliability parameter indicates how to handle failures during a search, and can take the value 'skip' or 'fail'.  Skip means that if any server fails to respond, or responds with an error code (or exception of some sort), resolution acts as if that server returned a Bundle containing a single OperationOutcome resource describing the kind of error that occurred.  This ensures that queries succeed, but there may be missing data.  Fail means that if any server fails to respond, the expression throws an exception.  This gives expression writers some limited control over how to handle error conditions when a server is being queried.  Implementations should consider retrying failed queries in either case.
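The skip/fail behavior can be sketched as a simple loop over servers, with the actual search stubbed out (the OperationOutcome here is just a string marker standing in for the real resource):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch of onServers reliability handling: run the search against each
// server; 'skip' substitutes an OperationOutcome-like marker for a failed
// server, while 'fail' rethrows the failure.
public class OnServers {
    public static List<String> execute(String query, List<String> servers,
                                       String reliability,
                                       Function<String, List<String>> fetch) {
        List<String> results = new ArrayList<>();
        for (String base : servers) {
            try {
                results.addAll(fetch.apply(base + query));   // run the search on this server
            } catch (RuntimeException e) {
                if ("fail".equals(reliability)) throw e;     // fail: propagate the error
                // skip: record the failure as data and keep going
                results.add("OperationOutcome: " + base + " failed: " + e.getMessage());
            }
        }
        return results;
    }
}
```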


Query Parameter Name Functions

The query parameter name functions construct the first half of a query parameter to add to the query URL.  They simply return the name as a string.


with(Name)

The with function constructs the first half of a named query parameter.  It should be followed by a Query Parameter Value Function to construct the second half (the part containing the equals sign).

for(Name, Reference|Resource|Identifier)

The for function constructs a complete query parameter that matches a resource or identifier.

If the first parameter is a reference or resource id, the query parameter is written as Name=Reference.

If the second parameter is a resource, this is converted to a reference to that resource, and treated as above.

If the second parameter is an identifier, this is converted to a reference by identifier search, and written as Name:identifer=Identifier


including(Names*)

The including function specifies what other resources should be included.  If any value in Names doesn't start with a resource type, it is prepended with the type of resource specified in the query.


has(Name)

The has function constructs the first half of a named query parameter supporting _has searches.  The value will be _has:Name.  Chained _has searches are possible.  It should be followed by a Query Parameter Value function to construct the second half (the part containing the equals sign).

Query Parameter Value Functions

Query parameter value functions produce the second half of a query parameter (the part containing the equals sign).

A value can be a primitive, Quantity, Coding, CodeableConcept, or Reference type.

For Quantity type, value will be expressed in number|system|code form as required by Quantity parameters.  If system and code are empty, but unit is present, a Quantity value will be expressed as number||unit with no system.  If none of system, code, or unit is present, a Quantity value will be expressed as number.
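Those Quantity rules reduce to a three-way branch.  A sketch, with Qty standing in for FHIR's Quantity data type (only the fields used here are modeled):

```java
// Sketch of the Quantity parameter encoding described above.  Qty is a
// stand-in for FHIR's Quantity; a real implementation would read the same
// fields from the actual data type.
public class QuantityEncoder {
    public record Qty(String value, String system, String code, String unit) {}

    public static String encode(Qty q) {
        if (q.system() != null && q.code() != null) {
            return q.value() + "|" + q.system() + "|" + q.code();   // number|system|code
        }
        if (q.unit() != null) {
            return q.value() + "||" + q.unit();                      // number||unit, no system
        }
        return q.value();                                            // bare number
    }
}
```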

For Coding type, value will be expressed in system|code form as required of Token parameters.

Period parameter values work with Date, DateTime, Instant and Period data types.  A date promotes to an appropriate Period in these cases.


equalTo(Values*)

Appends =Values[1],Values[2],...,Values[n] to the query parameter.  Note that Values can repeat, or be a list, or both.


equalToComposite(Value1, Value2)

Appends =Value1$Value2 to the query.


notEqualTo(Values*)

Appends =neValues[1],neValues[2],...,neValues[n] to the query parameter.  Note that Values can repeat, or be a list, or both.


greaterThan(Value), greaterThanOrEqualTo(Value), lessThan(Value), lessThanOrEqualTo(Value), approximately(Value)

Appends =prefixValue to the query parameter, where prefix is gt, ge, lt, le, or ap appropriately.  Value must be a singular value.  

startsAfter(Period), endsBefore(Period)
If Period is any date type, promotes that to a Period first.
Appends =prefixValue to the query parameter, where prefix is sa or eb respectively, and Value is Period.start or Period.end respectively.


If Period is any date type, promotes that to a Period first.
Given the query parameter name name, appends =gtPeriod.start&name=ltPeriod.end to the query parameter.  If end is not present (a period open at the end), then it appends only =gtPeriod.start to the query parameter (this is one of the functions that needs access to the focus).

not(token), text(token), above(token|uri), below(token|uri), in(uri), notIn(uri)

Appends :modifier=token|uri to the query parameter, where modifier is not, text, above, below, in, or not-in appropriately. 

Wednesday, September 16, 2020

Generating your own FHIR Narrative with XML using SUSHI

Building an implementation guide relies on a stack of transformations.  It's a good thing we know how to stack things up on top of each other these days.  But what happens when something in the stack isn't quite adequate?  How do you manipulate things to make it all work?

This example poses a particularly interesting challenge:  I want to be able to generate my own Narrative from the XML version of the FHIR Resource.  I won't have the XML when SUSHI is done, only after the IG Builder completes.  I can easily get the XML version from the JSON version by running this little program:

    // xp and jp were implicit in the original; assuming HAPI FHIR parsers:
    private static final FhirContext ctx = FhirContext.forR4();
    private static final IParser xp = ctx.newXmlParser().setPrettyPrint(true);
    private static final IParser jp = ctx.newJsonParser().setPrettyPrint(true);

    public static void main(String args[]) {
        IParser op = null;
        IParser ip = null;
        for (String arg: args) {
            IBaseResource r;
            File fin = new File(arg);
            File fout;
            if (arg.endsWith(".json")) {
                fout = new File(fin.getParent(), fin.getName().replace(".json", ".xml"));
                op = xp;
                ip = jp;
            } else if (arg.endsWith(".xml")) {
                fout = new File(fin.getParent(), fin.getName().replace(".xml", ".json"));
                op = jp;
                ip = xp;
            } else {
                System.err.println("Do not know how to convert " + arg);
                continue;   // skip files we don't recognize
            }
            try (FileReader fr = new FileReader(fin); FileWriter fw = new FileWriter(fout)) {
                r = ip.parseResource(fr);
                op.encodeResourceToWriter(r, fw);
            } catch (IOException e) {
                System.err.printf("Cannot convert %s: %s%n", arg, e.getMessage());
            }
        }
    }

But then, how do I insert my narrative into the build process?  If it has to happen after SUSHI runs, then the IG Builder won't execute it.  I'm only doing this for ONE resource (or perhaps a few), not the 97 or so that I otherwise generate using SUSHI.  It will be updated infrequently, and I don't want to give up the convenience of having the automated SUSHI run in IG Publisher do most of the work for me.

But hey, if you can make your transform stack run in a circle the way the Crazy Russian does with Pringles, you can make it work.

Here's what I decided to do:

In the Resource for which I want generated text, I include the following:

Instance: ComputableCDCPatientImpactAndHospitalCapacity
InstanceOf: PublicHealthMeasure
Title: "Computable CDC Patient Impact and Hospital Capacity"
* insert ComputableCDCPatientImpactAndHospitalCapacityText

Then I create a RuleSet named ComputableCDCPatientImpactAndHospitalCapacityText and modify my XSLT to generate that ruleset instead of content of the DIV.  The template to generate the ruleset looks like this:

<xsl:template match="/">
    <xsl:text>RuleSet: </xsl:text><xsl:value-of select="fhir:Measure/fhir:id/@value"/><xsl:text>Text&#xA;</xsl:text>
    <xsl:text>* text.status = #generated&#xA;</xsl:text>
    <xsl:text>* text.div = """&#xA;</xsl:text>
    <div xmlns="http://www.w3.org/1999/xhtml">
        <xsl:apply-templates select="/fhir:Measure"/>
    </div>
    <xsl:text>&#xA;"""&#xA;</xsl:text>
</xsl:template>

Finally, my process to update the generated narrative is to do the following:

  1. Generate the resources using SUSHI
  2. Run my XSLT to generate the Narrative
  3. Rerun the IG Publisher
Basically, I'm using SUSHI to generate an input to a second run of SUSHI (run by the IG Publisher).

I'm not happy with this.  There should be a way to insert this generation step somewhere else in the tool chain, and update the tool chain (perhaps with templates) so that I don't have to rely on the FHIR Generated narrative, but can rather build my own from the XML resource.

It's a bit uglier than that even, because some of the resource content (e.g., description, definition, et cetera) is actually Markdown, rather than HTML, and I had originally used markdown tags to support bulleted lists or definition lists and links in some of that content where appropriate.  I punted on that problem by simply replacing that limited markup with its (again limited) HTML equivalents since HTML markup is allowed in Markdown.

Wednesday, September 9, 2020

The Bug that Almost Got Away

I discovered a tricky issue in bed counting while finishing up the automated measure for bed counts.

The tricky bit is that what you are counting is beds (locations), but what you can find for the reporting period (via a FHIR query) is encounters, and that's what I almost counted.

Here's where it gets ugly.  A patient can have multiple encounters in the same day.  And they can be in different beds (locations) during the same encounter.

It can be worked out so long as a few simplifying assumptions can be made.

  1. Locations referenced by an Encounter are ordered from most to least recent.
  2. Patients can only be in one location at a time.
  3. Each location can handle only one patient.
  4. Encounters are ordered from most to least recent.
  5. If an encounter is active, then the most recent location is occupied by the patient.
Let L be a set of locations and encounters
For each active encounter E during the reporting period:
    if E.location.location is not in L
        Add E to L
        Add E.location.location to L
    end if
End For

At the end of the loop, L will contain:
  1. All occupied locations.
  2. The most recent encounter to cause that location to be occupied.
Here's almost the same thing in FHIRPath.

  iif($total.select(location[0]).location contains $this.location.location.first(),
      {},$total | $this)

Three lines of precious brevity. 

There were so many places I could have gone wrong, and I won't actually know it works until maybe Thursday.  But at least now I'll be counting occupied beds at the end of the period, rather than every bed that was occupied at any point during the period.
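For what it's worth, the same most-recent-location deduplication is easy to check outside of FHIRPath.  A sketch in Java, with Enc as a pared-down stand-in for Encounter (just an id and its ordered location references):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the occupied-bed count: walk active encounters from most to
// least recent, keeping only the first encounter seen for each current
// location, per the simplifying assumptions above.
public class BedCounter {
    public record Enc(String id, List<String> locations) {}  // locations[0] = most recent

    public static List<Enc> occupiedBeds(List<Enc> activeEncounters) {
        Set<String> seen = new LinkedHashSet<>();
        List<Enc> kept = new ArrayList<>();
        for (Enc e : activeEncounters) {
            String bed = e.locations().get(0);   // most recent location
            if (seen.add(bed)) {                 // location not already occupied
                kept.add(e);
            }
        }
        return kept;
    }
}
```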

Tuesday, August 25, 2020

The Art of Writing Implementation Guides

The term Implementation Guide is a "term of art" in my world.  It has a particular, specialized meaning.  It's a document that principally tells its users how to implement and use a standard.

But if you get right down to it, the term itself also has a meaning that comes quite simply from the meaning of the words.  It's a guide to implementation.  Consider the key word here "Guide".  It's both a noun and a verb, where the noun describes one who "guides", and a guide is one who:

  1. leads or directs,
  2. exhibits or explains, or
  3. directs the course of another.
If you lead, direct, or exhibit to someone without providing an explanation of why your course is a good one, you have failed.  Yet so many implementation guides leave out the rationale for doing things the way that the guide suggests.  This is the art of good implementation guide writing.

A simple formula for writing is "Do this, because that".  The "because" will help explain your rationale.  
Have consideration for the audience for your implementation guides.  Most of your readers will not have gone through the discourse that you have on the topic at hand.  A guide should explain why when the answer isn't immediately obvious, so that users can follow your reasoning.  The big challenge for implementation guide authors is understanding what isn't immediately obvious.  Your reader isn't a five year old; the answer has to be better than "because I said so (or a team of experts said so)."  But as you write, do think like a five year old, and ask yourself a why to go with every one of your wherefores.

Consider the following example:
  1. A measure shall have a human readable name.
  2. The name shall be unique among measures published by the same organization and should be unique from the names of measures published by others.
Compare it with these instead:
  1. A measure shall have a human readable name that explains what is measured.
  2. The name shall be unique among measures published by the same organization so that users can distinguish between different measures.  It should be unique from the names of measures published by others for the same reason, but it is understood that this is not under the control of an individual publisher.

It only takes a little bit more effort, but including your rationale does two things: It educates, explaining your reasoning to your audience, and it sells that audience on the constraints that your guide imposes.  It's much easier to get good implementations when your audience agrees with your reasons, and also remembers them.

Sometimes a guide has to make arbitrary choices.  In these cases, simply explain that while there are two options, the guide chooses option A over option B to ensure that the thing is done in only one way.  Note that if there are two choices, A and B, and you've chosen A, you've said "Do A, NOT B".  It might be helpful to say it both ways as an aid to memory.  In these cases, express the positive case first because the addition of a negative adds cognitive effort.

    Two ways are commonly used to report an organism detected, however, this guide only allows for one of these to ensure consistency. This guide requires that the organism being identified be encoded in the test code, and the test result be encoded in the test value to ensure consistency among implementations.  An implementation shall not use codes which express a test for an organism, followed by a value describing the organism being tested for.

If you must allow both choices, consider explaining why, and when it is appropriate to pick one vs the other.

    Client applications may use XML or JSON to interact with the server.  The client should choose the implementation format which best fits their processing model. JSON is more compact, but sometimes harder for a person to read.  XML is more verbose.  

FWIW: I know this better than I follow it in my own writing.