Healthcare Standards: 2021

Friday, December 3, 2021

New Year's Resolutions for 2022

It's been a while since I've posted anything related to Health IT Standards here. I have three New Year's resolutions for 2022:

Don't schedule or accept any meetings on Friday afternoon.
Focus attention on Patient Administrative Burden (a phrase I read in a response to this recent ~~rant~~ thread from @HITPolicyWonk).
Do more Deep work and report on it here.

I've been doing a bit of reading (of the non-fiction philosophical variety) lately.

Algorithms to Live By by Brian Christian and Tom Griffiths
This is an awesome collection of essays about the application of computer algorithms to various aspects of life, and what they have to teach us about the tasks we don't always involve computers in. I think what I enjoyed most about this book was the creative links the authors made between algorithms (most of which I know well), and everyday tasks and processes.
Thinking, Fast and Slow by Daniel Kahneman
This book endeavors to explain how decisions get made and are influenced by the conscious and unconscious processes in our brains, and the various tendencies that are influenced by our intuitions, the biases that they introduce, and to some degree, methods by which one can eliminate some of those biases from our thinking.
The Most Human Human by Brian Christian
I picked this one up because I've been interested in Artificial Intelligence since my early college years, because the Turing test always amused me, and because I enjoyed an earlier book by the same author. There are some interesting musings here that again cross between computing and everyday life. Compared to Algorithms to Live By, this one wasn't quite AS interesting, but a fun read nonetheless.
Deep Work by Cal Newport
I'm still ploughing through this one, but after 3 chapters, I'd heartily recommend it. That's saying something because it's VERY hard to find a book I cannot finish over a weekend, but this one involves that kind of reading. It's deep work to understand deep work. I haven't been doing enough deep work lately, and it makes me sad. Hopefully, I'll learn enough to fix that.
Clean Code by Robert C. Martin
A classic, and one that I recently re-read, a decade or more later. I don't fully agree with everything said in this book because Robert misses a key point. There are concepts and nuance and a whole language associated with so many programming frameworks, and many of these frameworks are so large that even the best engineer can't keep all of it in their head. As part of a team that has a common approach, following the guidelines in this book is valuable. But what the book won't do for teams is enable a new person to join the team without prior experience in the framework and be able to understand the code as written. I recently read through several thousand lines of an application built by a very skilled engineer following the principles in this book, and because I know where to find the right stuff, was eventually able to understand it, but having never used one aspect of that framework in a production environment, found myself lost until I could go do some useful reading. A simple comment describing the design pattern or framework in use would have made the code much easier to understand. Do NOT underestimate the longevity of your code. All too many times in my life I've encountered code where the person who wrote it is no longer living, the person who took it over is retired, and basically, there's some grunt who can compile, build and occasionally fix bugs in the code, and the ideas, frameworks, thoughts and standards of two decades ago are completely unknown to anyone but the originators. As developers, we live on internet time. What is obvious to us today is forgotten 5 years from now because someone figured out a better way. Be kind to those who have forgotten everything that is in the front of your mind and obvious (to you). Still useful reading, but do so with a critical eye. Like any other concept that is at least a decade old, there's some new thinking that has replaced some of these ideas.

Saturday, November 27, 2021

On Air: Part Three

I'm finally finished with my On Air lamp prototype. The goal of this project was to build something that would indicate when I have the camera and mic rolling in my office without any intervention on my part (basically, all I need to do is start a meeting and it goes on, or end it and it goes off). I've managed that.

It took me quite some time to figure out how to get Windows 10 to tell me when the microphone is in use. The key is in the registry keys located in the registry hive for the current user, found in SOFTWARE\Microsoft\Windows\CurrentVersion\CapabilityAccessManager\ConsentStore\microphone.

In this hive there is, at the top level, a hive of keys for each Microsoft Store app, and beneath it, another hive of keys found in the NonPackaged hive.

If you inspect the registry with RegEdit, you will see that each hive under that key looks something like this:

The key (pun intended) value is LastUsedTimeStop. When this value is 0 for any application, that application is presently using the microphone. When it's done, that value will be a Windows timestamp (the number of 100-nanosecond intervals since 1/1/1601 UTC).
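To make the timestamp format concrete, here is a minimal sketch of the two checks involved: deciding whether any application has the microphone open, and converting a non-zero stop value to a readable time. It is not the OnAir code itself. The class name, method names, and the sample application keys are hypothetical, and it assumes the values have already been read from the registry into a map.

```java
import java.time.Instant;
import java.util.Map;

final class MicrophoneState {
    // Number of 100-nanosecond intervals between 1601-01-01 and 1970-01-01 (UTC).
    private static final long EPOCH_DIFFERENCE = 116_444_736_000_000_000L;

    /** Converts a Windows timestamp (100 ns ticks since 1601) to an Instant. */
    public static Instant toInstant(long windowsTimestamp) {
        long sinceUnixEpoch = windowsTimestamp - EPOCH_DIFFERENCE;
        long seconds = Math.floorDiv(sinceUnixEpoch, 10_000_000L);
        long nanos = Math.floorMod(sinceUnixEpoch, 10_000_000L) * 100L;
        return Instant.ofEpochSecond(seconds, nanos);
    }

    /** True if any application reports a stop value of 0, i.e. is still recording. */
    public static boolean microphoneInUse(Map<String, Long> stopValueByApp) {
        return stopValueByApp.values().stream().anyMatch(v -> v == 0L);
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for values read from the registry.
        Map<String, Long> sample = Map.of(
            "hypothetical.app.one", 132_800_000_000_000_000L,
            "hypothetical.app.two", 0L);
        System.out.println("In use: " + microphoneInUse(sample));
        System.out.println("Last stop: " + toInstant(132_800_000_000_000_000L));
    }
}
```

Keeping the decision logic separate from the registry access means it can be exercised on any platform, with only the part that reads the values being Windows-specific.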

I wrote a little Java program called OnAir that scans the registry, checks for the Microphone being in use, and sends the appropriate messages to the wireless controller to adjust the state of the lamp.

I tested it with almost all of my headsets (I have four in my office of various types), and it works with any of them, and with any application that uses a microphone.

To make this code work I needed to be able to access the registry from Java. There's a lovely little Java package that does this called JNA. The dependencies to include it are below.

<dependency>
    <groupId>net.java.dev.jna</groupId>
    <artifactId>jna</artifactId>
    <version>5.10.0</version>
</dependency>
<dependency>
    <groupId>net.java.dev.jna</groupId>
    <artifactId>jna-platform</artifactId>
    <version>5.10.0</version>
</dependency>

I suppose I could have used this as an opportunity to write in .NET, but realistically, I was just looking to get the lamp off my desk, and to do that, I needed to finish the prototype.

There's so much more I could do to finish this. The application could use a UI to indicate the state of the lamp, and to activate/deactivate it for certain applications, and indicate which application is using the microphone. The app itself should run as a service. The electronics could be cleaned up so that I don't need two USB cables, and the wireless component could be mounted inside the lamp. For now, it's good enough for what I wanted, so I'm calling it done, and declaring "Ship it!".

Thursday, October 14, 2021

Responding before reading @alissaknight's: Playing with FHIR: Hacking and Securing FHIR APIs

If you've been sitting under a rock, you missed the latest big explosion in the FHIR World, Alissa Knight's most recent report: Playing with FHIR: Hacking and Securing FHIR APIs. I've downloaded it, and looked at the fantastic graphics (e.g., image right), but I've not really read it yet. This is the kind of report that will require as deep a read I suspect as some Federal regulations. If I get the time (perhaps this weekend), I may even do a tweet-through-read-through.

I have some initial thoughts based on my reading of the tweets in this stream:

“(FHIR app ecosystem) contained “pervasive authorization vulnerabilities” that enabled Knight to access > 4M patient and clinician records with just a single patient login account” This is truly frightening. Would love your perspective @motorcycle_guy @amalec @aneeshchopra https://t.co/EOHKNvzQT0
— claudiawilliams (@claudiawilliams) October 13, 2021

Here's what I expect to find, noting that this is only my guess as to what is happening:

Various apps rely on a FHIR backend which will allow a replay attack to be performed whereby:

The attacker obtains access to the authorization token used by the FHIR API call to make other API calls on a different patient.

NOTE: There are a number of ways to obtain this authorization token depending on how the application is constructed and the level of access one has to developer tools, and application hardware and software. Assume that the hostile attacker is one in a million that has access to all of that, not the common high-school student (but don't count them out either, some of them are that good). Is it a Java or .NET app? There's a debugger for that, and I can almost assuredly find all of your assemblies or jar files on my device emulator, and debug code running in a device. Did you obfuscate your code? If not, I can decompile it, there are tools for that too, and even obfuscation is not enough when you consider that platform calls are still going to be obvious, and all the important ones are well known, so I can work back up the stack to the code I need to manually investigate.
The attacker constructs new API calls to make requests using the same authorization token.
The call succeeds because the only check that is performed on the authorization token by the back end server is that it is a valid token issued by the appropriate authorizer.

The fix is not simple. Basically, the back end developer needs to bind the access control restrictions associated with the original access request to obtain the authorization token to the subset of data that it authorizes access to (this is the easy part), and enforce them on every request (this is the hard part), incoming and outgoing (unless you can prove the outgoing data will match the authorization based on how the query works). JWT tokens provide more than adequate capability to bind exactly those details in an easily accessible way. Essentially, the claims in the JWT should indicate what is allowed to be accessed, when, how, and by whom. Sadly for my first FHIR Server, JWT was still a work in progress, but I simply encoded an index in the token which pointed to memorialized claims in a database, constructed when the token was first created.
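The "index in the token" approach can be sketched roughly as follows. This is a minimal illustration, not my original server code: an in-memory map stands in for the database, and the class and field names (ClaimsStore, Claims, patientIds) are hypothetical. The token itself carries no meaning; it only points at claims recorded when it was issued.

```java
import java.security.SecureRandom;
import java.time.Instant;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ClaimsStore {
    /** What a token authorizes. Field names are illustrative assumptions. */
    public static final class Claims {
        public final String user;
        public final Set<String> patientIds;
        public final Instant expires;

        public Claims(String user, Set<String> patientIds, Instant expires) {
            this.user = user;
            this.patientIds = patientIds;
            this.expires = expires;
        }
    }

    // Stand-in for the database table of memorialized claims.
    private final Map<String, Claims> claimsByToken = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Records the claims at issue time and returns an opaque token that indexes them. */
    public String issue(Claims claims) {
        byte[] bytes = new byte[32];
        random.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        claimsByToken.put(token, claims);
        return token;
    }

    /** Returns the recorded claims, or empty if the token is unknown or expired. */
    public Optional<Claims> lookup(String token, Instant now) {
        if (token == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(claimsByToken.get(token))
            .filter(c -> now.isBefore(c.expires));
    }
}
```

Every incoming request would call lookup() and then compare the request against the returned claims, rather than stopping once the token is known to be valid.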

Once you can do that much, fine-grained access control is actually quite simple. That was originally considered to be out of scope in my first effort for reasons of cost and complexity, but because we had to build the right security infrastructure even to support coarse-grained access control, fine-grained control simply became just a little bit harder, but also a worthwhile differentiator from other offerings.

As I said earlier, my guess as to what is wrong is only supposition, an experienced, wild-assed guess at what I will find upon reading. But if I can guess it before reading, then arguably, so can someone else as smart as I with barely any incentive. Consider also that intelligence and quality of character are not necessarily correlated, nor do they even need to exist in the same body for a vulnerability such as I described above to be readily and easily exploited by someone with enough motive, and knowledge about how to use tools that people smarter than them wrote.

In the world of the great world wide web, FHIR API servers, Java, .NET or other platform, the basic concepts of servlets, interceptors, filters, hooks, cut points or similar concepts by other names in non-Java platforms all exist and are able to perform these checks and validations both before and after the call completes, so that you can:

Verify that what is being asked for is allowed to be asked for (never assume that the querent of your back end can only be your application).
Verify that the data being returned also matches what the authorization allows the end user to see, and either filter it, or simply reject the request (after having performed some work that you wish you hadn't).
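The two checks in the list above might look something like this when stripped of any particular framework. This is a sketch under stated assumptions: the Resource class, the method names, and the idea of representing the authorization as a set of patient ids are all hypothetical simplifications, and a real interceptor would pull these from the request and the token's claims.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class AccessChecks {
    /** A returned record, reduced to the fields that matter for this check. */
    public static final class Resource {
        public final String id;
        public final String patientId;

        public Resource(String id, String patientId) {
            this.id = id;
            this.patientId = patientId;
        }
    }

    /** Pre-call check: is the requested patient covered by the token's claims? */
    public static boolean requestAllowed(Set<String> authorizedPatientIds, String requestedPatientId) {
        return requestedPatientId != null && authorizedPatientIds.contains(requestedPatientId);
    }

    /** Post-call check, filtering variant: drop anything outside the authorization. */
    public static List<Resource> filterResponse(Set<String> authorizedPatientIds, List<Resource> results) {
        return results.stream()
            .filter(r -> r.patientId != null && authorizedPatientIds.contains(r.patientId))
            .collect(Collectors.toList());
    }

    /** Post-call check, strict variant: true only if every result is within bounds. */
    public static boolean responseAllowed(Set<String> authorizedPatientIds, List<Resource> results) {
        return results.stream()
            .allMatch(r -> r.patientId != null && authorizedPatientIds.contains(r.patientId));
    }
}
```

Whether to filter or reject is a policy choice. Rejecting is simpler to reason about; filtering is friendlier to clients but can hide the fact that a query was broader than it should have been.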

I led teams building (writing some of the code myself) commercial FHIR servers on top of readily available open source frameworks a number of times in my life (most are still in use), and I've also implemented other API interfaces pre-FHIR. When I built in security controls, I bound the security tokens to the specifically authorized purpose and sources, and I didn't trust that token to be used only for that purpose, I made sure it was verified every time.

I have to tell you that was also a moderate pain point, because it cost about 10% of my processing time budget. I could however, justify it by citing the specific regulatory text regarding fines for breach, and that I think made a huge difference in my ability to argue for that kind of security (this was perhaps, the hardest part).

I expect to be reading a wake-up call; we'll see how my prediction turns out. Remember too, APIs are still a work in progress.

Security is hard.
Good Security is harder.
Great Security is an ongoing work in progress.
Perfect security does not exist.

On Air: Part Two

Since my initial foray into developing my own little IoT device that will be controlled by my headset that started in On Air: Part 1, I've managed to accomplish three things:

Discovered that the battery drain from the light is too much for as often as I'm on a headset so that batteries would bankrupt me (three batteries lasted two days). Fortunately, there's a USB C power adapter for this fixture.
Rewired the power connection so that it was driven by the power to the on-off switch, rather than power supply from the battery.
Written a couple of node.js scripts to turn the beasty on and off from the command line.

In part 3 or 4 I'll open this thing up to show you my (crappy) wiring.

Step 3 was not nearly as hard as I expected, and that's because everything I need to control the device is right here.

The two scripts are dead simple. Here's the one to turn it on. The one to turn it off is left as an exercise for the reader.

const ewelink = require('ewelink-api');

/* instantiate class */
const connection = new ewelink({
  email: '█████████@██████',
  password: '███████████',
  region: 'us',
});

async function f1() {
  /* turn on/off device */
  const device = await connection.setDevicePowerState('██████████', 'on');
  console.log(device);
}

f1();

First, create a connection object, and give it your username (which is your e-mail address), your password, and your region. Mine happens to be 'us'.

Secondly, create an async function that will turn the device on and report its status. You'll need the device id. Fortunately, it is written on a white sticky attached to the controller chip of the device, so get your phone or a magnifying glass to read it, and enter it (in quotes) as the first argument to the call to setDevicePowerState().

Finally, call that function.

To run the script, just enter:

node turnOn.js

at the command line, and voilà, the light goes on. Run

node turnOff.js

and it turns off.

Now, to detect whether my headset is in use, and call the right script. This will also be remarkably easy. But the devil is in the details.

I suppose I could simplify this script so that it took a command line argument, but again, that's an exercise for the reader.

Tuesday, September 28, 2021

On Air: Part one

My home office has a wide open arch entryway right off the main hallway between the living room and kitchen, and is also the route from kitchen to the upstairs master bathroom which all the people in my house prefer to use for their showers. My family understands that when they see me with a headset on that I may very likely be on a zoom/webex/teams/pick-your-favorite-teleconferencing-app call. I have two cameras in my office that I might use for these calls, one with a face view, and one with a side view (the laptop camera) due to where my laptop sits. The side view points to the arch, and I rarely use it. Sadly, if the front-facing camera gets disconnected, that one is always available and becomes the default camera.

Imagine not being able to, in your own home, wander from your bedroom to the bathroom wrapped in a towel without having to worry about being on camera in the background. Not ideal. I've put a sticky over that camera for now to address this challenge. But since I'm on calls so much, I often wear my headset most of the day (it also helps notify me of incoming requests for attention). And so, to avoid the inevitable question "Are you on a call?", I thought it might be a good idea to get myself a wirelessly controlled, battery operated on-air light, and then have it be controlled by detecting use of a headset with my computer.

Yeah, this is NOT a commercially available solution, but all the pieces are out there. You can get an On Air light, a 5V DC operated WiFi switch, and a battery pack to create what I call "On Air: Part one". The parts below (or similar) are what you need.

Take the back cover off the light, and cut the red wire going from the battery case to the on-off switch. Route two wires so that they can get out the back (I simply cut a hole in the back of the light), and connect to the newly cut power feed. Connect the wire from the battery case itself to the center (common) position of the relay using the new wire (routed through the back), and connect the second wire to the on-off switch to the bottom terminal of the relay. Insulate the newly made connections (I used heat shrink tubing, but electrical or other tape will work). Connect the new battery pack to the WiFi switch. Install batteries in both battery packs.

Download the App that works with the switch you purchased and follow the directions to connect the relay to the app. You MAY need to reconfigure your router to separate your 2.4 GHz network from your 5 GHz network to make everything work (I did). Test the switch in the manufacturer's selected App.

Tell Google Home about the new switch you installed.

OK, at this point, you've now got a remote controlled On Air lamp. In part two, I will show you how to control that from your PC, and in Part three, I'll explain how you can detect use of a headset on your laptop. I have three different headsets I can use, and one works with both my phone and my laptop, and the other with my iPad and my laptop, and the third with any device I plug it into. The software solution I put together should work when any of these devices is used for communication. I'll leave it to you to guess how I make that work (when I get there).

Friday, August 20, 2021

Stratifying Race and Ethnicity for SANER

Variation is the bane of standards. Eliminating needless variation is part of my job. Doing it in a way that doesn't increase provider (or developer) burden is an indication that it's been done right.

I've looked at a lot of state and national dashboards while working on the SANER Project, and one thing I notice is the variation in reporting for data with respect to race and ethnicity classifications (strata). Often, when reported on publicly, these two different categories are combined into smaller sets, with groupings like multiple race, other and unknown.

ONC National Coordinator Micky Tripathi noted Health IT reporting variation for this kind of data in his keynote delivered at a recent Strategic Health Information Exchange Collaborative (SHIEC) conference.

Federal reporting uses separate fields for race and ethnicity, and allows for multiple values to be reported for race. There are 5 possible values for race (not counting various flavors of unknown), and two values for ethnicity according to OMB reporting requirements.

Reporting multiple races means that there are several ways to report none (flavors of null including unknown, refused to answer and did not ask), 5 ways to report one race, 10 ways to report two races, 10 ways to report three races, 5 ways to report four, and 1 way to report all five, resulting in around 33 categories.

Combining that with the various ways to report ethnicity (again with flavors of none), that results in about 165 possible reporting categories. Looking at the actual statistics, there are about 50 categories that would generally be needed for a given facility (e.g., frequency > a few tenths of a percent) to stratify populations according to race and ethnicity, if non-existing groupings are not reported on, and perhaps an even smaller number for smaller facilities. It wouldn't be possible for example, for a 100 bed hospital to even use all of the category combinations.
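The arithmetic behind these counts can be checked with a few lines of code. This is only a sketch of the counting: the class and method names are hypothetical, and the number of null flavors added for race (two) and the number of ethnicity options (two values plus three flavors of none) are assumptions chosen to reproduce the approximate figures of 33 and 165 quoted above.

```java
final class StrataCount {
    /** Binomial coefficient C(n, k), computed iteratively with exact integer steps. */
    public static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    /** Number of ways to report one or more of n possible race values. */
    public static long nonEmptyCombinations(int n) {
        long total = 0;
        for (int k = 1; k <= n; k++) {
            total += choose(n, k);
        }
        return total;
    }

    public static void main(String[] args) {
        long raceCombinations = nonEmptyCombinations(5);   // 5 + 10 + 10 + 5 + 1
        long raceCategories = raceCombinations + 2;        // assumption: two null flavors
        long ethnicityOptions = 2 + 3;                     // assumption: three flavors of none
        System.out.println("Race combinations: " + raceCombinations);
        System.out.println("Race categories: " + raceCategories);
        System.out.println("Race x ethnicity: " + raceCategories * ethnicityOptions);
    }
}
```

The exact totals shift by a few depending on how many flavors of null a given reporting specification recognizes, which is why the figures in the text are approximate.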

The data is generally rolled up into a much smaller number of reporting categories which vary between states, and these often also vary with how federal dashboards report the same data. Different states have different racial and ethnic makeups and the public reporting of race and ethnicity data at these levels is designed to address potential disparities relevant to the state.

Given that many state departments of health also support reporting to federal agencies, how does one normalize reporting without having to have 51 separate specifications for reporting?

The best way to handle this is to stratify by the combination of race and ethnicity, and report all possible existing combinations. In other words, don't report 0 values for combinations that don't exist, as that can be inferred from the data. This enables states to roll up this data into a smaller set of categories for their public reporting, yet retain the data needed for federal reporting, and enable federal reporting to roll up differently. When automatically computed, this level of stratification does not introduce a reporting burden on the reporting providers.
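A rough sketch of what that looks like in code: count patients per combined race and ethnicity stratum, let absent combinations simply not appear, and let each consumer roll the detailed counts up however it likes. The class names, the key format, and the codes used in the usage note (R1, R2, E1, E2) are all hypothetical placeholders, not anything defined by SANER.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

final class Stratifier {
    /** A patient reduced to the two fields being stratified on. */
    public static final class Patient {
        public final Set<String> races;
        public final String ethnicity;

        public Patient(Set<String> races, String ethnicity) {
            this.races = races;
            this.ethnicity = ethnicity;
        }
    }

    /** Stratum key: sorted race codes joined with '+', then '|' and the ethnicity code. */
    public static String stratum(Patient p) {
        return String.join("+", new TreeSet<>(p.races)) + "|" + p.ethnicity;
    }

    /** Counts patients per stratum; combinations with no patients never appear. */
    public static Map<String, Integer> stratify(List<Patient> patients) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Patient p : patients) {
            counts.merge(stratum(p), 1, Integer::sum);
        }
        return counts;
    }

    /** Rolls detailed strata up into coarser groups chosen by the consumer. */
    public static Map<String, Integer> rollUp(
            Map<String, Integer> detailed, Function<String, String> grouping) {
        Map<String, Integer> rolled = new TreeMap<>();
        detailed.forEach((key, count) -> rolled.merge(grouping.apply(key), count, Integer::sum));
        return rolled;
    }
}
```

A state dashboard might call rollUp(counts, k -> k.contains("+") ? "Multiple" : k) to collapse every multi-race stratum into one bucket, while a federal consumer applies a different grouping function to the same detailed counts.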

Tuesday, August 3, 2021

Thinking Ahead

A very long time ago, when I worked at Florida State University, I had two rules for programming written on my board:

Just get it to work.
If it works, don't mess with it.

It became a test of the quality of people who would read it, as the ones who suffered from some sort of reaction while interpreting the ramifications were definitely the people that I wanted to have around me.

If you just get it to work, and don't mess with it, you have something that has the very least effort put into it. If you need to change it in the future, good luck with that.

Along the same lines, when I was growing up, a kid who lived on my street had a small car, I think it was a Dodge of some sort, and he wanted to beef it up. So he took out the engine and tranny, and replaced it with a MUCH bigger one. Somehow he managed to make it all fit together after modifying the drive shaft, but he forgot one very important thing: Engine torque. When he finally started the car after spending the better part of a year on this project, he wound up warping the frame, something like the picture to the right. The car was a total loss. The point of this story is that you can only do so much with a limited infrastructure.

One of the challenges with FHIR adoption for some use cases is that there are existing HL7 interfaces, labs, ADT feeds, immunizations, et cetera, that are already widely deployed, adopted, and working, in fact, working well. There's little desire to replace these interfaces with FHIR based interfaces because:

What we have is good enough for what we are doing.
It's working right now.

But as we keep pushing the interoperability needle higher and higher, eventually we will have to replace these interfaces. When should we do that?

When what we have isn't good enough for what we want to do.
Or we can't make what we want to do easily work with what we have right now.

The HL7 V2 to FHIR project is an example of what happens when interfaces get stuck in these situations: we cannot easily connect them to newer infrastructure so that we can do more with them, so we build things that enable us to convert from one to the other. The very existence of the project demonstrates that there's more that we want to be able to do with the data present in HL7 Version 2 messages. This might include things like:

Aggregating data from multiple sources
Providing more sophisticated searching capabilities
Enabling data subscriptions

There's a lot of effort and cost associated with replacing something that works with something else, and it's hard to justify that when the thing that's working is in fact still working. But, if there was a way to upgrade, replacing your scooter with a Corvette (and you can justify the need for a Corvette), then it might in fact be worthwhile.

When interface standards are mandated by regulatory policy, it's pretty difficult to upgrade. Consider what happened with X12 5010 standards, or the whole discussion around CCDA 1.1 and CCDA 2.1 backwards compatibility. It's even more difficult when it all has to happen in a very short time frame. We need to consider how to have policy enable these kinds of shifts, over REASONABLE time frames. Two years is not enough time to roll out a new standard without severely impacting an industry's capacity to do anything else but roll that out. We know that from experience (or at least I hope we do).

But what would the next generation ADT, lab, immunization or other standards look like? And what would they enable us to do that the current ones don't? It's time to start thinking about that.

Monday, August 2, 2021

YAML as a FHIR Format

YAML (short for "YAML Ain't Markup Language" but I simply prefer "Yet Another Markup Language") is a file format that even further simplifies writing structured data files. A while back I struggled with writing measures for the SANER Project because both XML and JSON formats have some minor issues that make it hard to hand code the expressions.

XML sucks because, well, XML sucks. Whitespace is really valuable for formatting code, but XML just wants it gone.

JSON isn't much better because you have to escape newlines and once again, cannot see your code structure for expressions.

I dinked around a bit with YAML input and output, and now because I'm creating a new measure, wanted to get it working properly, which I have now done, at least in so far as my YAMLParser now correctly round trips from JSON to YAML and back to JSON.

The key to making this work is using Jackson to convert between JSON and YAML, and configuring the YAML quotes right so that strings that look like numbers (e.g., "001") don't get treated incorrectly as numbers when converting between the two.

The method newYAMLMapper() creates a correctly configured Jackson YAMLMapper.

public static YAMLMapper newYAMLMapper() {
    YAMLMapper m = new YAMLMapper();
    return
        m.enable(YAMLGenerator.Feature.LITERAL_BLOCK_STYLE)
         .disable(YAMLGenerator.Feature.MINIMIZE_QUOTES)
         .disable(YAMLGenerator.Feature.USE_PLATFORM_LINE_BREAKS)
         .disable(YAMLGenerator.Feature.SPLIT_LINES);
}

Methods for converting between YAML and JSON are fairly simple:

public static String fromYaml(String yaml)
    throws JsonMappingException, JsonProcessingException
{
    ObjectMapper yamlReader = newYAMLMapper();
    Object obj = yamlReader.readValue(yaml, Object.class);
    ObjectMapper jsonWriter = new ObjectMapper();
    return jsonWriter.writeValueAsString(obj);
}

public static String toYaml(String jsonString) throws IOException {
    // parse JSON
    JsonNode jsonNodeTree = new ObjectMapper().readTree(jsonString);
    // save it as YAML
    String jsonAsYaml = newYAMLMapper().writeValueAsString(jsonNodeTree);
    return jsonAsYaml;
}

Converting from streams and readers works similarly.

The YamlParser class implements IParser. It contains an embedded jsonParser to convert resources back and forth between Java classes and JSON formats, and then uses the toYaml and fromYaml methods in the encodeResourceToString and parseResource methods to read/write in YAML format. It's NOT the most efficient way to read/write YAML to FHIR, but it works (correctly as best I can tell).

public static class YamlParser implements IParser {
    private final IParser jsonParser;

    YamlParser(FhirContext context) {
        jsonParser = context.newJsonParser();
    }

    @Override
    public String encodeResourceToString(IBaseResource theResource)
        throws DataFormatException
    {
        try {
            return toYaml(jsonParser.encodeResourceToString(theResource));
        } catch (IOException e) {
            throw new DataFormatException("Error Converting to YAML", e);
        }
    }

    @Override
    public <T extends IBaseResource> T
        parseResource(Class<T> theResourceType, InputStream theInputStream)
        throws DataFormatException {
        try {
            return jsonParser.parseResource(
                theResourceType, fromYaml(theInputStream));
        } catch (IOException e) {
            throw new DataFormatException("Error Converting from YAML", e);
        }
    }

    ...
}

All of the setter/getter methods on YamlParser delegate the work to the embedded JsonParser, as shown in the examples below.

@Override
public void setEncodeElementsAppliesToChildResourcesOnly(
    boolean theEncodeElementsAppliesToChildResourcesOnly) {
    jsonParser.setEncodeElementsAppliesToChildResourcesOnly(
        theEncodeElementsAppliesToChildResourcesOnly);
}

@Override
public boolean isEncodeElementsAppliesToChildResourcesOnly() {
    return jsonParser.isEncodeElementsAppliesToChildResourcesOnly();
}

A full blown implementation can be found at YAML Utilities for FHIR.

Monday, June 28, 2021

Help Adrian Get Treatment

I'm asking this at the request of my eldest adult child, going by Aeowolfe these days. Some of you may have met him a few years back or seen him help me out in some videos for Health IT in the past.

My eldest child's partner has been suffering from chronic pain since mid-2017. The onset of the disease occurred while they were at college, where my eldest met them. Since then, they lost the support of their family, and have subsequently been diagnosed with fibromyalgia, hypermobility disorder (similar to Ehlers-Danlos syndrome, only missing direct family member diagnosis to match the clinical criteria), and endometriosis. Our family has been providing support in various ways, including helping them get signed up for Mass Health and disability through Social Security (which took over 3 years to get to).

They've found a treatment plan they think will help (ketamine-lidocaine infusions), but they cannot afford it at their present income, and insurance doesn't yet cover this treatment because it is new (there have been several clinical trials with positive results). So my eldest has started a gofundme for Adrian. If you are able and willing to contribute you will have both my thanks, and that of my eldest.

Wednesday, June 16, 2021

Creating your FHIR Artifact JIRA Specification

When you go to ballot, or any form of publication for a FHIR IG through HL7, you have to provide an XML document that defines the pages and artifacts that the HL7 Jira system will use for reporting issues or comments on the specification.

If you are smart, you create an initial template in the HL7/JIRA-spec-artifacts page when you start your project. If you aren't, you wait a while and then create it.

The initial version should look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<specification
    gitUrl="https://github.com/HL7/fhir-project-mhealth"
    url="https://hl7.org/fhir/uv/mhealth-framework"
    ciUrl="http://build.fhir.org/ig/HL7/fhir-project-mhealth"
    defaultWorkgroup="mh" defaultVersion="0.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="../schemas/specification.xsd">
    <version code="0.1"/>
    <artifactPageExtension value="-definitions"/>
    <artifactPageExtension value="-examples"/>
    <artifactPageExtension value="-mappings"/>
    <page name="(NA)" key="NA"/>
    <page name="(many)" key="many"/>
    <page name="(profiles)" key="profiles"/>
</specification>

But you will need to add lines for each artifact (Profile, ValueSet, et cetera) and page (.md or .html file) that you generate.

This can be awfully tedious, but the IG Publisher can create an updated one for you. If you haven't already created one in JIRA-Spec-Artifacts, though, for some reason it doesn't seem to create an initial one for you. I'm not clear on why it doesn't, but I found a workaround.

What you do is create that initial version, and copy it to your templates folder, naming it jiraspec.xml. Then you run the IG Builder without a vocabulary server.

C:\myproject> JAVA -jar "..\%publisher_jar%" -ig ig.ini -tx n/a

Telling the IG Publisher that you don't have a vocabulary server makes it assume that you do NOT have an internet connection, so it also doesn't try to get your templates or copy the current JIRA Spec over that file I had you create above. Now when you run the IG Publisher, it will create an initial JIRA spec file for you, which you can then submit as a pull request to https://github.com/HL7/JIRA-Spec-Artifacts/xml.

Once you've finished, you can find the created specification in your project in template/jira.xml, which you can then rename appropriately and send a pull request to the JIRA-spec-artifacts page.

Ideally, this specification configuration could be more automated in a FHIR IG Build, but for now what we have works. It's just a bit of a pain.

Wednesday, June 9, 2021

The Interop needle in 2030

ONC has been asking about Health Interoperability Outcomes for 2030. Some sample statements they'd like the answers to include:

“Because of interoperability, ______ before/by 2030.”
“Because of interoperability, before/by 2030 [who] will [what].”

In Why We'll Never Have Interoperability, I note that the goal posts are always moving, bars are ever being raised, the needle just keeps going around and around. So I thought it would be interesting to look at this from the viewpoint of "what will we be concerned about" in 2030 as the next steps, rather than focusing on what we've accomplished.

Given all of that, these are the problems I think we will still be trying to solve:

Making challenging content understandable to the average patient.
Making standards of care (e.g., care guidelines) measurable and computable.
Understanding the actual cost of care.
Crossing domain boundaries (care, payment, social services, public health, emergency medical services, research). Each of these domains is still widely separated with respect to standards.
Few of these domains have progressed as far with APIs as simple as FHIR.
Handling variations in dialects of FHIR as supported by different vendors.

And these are the problems that I think we will have made headway on:

FHIR will have become ubiquitous in hospital and ambulatory practice interfaces between healthcare systems and devices. It will become available in not just EHR systems, but also departmental, laboratory, medical devices, revenue cycle, and to some degree, imaging systems (although DICOM will retain significant dominance).
Patient facing APIs will be ubiquitous.
Visit scheduling, and much of the pre-visit "paper-work" will be done via the web for most patients.
FHIR will have crossed into the payer space, and HIPAA transaction standards, invading some of the territory previously owned by X12 and NCPDP.
We will start to see FHIR transition into other healthcare related domains (e.g., EMS reporting, social services), but adoption will be limited.

Monday, May 24, 2021

A Source Code Profiler for FHIRPath

I wrote my first source code profiler in 1992, mostly as an exercise to play with but also because back then I didn't have the money for a professional one, and wrote about it in Dr. Dobb's Journal. That was a sampling profiler, and used the timer interrupt on a PC to determine where in the code the application was running. Since then, I've had to write my own profiler several times professionally for different reasons and purposes. Once I wrote some code to instrument some BASIC code and profile it. That code was running a web site for a major hardware vendor's web store (having been flown onsite to address a critical sporadic outage). While the profiler didn't show what was causing the outage, it did demonstrate that the application our company had developed wasn't the source of the problem, and I later DID determine what the cause was and we fixed it. I was given my first handheld computing device as a reward for that save. And I've been called to develop profiling tools for other custom languages and application implementations as well.

The focus of this week's FHIR Connectathon for SANER was performance and scaling. We finished up at noon yesterday, but I was already wired to the rim with caffeine, so I looked for something else interesting to do. As a result, I instrumented my version of the FHIRPath engine (it's basically a copy of the HAPI engine with some tweaks for use as a SANER Measure Computer) to profile the Measure code.

There are three places in SANER where a FHIRPath is evaluated to compute a measure:

Measure.group.population which produces the counts for the population.
Measure.group.stratifier which produces the strata for each element in the population.
Measure.supplementalData which specifies how to collect supplemental data for each element in a population.

Each of these may have an associated FHIRPath expression, and of course, I want to know how the expression performs.

The FHIRPath engine produces a parse tree of ExpressionNode items. The tree is walked by FHIRPathEngine.evaluate functions. What I did was simply add some code to trace the execution times in those functions in my FHIRPath engine code.

The core class and methods to support tracing in my engine are below. Note, this is hacked Proof-of-Concept code I wrote in about an hour. Documentation and everything else is coming.

// Inner class of the engine. traces, top, TRACE and dummy are members of the
// enclosing engine class (their declarations are not shown here).
class Trace {
    String expr;
    long start, end, children;
    ExpressionNode node;
    int parent;
    int index;

    // Root record for a whole expression, identified by its text.
    private Trace(String expr, int index) {
        this.expr = expr;
        this.index = index;
        this.start = System.nanoTime();
        this.parent = -1;
    }

    // Record for one parse-tree node; inherits the expression text from its parent.
    private Trace(int parent, int index, ExpressionNode node) {
        this.parent = parent;
        this.expr = (parent == -1 ? null : traces.get(parent).expr);
        this.index = index;
        this.node = node;
        this.start = System.nanoTime();
    }

    // Stamp the end time, charge the elapsed time to the parent, and pop back to it.
    public void done() {
        this.end = System.nanoTime();
        long childTime = end - start;
        top = this.parent;
        if (top != -1) {
            traces.get(top).children += childTime;
        }
    }

    public void done(ExpressionNode node) {
        this.node = node;
        done();
    }

    // One CSV row; Time is end - start minus the time charged by children.
    public String toString() {
        StringBuilder b = new StringBuilder();
        if (parent == -1) {
            // Root records are preceded by a line carrying the full expression text.
            b.append(",,,,,,,\"").append(expr == null ? " " : expr.replace("\"", "\"\"")).append("\"\n");
        }
        // Guard against records that have no expression text or node yet.
        b.append("@").append(Integer.toHexString(expr == null ? 0 : expr.hashCode())).append(",").append(index).append(",")
            .append(parent).append(",").append(start).append(",").append(end).append(",")
            .append(end - start - children).append(",\"").append(String.valueOf(node).replace("\"", "\"\""))
            .append("\"");
        return b.toString();
    }

    public void setExpression(String expression) {
        this.expr = expression;
    }
}

// Write all collected records to a CSV file, then reset the collection.
public void dump(File f, boolean append) throws IOException {
    try (FileWriter w = new FileWriter(f, StandardCharsets.UTF_8, append); PrintWriter pw = new PrintWriter(w);) {
        pw.println("Node,Expr,Index,Parent,Start,End,Time,Expression");
        for (Trace t : traces) {
            pw.println(t);
        }
        traces.clear();
    }
}

// Start a root record for an expression given as text.
private Trace trace(String path) {
    if (!TRACE) {
        return dummy;
    }
    Trace trace = new Trace(path, traces.size());
    traces.add(trace);
    top = traces.size() - 1;
    return trace;
}

// Start a record for a parse-tree node, parented to the current top of the stack.
private Trace trace(ExpressionNode node) {
    if (!TRACE) {
        return dummy;
    }
    Trace trace = new Trace(top, traces.size(), node);
    traces.add(trace);
    top = traces.size() - 1;
    return trace;
}

Sprinkled in key places in the functions that actually evaluate expressions are the profiling statements: the trace(expressionNode) call at the top and the trace.done() call in the finally block of the code below:

public List<Base> evaluate(Base base, ExpressionNode expressionNode) throws FHIRException {
    Trace trace = trace(expressionNode);   // start timing this node
    try {
        List<Base> list = new ArrayList<Base>();
        if (base != null)
            list.add(base);
        log = new StringBuilder();
        return execute(new ExecutionContext(null, base != null && base.isResource() ? base : null,
            base != null && base.isResource() ? base : null, base, null, base),
            list, expressionNode, true);
    } finally {
        trace.done();                      // always record the end time, even on exceptions
    }
}

What goes on is that the call to trace(expressionNode) pushes a new timestamped execution record onto a stack of records; the new record also points to the record of the calling method [often at the top of the stack]. Then, when the function is finished, trace.done() simply adds the ending time stamp, adds the time of this call to an accumulator for child times in the parent record, and returns the stack pointer to the parent's record.
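The bookkeeping described above can be sketched in a few self-contained lines. This is only an illustration under stated assumptions: the class name TraceStackSketch, the begin method, and the injected clock are made up for the example and are not taken from the engine code; the clock is injected only so the result is repeatable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Minimal stand-in for the trace stack; all names here are illustrative. */
class TraceStackSketch {
    static final class Record {
        final String label;
        final int parent;   // index of the caller's record, or -1 at the root
        final long start;
        long end;
        long children;      // total elapsed time of direct children

        Record(String label, int parent, long start) {
            this.label = label;
            this.parent = parent;
            this.start = start;
        }

        /** Time spent in this record itself, excluding its children. */
        long selfTime() {
            return end - start - children;
        }
    }

    private final List<Record> records = new ArrayList<>();
    private final LongSupplier clock;
    private int top = -1;   // index of the record currently executing

    TraceStackSketch(LongSupplier clock) {
        this.clock = clock;
    }

    /** Push a record whose parent is the current top of the stack. */
    int begin(String label) {
        records.add(new Record(label, top, clock.getAsLong()));
        top = records.size() - 1;
        return top;
    }

    /** Stamp the end time, charge the elapsed time to the parent, and pop. */
    void done(int index) {
        Record r = records.get(index);
        r.end = clock.getAsLong();
        if (r.parent != -1) {
            records.get(r.parent).children += r.end - r.start;
        }
        top = r.parent;
    }

    Record get(int index) {
        return records.get(index);
    }

    public static void main(String[] args) {
        long[] ticks = {0, 10, 40, 100};   // fake clock readings
        int[] next = {0};
        TraceStackSketch t = new TraceStackSketch(() -> ticks[next[0]++]);
        int outer = t.begin("outer");      // starts at 0
        int inner = t.begin("inner");      // starts at 10
        t.done(inner);                     // ends at 40, so 30 is charged to outer
        t.done(outer);                     // ends at 100
        System.out.println(t.get(outer).selfTime() + " " + t.get(inner).selfTime()); // prints "70 30"
    }
}
```

In real use the clock would be System::nanoTime, as in the engine code.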

When a user of the engine calls the dump() method with a file, the execution data is dumped to that file in CSV format. Records look something like this:

Expr       Index  Parent  Start            End              Time     Expression
@63efb167  1      -1      611441841022000  611451511487500  4711600  findAll('Encounter', ...
@791ea926  4      3       611441845767800  611441845782600  13400    'Encounter'
@791ea926  5      4       611441845771000  611441845772400  1400     'Encounter'
@6344cac3  6      3       611441845783900  611441847333500  12700    including('subject', 'diagnosis', ...
@6344cac3  7      6       611441845785400  611441847322300  1533300  including('subject', 'diagnosis', ...
@27688320  8      7       611441845786100  611441845787600  1200     'subject'
@27688320  9      8       611441845786600  611441845786900  300      'subject'
@39126a23  10     7       611441845788000  611441845789200  1000     'diagnosis'
@39126a23  11     10      611441845788600  611441845788800  200      'diagnosis'
@35c9a3be  12     7       611441845789500  611441845790400  700      'reasonReference'
@35c9a3be  13     12      611441845789900  611441845790100  200      'reasonReference'

The Expr column identifies the unique expression from the parse tree being evaluated. The Index and Parent columns describe the evaluation tree (which may evaluate the same expression multiple times with different inputs). The Start and End columns are timestamps (in nanoseconds from an arbitrary base). Time is the elapsed time (again in nanoseconds) spent in that node itself, that is, end minus start minus the time charged by its children. The Expression column gives the full text of the evaluated expression.

You can take this data into a spreadsheet, create a pivot table report, and come up with useful reports on those parts of the expression taking the longest, as shown below.

Expression                         Time (ms)  Executions
onServers(%Base).exists()          9145.7723  282
onServers(%Base).where(...         405.8266   2
$total | $this                     47.0435    94
%ReportingPeriod.start - 1 'year'  10.4663    423

I normalized times to milliseconds so that the result would be more meaningful. This shows that the retrieval of primary data takes about 9.1 seconds, and a secondary retrieval about 0.4 seconds, while everything else is 50ms or less.
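If you would rather not use a spreadsheet, the same roll-up (total time and number of executions per expression) is easy to compute in code. The following is a minimal sketch, assuming the dump rows have already been parsed into (expression, nanoseconds) pairs; the ProfileSummary and Row names are hypothetical and CSV parsing is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Rolls up profiler rows by expression text, like the pivot table does. */
class ProfileSummary {
    /** One data row from the dump: the expression text and its Time value in nanoseconds. */
    static final class Row {
        final String expression;
        final long nanos;

        Row(String expression, long nanos) {
            this.expression = expression;
            this.nanos = nanos;
        }
    }

    /** Totals per expression: [0] is the summed nanoseconds, [1] the number of executions. */
    static Map<String, long[]> summarize(List<Row> rows) {
        Map<String, long[]> totals = new LinkedHashMap<>();
        for (Row r : rows) {
            long[] t = totals.computeIfAbsent(r.expression, k -> new long[2]);
            t[0] += r.nanos;
            t[1]++;
        }
        return totals;
    }

    public static void main(String[] args) {
        // Values borrowed from the sample records above.
        List<Row> rows = List.of(
            new Row("'subject'", 1200),
            new Row("'subject'", 300),
            new Row("'diagnosis'", 1000));
        // Convert nanoseconds to milliseconds when reporting.
        summarize(rows).forEach((expr, t) ->
            System.out.printf("%s %.4f ms %d executions%n", expr, t[0] / 1_000_000.0, t[1]));
    }
}
```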

The profile also demonstrates some opportunities for optimization in the FHIRPath engine. For example, the 10ms taken on the multiple evaluations of the common sub-expression:

%ReportingPeriod.start - 1 'year'

This is likely more costly than it needs to be. Since the FHIRPath engine today does nothing with common subexpression elimination, the profile shows the cost associated with re-evaluating these in the parse. That's probably an opportunity for a whole other post.
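One simple way to avoid that repeated work is to memoize results by expression text. The sketch below is an assumption-laden illustration, not something the HAPI engine provides: SubexpressionCache is a hypothetical helper, and caching by text is only safe for sub-expressions whose value depends solely on fixed inputs (such as %ReportingPeriod) and never on the current focus ($this).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical memo table for context-independent sub-expressions. */
class SubexpressionCache<V> {
    private final Map<String, V> cache = new HashMap<>();
    private int evaluations = 0;

    /** Return the cached value for this expression text, evaluating it only on first use. */
    V evaluate(String expression, Function<String, V> evaluator) {
        return cache.computeIfAbsent(expression, e -> {
            evaluations++;
            return evaluator.apply(e);
        });
    }

    /** How many times the evaluator actually ran. */
    int evaluations() {
        return evaluations;
    }

    /** Forget everything, e.g. when the reporting period changes. */
    void clear() {
        cache.clear();
    }

    public static void main(String[] args) {
        SubexpressionCache<Integer> cache = new SubexpressionCache<>();
        for (int i = 0; i < 423; i++) {
            // Stand-in evaluator; a real one would call into the FHIRPath engine.
            cache.evaluate("%ReportingPeriod.start - 1 'year'", e -> 2020);
        }
        System.out.println(cache.evaluations()); // prints 1
    }
}
```

The cache has to be cleared whenever the inputs it depends on change, which is the main design cost of this approach.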

Keith

Tuesday, May 18, 2021

SANER Scales

This week's HL7 Connectathon 27 SANER Track is all about scalability. For this Connectathon, we brought over 2 million clinical resources worth of test data for 2600+ patients, transported from Synthea's 100K patient COVID-19 test data set from Massachusetts into Cook County, Illinois, transposed in time to January of 2021 instead of March of 2020, across 15 hospital locations. Because these patients were transported from the past, they aren't vaccinated, don't have access to Remdesivir, and probably too many are taking HCQ, but it was a good enough set to test the scalability of SANER.

My chief concern was computing MeasureReport on a daily basis, overnight, and whether that was going to be a huge data load for hospitals. As it turns out, I'm able to put those concerns to rest (for me at least).

We computed 465 MeasureReport resources, one for each of 15 hospitals over the 31 days of January, using realistic hospital loads drawn from current statistics reported by the Illinois Department of Health.

Each measure report communicated around 240 (average) supplemental data elements (FHIR Resources) providing additional data to support stratification and analytics, which is about 40 times what would actually be needed if just communicating metrics.

All told, this represented about 465MB of uncompressed, pretty-printed FHIR Resources in XML format, or about 23MB of data compressed using GZIP.

Best yet, I was able to collect data from the cloud, compute measures, store them locally and transmit all the data for all days for all hospitals to a server in the cloud in about 11 minutes on a pretty high-end Dell laptop (6 cores, 3.6GHz burst, 32GB of RAM).

I've still got some bugs to look into which might slow things down once fixed (mostly on stratification), but with 12 virtual processors running, this load barely touched my machine. Overall, CPU utilization was at a pretty steady 20%, and network bandwidth was also nowhere near saturated. My home office gets about 150-200Mb down and 20Mb up, and I barely touched it.

I can process the data for a single hospital day in 10-20 seconds depending on the number of patients. It's realistic to assume that more frequent, semi-real-time* situational awareness measure evaluation and reporting is not only feasible, but also practical.

Most of the measures we have examined are written in a form that supports daily computation. We'll probably have to experiment with measures designed for more frequent evaluation.

Keith

* We keep hemming and hawing about near-real-time measures, and I've finally decided to call them semi-real-time, to clarify that they could be several minutes out of date, but still orders of magnitude better than daily. With enough concentration, semi-real-time could in fact become near-real-time (so long as the data sources themselves are frequently updated).

After doing some more tweaking I'm actually:

Overwhelming my server so hard it requires a restart to come back to life. I really need to get my server set up in a production ready way.
Running a bit slower but getting more data (so now it's taking about 28 seconds per hospital on average).

Tuesday, May 11, 2021

Tracking Supplies via SANER

One of the outstanding challenges yet to be demonstrated at a Connectathon is the ability of SANER to report on supplies.

Some common examples include N95 Respirators (a.k.a. Surgical Masks), Ventilator supplies (e.g., tubing, connectors, et cetera), gloves, gowns and cleaning supplies.

How would you track these materials in ways that would enable an inventory system to track them?

Honestly, this is an area that I've spent very little time with, but is one focus area I plan to study for the upcoming May HL7 Connectathon.

What I already know:

Just about anything sold as a medical device in the US has an FDA Classification code.
Many medical devices and packages of them will have a GUDID identifier.
GUDID relies on UDI specifications.
GUDID classifies devices via both the FDA Product Classification code and using Global Medical Device Nomenclature (GMDN) codes.
Class 1 devices can be identified with a UPC code; some have a GUDID identifier.
Inventory management systems often use GTIN identifiers (of which UPC is one type).
The HL7 Structured Product Label (SPL) standard has been adopted by the FDA for both Medication and Medical Devices.
SPL references vocabulary which can also describe the device.

All of this knowledge is captured, registered, and publicly available. Some possibly for purchase, some freely accessible, some even with an API.

So, if you have UPC numbers or GMDN codes, or GUDID data, you might actually be able to create a value set that you can get to from the item codes used to describe supplies in the hospital inventory control systems.
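Before item codes from an inventory system can be matched against a value set, they usually need to be in one consistent form. As a small illustration, the sketch below pads a UPC (or other GTIN) to the 14-digit GTIN form and verifies the standard GS1 mod-10 check digit. The Gtin class and its method names are hypothetical; only the padding rule and the check-digit arithmetic are standard.

```java
/** Hypothetical helper for normalizing UPC/GTIN item codes. */
class Gtin {
    /** Left-pad a GTIN-8, UPC-A (12), EAN-13 or GTIN-14 digit string to 14 digits. */
    static String toGtin14(String digits) {
        if (!digits.matches("\\d{8}|\\d{12,14}")) {
            throw new IllegalArgumentException("not a GTIN: " + digits);
        }
        return "0".repeat(14 - digits.length()) + digits;
    }

    /** Verify the GS1 mod-10 check digit (the last digit of the code). */
    static boolean hasValidCheckDigit(String digits) {
        String g = toGtin14(digits);
        int sum = 0;
        for (int i = 0; i < 13; i++) {
            int d = g.charAt(i) - '0';
            // In the 14-digit form, weights alternate 3, 1, 3, ... from the left.
            sum += (i % 2 == 0) ? d * 3 : d;
        }
        int check = (10 - sum % 10) % 10;
        return check == g.charAt(13) - '0';
    }

    public static void main(String[] args) {
        String upc = "036000291452";                 // a commonly cited sample UPC-A
        System.out.println(toGtin14(upc));           // prints 00036000291452
        System.out.println(hasValidCheckDigit(upc)); // prints true
    }
}
```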

Monday, April 19, 2021

Make Vaccine Scheduling "Scheduler Friendly"

Every family I know has (at least) that one person who has to have the planning calendar, and that other person who keeps track of all the important documents, and that other person that they call on for healthcare related stuff, and finally, the computer geek. And they may all reside in the same person. One of these is very likely the COVID-19 vaccine scheduler.

As I think about how vaccines are opening up, and my own experience in Massachusetts scheduling vaccines for my family, here are some of my experiences:

I have to enter the same insurance data each time for each different person I'm making an appointment for. If only there was a standard for the layout and OCR font for insurance cards, or better yet, even a standard bar-code or QR format for insurance information, it could have made my life so much easier.
I could never schedule more than one person at the same time, even if there are two or three qualifying individuals that I need to schedule for at the same time (and appointments are open). This resulted in me making 2 or 3 different appointments for two groups of people who each had to travel over 30 minutes to a total of 5 different locations during two different enrollment periods. In one case, I fat fingered the first appointment date, which meant I had to reschedule one of the appointments, which led to a three week delay in getting a replacement appointment.

I've seen six different scheduling interfaces (four for drug stores, two for my state public health sites), and not one of them is really designed for the person in the family who does the scheduling for most of the family members. Changes that address these problems could also help volunteers who assist others with scheduling work more efficiently.

There are balancing factors. Making it easy for one person to schedule multiple appointments at the same time and location would benefit families, but single individuals living alone would be disadvantaged by such a system. But if there are enough vaccines (and appointments) to go around, this would be less of a problem.

We're likely going to be scheduling shots for some time yet. We've only gotten shots into the arms of about half of the US population, and these aren't likely to be the last COVID-19 shots that have to be given. Booster shots are expected by some vaccine manufacturers.

Monday, April 12, 2021

Recovering Deleted Data in an Excel Tab (When you have a PivotTable)

Life happens sometimes. Is this you?

You leave multiple programs running because you have a few meetings to get through for the day.
You hope to get back to that spreadsheet you started working on in the early morning.
You operate Excel pretty well using the keyboard, and touch type in it and other programs.
Eventually, you get to the next day, discover you never got back to the spreadsheet, and so close and save it promising you will finish it up later that day.
You reopen it later and find out the tab you were working on is missing.
Because you downloaded it and never turned on "version control", you don't have the old version.
Or a copy of what you worked on (and the changes are too recently made to have a backup).
But it DOES have a pivot table summarizing the data in your missing tab.
Somewhere during your calls you hit both Delete and Enter at the same time and managed to delete the Excel sheet b/c you were typing and talking at the same time, and paying attention to the Zoom call, and didn't see that you'd deleted the tab you just spent hours working through.

This post may save your data, but only if you are a bit of a computer geek and know a little something about XML. Using the techniques below, I managed to save mine.

What you may not know is that:

Later versions of Microsoft Office tools, including Excel, use the ZIP format to store the collection of XML files they use to store data in the new formats.
Whenever you create a Pivot Table in Excel from data in your sheet (or from other data sources), Excel caches a snapshot of the data for local processing, and only refreshes from the original source when you Refresh the Pivot Table. If you DON'T Refresh the Data in the Pivot Table, it's still in your spreadsheet.

Here is what you need to do:

Make a copy of the affected file in a new work folder.
If the copied file is in .XLS format, open it in Excel, and Save as a .XLSX. This will simply change the file format, the data you need will still be in it, but now in a format that can be more readily accessed.
Rename the file from *.XLSX to *.ZIP.

Next, look for xl/pivotCache/pivotCacheDefinition1.xml and xl/pivotCache/pivotCacheRecords1.xml in the ZIP file. If you have more than one Pivot Table, you might need to look at the files ending in a different number.

The pivotCacheDefinition file contains the column heading names in the cacheField element.

You can verify the data source in the worksheetSource element.

The pivotCacheRecords file contains the rows of the sheet in the <r> elements.

Empty cells are reported in <m/> elements.

The values are found in the v attribute of elements named <b> (boolean), <d> (date), <e> (error), <n> (numeric) and <s> (string).

Some elements where the values are repeated a lot use <x>, in which case the v attribute indicates the index of the sharedItems found in the cacheField element in the pivotCacheDefinition file. This gets a little bit complicated.
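If you prefer to pull those files out programmatically rather than with a ZIP tool, the following is a minimal sketch using the standard java.util.zip classes. The PivotCacheExtractor name is made up for the example; it simply collects every entry under xl/pivotCache/ from a workbook stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper that pulls the pivot cache files out of an .xlsx workbook. */
class PivotCacheExtractor {
    /** Return the bytes of every file under xl/pivotCache/, keyed by entry name. */
    static Map<String, byte[]> extract(InputStream workbook) throws IOException {
        Map<String, byte[]> found = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(workbook)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!e.isDirectory() && e.getName().startsWith("xl/pivotCache/")) {
                    // readAllBytes() stops at the end of the current entry.
                    found.put(e.getName(), zip.readAllBytes());
                }
            }
        }
        return found;
    }
}
```

Because the code reads the ZIP structure directly, the copy does not need to be renamed to .ZIP first; something like PivotCacheExtractor.extract(Files.newInputStream(Path.of("copy.xlsx"))) should work, where copy.xlsx is a placeholder file name.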

Assuming you've extracted those two files, the following XSLT will regenerate the sheet in CSV format when run over the extracted pivotCacheRecords1.xml file. It loads pivotCacheDefinition1.xml by relative name, so keep that file next to the stylesheet.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    xmlns:m="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
    version="2.0">
  <xsl:output method="text"/>
  <!-- Column names and shared values live in the definition file. -->
  <xsl:variable name="doc" select="doc('pivotCacheDefinition1.xml')"/>
  <xsl:template match='/'>
    <!-- Header row: one column per cacheField. -->
    <xsl:for-each select='$doc//*:cacheField/@name'>
      <xsl:if test='position()!=1'>,</xsl:if>
      <xsl:value-of select='.'/>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>
    <!-- One CSV line per cached record. -->
    <xsl:for-each select='.//m:r'>
      <xsl:for-each select='*'>
        <xsl:if test='position()!=1'>,</xsl:if>
        <xsl:variable name="pos" select="position()"/>
        <xsl:choose>
          <!-- Empty cell. -->
          <xsl:when test="self::m:m"></xsl:when>
          <!-- Shared item: @v is a zero-based index into this column's sharedItems. -->
          <xsl:when test="self::m:x">
            <xsl:variable name="v" select="@v + 1"/>
            <xsl:variable name="x" select="$doc//m:cacheField[$pos]/m:sharedItems/*[$v]/@v"/>
            <xsl:if test='$x'>"</xsl:if>
            <xsl:value-of select="replace($x,'&quot;','&quot;&quot;')"/>
            <xsl:if test='$x'>"</xsl:if>
          </xsl:when>
          <!-- Inline value: quote it and double any embedded quotes. -->
          <xsl:otherwise>
            <xsl:text>"</xsl:text>
            <xsl:value-of select="replace(@v,'&quot;','&quot;&quot;')"/>
            <xsl:text>"</xsl:text>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
