Tuesday, May 11, 2021

Tracking Supplies via SANER

One of the outstanding challenges yet to be demonstrated at a Connectathon is the ability of SANER to report on supplies.

Some common examples include N95 respirators, surgical masks, ventilator supplies (e.g., tubing, connectors, et cetera), gloves, gowns and cleaning supplies.

How would you identify these materials in ways that would enable an inventory system to track them?

Honestly, this is an area that I've spent very little time with, but is one focus area I plan to study for the upcoming May HL7 Connectathon.

What I already know:

  • UPC codes identify retail products, including many commodity medical supplies.
  • GMDN (the Global Medical Device Nomenclature) provides category codes for medical devices.
  • GUDID (the FDA's Global Unique Device Identification Database) contains device identification data submitted by manufacturers.

All of this knowledge is captured, registered, and publicly available.  Some possibly for purchase, some freely accessible, some even with an API.

So, if you have UPC numbers or GMDN codes, or GUDID data, you might actually be able to create a value set that you can get to from the item codes used to describe supplies in the hospital inventory control systems.
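As one concrete possibility: the FDA's GUDID data is exposed through the openFDA device UDI endpoint, so you could look up candidate device identifiers by GMDN term.  Here's a minimal sketch; the search field name (gmdn_terms.name) is my assumption and should be verified against the openFDA field reference before relying on it.

```python
from urllib.parse import urlencode

# openFDA exposes GUDID device identification data via its UDI endpoint.
# NOTE: the search field below (gmdn_terms.name) is an assumption to be
# checked against the current openFDA field reference.
OPENFDA_UDI = "https://api.fda.gov/device/udi.json"

def gudid_query_url(gmdn_term: str, limit: int = 10) -> str:
    """Build an openFDA UDI query URL for devices matching a GMDN term."""
    params = {"search": f'gmdn_terms.name:"{gmdn_term}"', "limit": limit}
    return f"{OPENFDA_UDI}?{urlencode(params)}"

# e.g., look for device records describing surgical masks:
url = gudid_query_url("Surgical mask")
print(url)
```

The device identifiers in the response could then be matched against the item codes in a hospital inventory control system to seed a value set.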

Monday, April 19, 2021

Make Vaccine Scheduling "Scheduler Friendly"

Every family I know has (at least) that one person who has to have the planning calendar, and that other person who keeps track of all the important documents, and that other person that they call on for healthcare related stuff, and finally, the computer geek.  And they may all reside in the same person.  One of these is very likely the COVID-19 vaccine scheduler.

As I think about how vaccines are opening up, and my own experience in Massachusetts scheduling vaccines for my family, here are some of my experiences:

  1. I have to enter the same insurance data each time, for each different person I'm making an appointment for.  If only there were a standard for the layout and OCR font of insurance cards, or better yet, a standard bar-code or QR format for insurance information, it would have made my life so much easier.

  2. I could never schedule more than one person at a time, even if there were two or three qualifying individuals I needed to schedule at the same time (with appointments open).  This resulted in me making 2 or 3 different appointments for two groups of people, each of whom had to travel over 30 minutes, to a total of 5 different locations during two different enrollment periods. In one case, I fat-fingered the first appointment date, which meant I had to reschedule one of the appointments, which led to a three-week delay in getting a replacement appointment.
  3. I've seen six different scheduling interfaces (four for drug stores, two for my state's public health sites), and not one of them is really designed for the person in the family who does the scheduling for most of the family members.  Such changes could also readily enable volunteers who assist others with scheduling to work more efficiently.

There are balancing factors. Making it easy for one person to schedule multiple appointments at the same time and location would benefit families, but single individuals living alone would be disadvantaged by such a system.  But if there are enough vaccines (and appointments) to go around, this would be less of a problem.

We're likely going to be scheduling shots for some time yet.  We've only gotten shots into the arms of about half of the US population, and these aren't likely to be the last COVID-19 shots that have to be given.  Some vaccine manufacturers expect booster shots to be needed.

Monday, April 12, 2021

Recovering Deleted Data in an Excel Tab (When you have a PivotTable)

 Life happens sometimes. Is this you?

  1. You leave multiple programs running because you have a few meetings to get through for the day.
  2. You hope to get back to that spreadsheet you started working on in the early morning.  
  3. You operate Excel pretty well using the keyboard, and touch type in it and other programs.
  4. Eventually, you get to the next day, discover you never got back to the spreadsheet, and so close and save it promising you will finish it up later that day.
  5. You reopen it later and find out the tab you were working on was missing.
  6. Because you downloaded it and never turned on "version control", you don't have the old version.
  7. Or a copy of what you worked on (and the changes were made too recently to have a backup).
  8. But it DOES have a pivot table summarizing the data in your missing tab.
  9. Somewhere during your calls you hit both Delete and Enter at the same time and managed to delete the Excel sheet, because you were typing and talking at the same time, paying attention to the Zoom call, and didn't see that you'd deleted the tab you just spent hours working through.

This post may save your data, but only if you are a bit of a computer geek and know a little something about XML.  The techniques below managed to save mine.

What you may not know is that:
  • Later versions of Microsoft Office tools, including Excel, use the ZIP format to store the collection of XML files that hold your data in the newer file formats.
  • Whenever you create a Pivot Table in Excel from data in your sheet (or from other data sources), Excel caches a snapshot of the data for local processing, and only refreshes from the original source when you Refresh the Pivot Table.  If you DON'T Refresh the Data in the Pivot Table, it's still in your spreadsheet.
Here is what you need to do:

  1. Make a copy of the affected file in a new work folder.
  2. If the copied file is in .XLS format, open it in Excel, and save it as .XLSX.  This simply changes the file format; the data you need will still be in it, but now in a format that can be more readily accessed.
  3. Rename the file from *.XLSX to *.ZIP.
Next, look for xl/pivotCache/pivotCacheDefinition1.xml and xl/pivotCache/pivotCacheRecords1.xml in the ZIP file.  If you have more than one Pivot Table, you might need to look at the files ending in a different number.
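If you'd rather script this than rename files and dig by hand, the unpacking step can be sketched in a few lines of Python (the .xlsx container is just a ZIP, so the standard zipfile module works):

```python
import zipfile

def extract_pivot_cache(xlsx_path: str, out_dir: str = ".") -> list[str]:
    """Extract all pivot cache XML parts from an .xlsx (ZIP) file."""
    extracted = []
    with zipfile.ZipFile(xlsx_path) as z:
        for name in z.namelist():
            # Grab every pivotCacheDefinition*.xml / pivotCacheRecords*.xml
            if name.startswith("xl/pivotCache/"):
                z.extract(name, out_dir)
                extracted.append(name)
    return extracted
```

This pulls every pivot cache part, so it also covers the case where you have more than one Pivot Table and need to inspect files ending in a different number.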

  • The pivotCacheDefinition file contains the column heading names in the cacheField elements.
  • You can verify the data source in the worksheetSource element.
  • The pivotCacheRecords file contains the rows of the sheet in the <r> elements.
  • Empty cells are reported as <m/> elements.
  • The values are found in the v attribute of elements named <b> (boolean), <d> (date), <e> (error), <n> (numeric) and <s> (string).
  • Some columns where the values repeat a lot use <x> elements instead, in which case the v attribute indicates the index of the sharedItems found in the cacheField element in the pivotCacheDefinition file.  This gets a little bit complicated.
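To make the sharedItems indirection concrete, here's a hypothetical fragment (invented for illustration, not taken from a real workbook).  A record cell of <x v="0"/> points at the first entry of the matching cacheField's sharedItems:

```xml
<!-- pivotCacheDefinition1.xml: the 2nd cacheField holds shared strings -->
<cacheField name="Unit">
  <sharedItems>
    <s v="ICU"/>      <!-- index 0 -->
    <s v="Med/Surg"/> <!-- index 1 -->
  </sharedItems>
</cacheField>

<!-- pivotCacheRecords1.xml: <x v="0"/> in the 2nd position means "ICU" -->
<r>
  <n v="42"/>
  <x v="0"/>
</r>
```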

Assuming you've extracted those two files, the following XSLT will regenerate the sheet in CSV format when run over the extracted pivotCacheRecords1.xml file.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:m="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
    version="2.0">
    <xsl:output method="text"/>
    <xsl:variable name="doc" select="doc('pivotCacheDefinition1.xml')"/>
    <xsl:template match='/'>
        <!-- Write the header row from the cacheField names -->
        <xsl:for-each select='$doc//m:cacheField/@name'>
            <xsl:if test='position()!=1'>,</xsl:if>
            <xsl:value-of select='.'/>
        </xsl:for-each>
        <xsl:text>&#xA;</xsl:text>
        <!-- Write one CSV row for each <r> record -->
        <xsl:for-each select='//m:r'>
            <xsl:for-each select='*'>
                <xsl:if test='position()!=1'>,</xsl:if>
                <xsl:variable name="pos" select="position()"/>
                <xsl:choose>
                    <!-- <m/> is an empty cell: emit nothing -->
                    <xsl:when test="self::m:m"/>
                    <!-- <x> indexes into sharedItems in the cache definition -->
                    <xsl:when test="self::m:x">
                        <xsl:variable name="v" select="@v + 1"/>
                        <xsl:variable name="x"
                            select="$doc//m:cacheField[$pos]/m:sharedItems/*[$v]/@v"/>
                        <xsl:if test='$x'>"</xsl:if>
                        <xsl:value-of select="replace($x,'&quot;','&quot;&quot;')"/>
                        <xsl:if test='$x'>"</xsl:if>
                    </xsl:when>
                    <!-- <b>, <d>, <e>, <n> and <s> carry the value in @v -->
                    <xsl:otherwise>
                        <xsl:value-of select="replace(@v,'&quot;','&quot;&quot;')"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each>
            <xsl:text>&#xA;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Friday, April 2, 2021

From Risk to Opportunity

If you've been in healthcare or IT for a while, you've probably heard about Risk Assessments.  And if you have been through a few of these, you might recall the process of:

  1. Identifying assets to protect
  2. Enumerating threats to those assets in a list of risks
  3. Assessing the likelihood of the risk happening
  4. Assessing the level of impact of the risk (i.e., how bad it would be if the threat occurred)
  5. Using the category values from #3 and #4 to assess the risk level (to understand what needs to be mitigated) using something like the matrices below:

Several resources are available from HL7 and IHE to do these tasks, including the HL7 Security Cookbook and the IHE Risk Management in Healthcare IT Whitepaper (from which the two images above were drawn).

But I'm not really here to talk about Risk Assessment.  I'm going to the other end of the spectrum to talk about how you can use this same framework to prioritize efforts for opportunities, and it works pretty much the same way.
  1. Identifying the assets to capitalize on (assets being used very loosely here; it could include processes and skills that you are good at, as well as the usual notion of assets).
  2. Identifying the value of those assets in a list of opportunities.
  3. Assessing the likelihood of the opportunity succeeding.
  4. Assessing the level of impact of the opportunity (e.g., ROI).
  5. Assessing the importance of the opportunity.
The same way a risk assessment helps to identify the risks to mitigate, an opportunity assessment can help you identify opportunities to explore further, and skip the ones which are of lower importance.  You could also replace "likelihood" with "cost", where cost is essentially a proxy for likelihood, but high cost is equivalent to low likelihood, and so you'd have to flip one axis of the grid.
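The grid arithmetic itself is trivial.  Here's a sketch of a 3x3 version; the labels and cut-offs are my own illustration, not drawn from the HL7 or IHE materials:

```python
# Combine likelihood x impact categories (1 = lowest, 3 = highest) into a
# priority band, mimicking a 3x3 risk/opportunity matrix.
def priority(likelihood: int, impact: int) -> str:
    """Map two 1-3 category scores to a priority band."""
    score = likelihood * impact  # product ranges 1..9
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"

# A high-impact, high-likelihood opportunity rises to the top:
assert priority(3, 3) == "high"
# A low-likelihood, low-impact one can be skipped:
assert priority(1, 2) == "low"
```

To use cost as a proxy for likelihood, as described above, you would invert that axis (e.g., pass `4 - cost_category` as the likelihood score).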

The number of levels of opportunity (or risk, for that matter) that one puts in the grid is really up to the organization performing the assessment; I'd recommend using at least 3 and no more than 5.

The benefit of using this framework for assessing opportunities is very similar to the benefit of following it for risk assessments.  It puts a structure around the work that you are doing, and adds some degree of objectivity to the assessment process.  It still requires judgement (and that may be subjective), but the results will give you more insight and confidence in the outcomes.

For what it's worth, this isn't my idea.  I THINK I first heard about this from Gila Pyke about seven or eight years ago, probably over Sushi somewhere at an HL7 or IHE meeting.

Friday, March 26, 2021

Happy Birthday to The SANER Project

Three days ago was the one-year anniversary of the public inception of The SANER Project. It's hard for me to believe how far we've come.  The project is still going strong, but we are now in the "uninteresting" stages of development.  We've identified what we need to do with all of the HL7 Ballot comments, and now are starting the task of making the edits in order to publish as a Standard for Trial Use.  I expect we have another month of work to get that done.  The next step gets more interesting, as we move past proof-of-concept testing and into deployment in pilots.

Connectathon participation is a routine part of our development efforts, and we've been planning for the next HL7 Connectathon, having recently completed participation in the IHE North American Connectathon.  In May, we anticipate being able to provide a month of sample test data to enable testing of actors at real-world volumes for an entire county's worth of hospitals.

Major EHR vendors Epic and Cerner participated in the last testing event, as did others, including the CDC.  SANER has gotten attention from public health agencies and health information exchanges in several states, including Texas, where I am supporting efforts under an ONC STAR HIE grant, and Arizona, where another Keith is driving efforts in IHE.

The Arizona interest in the FHIR SANER IG showed up in my inbox the other day.  If someone were to tell you that a guy named Keith working for an HIE proposed an IHE White Paper to the Quality, Research and Public Health workgroup using the SANER IG, you wouldn't be surprised.  But as soon as you found out that it wasn't me, you might be (it was another Keith).  So, it's worthy of note.  I had nothing to do with the proposal, but you can certainly bet I'll be involved in the effort.

Next week (tentatively), my colleague Lauren Kneiser and I will be presenting on SANER to HITAC. That's a pretty significant milestone from my perspective. We're also presenting a recorded session available on demand for the Preparedness Summit in April, something like the fifth or sixth presentation we've given at a national event.

The work of The SANER Project continues, and what I've learned from this project is influencing additional standards development in HL7.  The biggest thing I've learned from SANER is that NOT everything done with FHIR needs a profile.  Some things simply need a framework for people to describe (in FHIR) the data that they want, so that systems can collect and find it.


Friday, February 19, 2021

Load Testing is the Most Expensive Effort You Will Regret Not Doing

Photos of the I-5 Skagit River Bridge

Time and time again I’ve seen major failures in information systems under real load.  It happened again today in my home state when the state’s vaccine scheduling system failed shortly after being opened up for scheduling.  I tried using it at least five different times today to schedule myself and my wife.  I finally gave up, realising that the system was simply under way too much strain.  Thirty-second page loads, timeouts, cryptic error messages never meant to be exposed to an end user: these are all classic symptoms of a system under way more load than it can handle.

The solution is very simple: you have to test systems under the load they are designed to handle.

The problem is, that’s one of the most expensive tests software developers perform.  It can take a month and more just to prepare.  A big part of that is simply getting enough synthetic data to test with.  The data has to be good enough to be used with your validation checks.  The system has to be designed so that you can run such a test at scale without committing other systems to act on the test data.  It’s hard, it’s expensive, and often by the time you are ready to load test, the project is already late, and product needs to ship, or at least that’s what management says.

And so, against better judgement, the product ships without being tested.  And it fails.  Badly.

Somewhere, a resource gets locked for longer than it should, and that causes contention, and the system slows to an unusable crawl.  It’s an unnecessary table lock in a database.  A mutex on a singleton in the runtime library.  A semaphore wrapped inappropriately around a long-running cleanup operation.  Diagnosis takes days, and then weeks.  Sometimes a critical component needs to be rearchitected, or worse, replaced.  At other times, the fix is simple, once you finally find the error.  And sometimes there’s a host of optimizations that are needed.  What would have delayed the delivery of the product by weeks now delays it by months... or even years.  Some never recover.

Often, the skills necessary to engineer the system for a sustained load are simply absent, to the point that the expected loads and response times were never actually provided as design inputs. Nobody computed them, because the system is new, and nobody knew how to without any real-world experience.

Load testing is the most engineering-intensive effort of “software engineering”.  It’s the kind of effort that distinguishes a true software engineer or system architect from a “computer programmer”.

Discovering that your system won’t operate under the expected load is not the worst thing that can happen to you.  Doing so after you’ve “gone live” is.  Now you have three expensive efforts to address simultaneously:

  1. The political fallout of the disaster to manage.
  2. The recovery effort on the broken data streams created by the failed system.
  3. The load testing you should have done in the first place, including the accompanying remediation of issues found from that effort.
Don’t ever skip load testing, at least if you want to continue to call yourself a software engineer.

Friday, February 12, 2021

Enhancing Search capabilities in HAPI using SUSHI and SearchParameter

I've been using HAPI on FHIR for several years across multiple projects, sometimes with my own back end for storage, and at other times using the vanilla HAPI JPA Server with a database back end.  One of the features of the JPA Server is that you can enhance search capabilities by telling the server how to implement an enhanced search capability by creating a SearchParameter resource on the server.

The FHIR SANER Implementation Guide defines several SearchParameter resources.  The simplest of these enables searching by Encounter.hospitalization.dispositionCode, and is a good starting example for those who just need to search on a single field of a resource where the core FHIR specification doesn't define a search capability.  A more complex example can be found in Search by Code, which enables searching several resource types on a single field.  While technically correct, this example is more complex than the HAPI JPA Server can handle, and I'll talk later in this post about how it might be simplified to enable the JPA Server to handle it.

If you are defining a SearchParameter resource for an HL7 or IHE Implementation Guide, the first thing you need to do is specify a bunch of metadata associated with the resource.  If, like me, you have to do this for a number of different resources, SUSHI has a syntax that enables you to effectively create a macro, a set of instructions that can be included in any resource definition.

The set of instructions I use is:

RuleSet: SanerDefinitionContent
 * status = #draft      // draft until final published
 * experimental = true  // true until ready for pilot, then false
 * version = "0.1.0"    // Follow IG Versioning rules
 * publisher = "HL7 International"
 * contact[0].name = "HL7 Public Health Workgroup"
 * contact[0].telecom.system = #url
 * contact[0].telecom.value = "http://hl7.org/Special/committees/pher/index.cfm"
 * contact[1].name = "Keith W. Boone"
 * contact[1].telecom.system = #email
 * contact[1].telecom.value = "mailto:my-e-mail-address"
 * jurisdiction.coding = http://unstats.un.org/unsd/methods/m49/m49.htm#001

We'll change status to #active, experimental will be set to false, and the version will be updated when we publish as a DSTU.  We follow the HL7 conventions for the first contact (the web page for the workgroup responsible for publishing the IG), and I add my contact information as the editor.  The jurisdiction.coding value is set to the value commonly used for Universal guides.

Other metadata you have to create describes the specific search parameter.  I put that directly in the SearchParameter instance:

Instance: SearchParameter-disposition
InstanceOf: SearchParameter
Title: "Search by hospitalization.dispositionCode in Encounters"
 * insert SanerDefinitionContent
 * url = "http://hl7.org/fhir/uv/saner/SearchParameter/SearchParameter-disposition"
 * description = "This SearchParameter enables query of encounters by disposition to support automation of measure computation."
 * name = "disposition"
 * code = #disposition

The instance name should start with "SearchParameter-", and should be followed by the value of the name you are going to use for search.  This is what you would expect to appear in the query parameter.  For this example, the search would look like "GET [base]/Encounter?disposition=...", so we use disposition as both the name and the code for this parameter.  I'd recommend keeping the name and code the same.

Finally, you need to specify some of the technical details about the Search parameter.  For this example, it applies only to the Encounter resource, and operates against a code.  Here are the additional settings you would need to specify:

 * base[0] = #Encounter
 * type = #token
 * expression = "hospitalization.dispositionCode"
 * xpath = "f:hospitalization/f:dispositionCode"
 * xpathUsage = #normal
 * multipleOr = true
 * multipleAnd = false

  1. The base parameter allows you to identify the resources to which the search parameter applies.
  2. The type parameter indicates which kind of search parameter type to support.
  3. The expression parameter describes how to find the data used for the search in the FHIR Resource using FHIRPath.
  4. The xpath parameter describes how to find the data used for the search using an XPath expression over the FHIR XML.  For the most part, this is simply the same as expression, using different syntax.
  5. Generally, you want to leave xpathUsage set to #normal as above.
  6. If you want the parameter to be repeatable using a logical or syntax, set multipleOr to true.  For this case:
    GET [base]/Encounter?disposition=01,02 would mean: Get all dispositions where the code value is 01 or 02.
  7. If you want the parameter to be repeatable using And, set multipleAnd to true, otherwise set it to false.  If the field in the resource is not repeatable, you can very likely leave this set to false.
For more complex search parameters, you can add multiple resource types for the base parameter.  We did that in the SANER IG to support search by codes in Measure and MeasureReport.

Here's an example of how we did that, but there are some caveats which I'll get into below:

 * code = #code
 * base[0] = #Measure
 * base[1] = #MeasureReport
 * type = #token
 * expression = """
 descendants().valueCodeableConcept | descendants().valueCoding | descendants().valueCode | code | descendants().ofType(Coding).not().code

 * xpath = """
 descendant::f:valueCodeableConcept | descendant::f:valueCoding | descendant::f:valueCode | f:code | f:descendant::f:code[ends-with(local-name(..),'oding')]
* xpathUsage = #normal
* multipleOr = true
* multipleAnd = true

Technically, the above expression and xpath are accurate.  However, this search parameter appears to be too complex for HAPI JPA Server to handle.  I expect that is because it operates over a simplified syntax of either the XPath or expression content, so I will be digging into that.  I think the ofType() or [ends-with...] expressions might be causing problems.  There are other ways that I can make these more explicit which would likely work better.