Tuesday, January 26, 2021

Rethinking Vaccination Scheduling

Thinking about getting shots into arms: the scheduling, planning, and logistics involved.  There are a lot of resources that you need to keep track of, and any one of them could be a rate-limiting factor. 

  1. Vaccination supplies (Not just doses, but also needles, cleaning and preparation materials, band-aids and alcohol wipes)
    I'm simply not going to address this issue here.  This is logistics and situational awareness, and this post is thinking a lot more about managing and scheduling vaccination appointments.
  2. Space to handle people waiting to check-in
    If you use cars and parking lots and telephone / text messages for check-in, you can probably handle most of the space needs for check-in, but will not be able to support populations that do not have access to text messaging.  You might be able to address that with a backup solution, depending on the size of that population.
  3. People to handle check-in
    The tricky bit here is that the people who handle check-in need to be able to see, or at least communicate with, the people administering vaccinations so that the queues can continue to move ahead, and likewise with the people who have checked in and are ready to get their vaccination.  There are a lot of ways this can work, and much of it depends on the facility's layout.  A football stadium provides much more space and opportunity for handling this problem than the typical ambulatory physician office.  With a small enough volume, check-in can be handled by the administering provider; with larger volumes but the right technology, it could still be pretty easy.  
  4. Handling Insurance Paperwork
    The biggest challenge with check-in will be the whole insurance paperwork bundle.  Ideally, this could all be addressed before making the appointment.  While patients will pay nothing for COVID-19 vaccination, Medicaid, Medicare, and private insurers may still need to pay the providers who administer it.  A smart intermediary could address this by supporting some sort of mass vaccination protocol for logging patients to payers, and vaccinations back to "appointments" (see more on scheduling below).
  5. People to administer vaccinations
    Right now, the two vaccines use intramuscular injection and multi-dose vials for administration.  Anyone who's had a family member taking insulin, or been through fertility drug treatment, understands that for the most part the injection itself isn't brain surgery, and also doesn't take that long (I've dealt with both in my family).  This is probably the least rate-limiting factor after vaccination supplies.
  6. Space to handle people who've had a vaccine administered, but still need to be monitored for an adverse reaction.  Each shot takes two to three minutes to administer, but requires 10+ minutes of observation thereafter, and a minimum of 20 square feet of space per person (with 6' social distancing measures) to address potential adverse reactions.  
  7. People to monitor adverse reactions
    I don't know what the right ratios are here, but one person could monitor multiple patients at a time.
  8. People to treat adverse reactions
    This is a different ratio and skill set than for #7 above.  The skill set to treat a problem is likely more advanced than the one needed to detect it, but one person can probably only treat one adverse reaction at a time, and you might need to plan for two or more occurring at once, just in case.
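The numbers in items 5-7 make for an easy back-of-the-envelope model.  Here's a minimal sketch in Python, assuming ~3 minutes per injection, ~15 minutes of monitoring, and 20 square feet per monitored person (all of the specific numbers are illustrative assumptions):

```python
# Back-of-the-envelope capacity model for a vaccination site.
# All constants are illustrative assumptions from the list above.

INJECTION_MINUTES = 3     # ~2-3 minutes to administer a shot
MONITORING_MINUTES = 15   # 10+ minutes of observation afterwards
SQFT_PER_PERSON = 20      # space per monitored person (6' distancing)

def site_requirements(vaccinators):
    """Estimate hourly throughput and monitoring needs for a site."""
    shots_per_hour = vaccinators * 60 // INJECTION_MINUTES
    # Little's law: people in monitoring = arrival rate * time in system
    monitoring_seats = shots_per_hour * MONITORING_MINUTES // 60
    return {
        "shots_per_hour": shots_per_hour,
        "monitoring_seats": monitoring_seats,
        "monitoring_sqft": monitoring_seats * SQFT_PER_PERSON,
    }

# With 4 vaccinators: 80 shots/hour, 20 monitoring seats, 400 sq ft.
print(site_requirements(4))
```

Even a modest site quickly needs more floor space for monitoring than for injecting, which is why items 6 and 7 matter as much as the injection itself.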

And then there's scheduling.
Scheduling for this within a single facility with existing systems in ambulatory or other practice environments would be rather difficult.  Most ambulatory appointment scheduling systems are designed to handle a wide variety of cases, not highly repetitive caseloads like a mass vaccination campaign.  The closest thing in the ambulatory space is laboratory testing, where specimen collection is more "assembly line" oriented.

In the ambulatory setting, tracking laboratory testing is less about the appointment and more about the order.  The order is placed, and when the patient shows up, the specimen collection is done.  If we treat mass vaccination more like that, then scheduling could be a bit easier.  The order basically grants you a place in line, but you still have to wait in line until your turn.  If you've ever tried to get a blood draw done during lunch hour, you may have been in this situation.  This seems like a better way to go for a mass vaccination campaign.

You no longer get an appointment for a 10-25 minute slot; instead, you maybe get a day, and possibly a 2-3 hour time period within it, that you are asked to show up in, though you can use that slot any time after it begins.  The time period assignment is used to maintain flow throughout the day, but it's more advisory than a typical ambulatory appointment slot.

Regional Scheduling
The value of this for scheduling is that each facility can estimate a volume of patients it can handle on any given day (or broad time period within a day) based on staffing and other resources.  These volumes can be fed into a system that supports scheduling on a daily basis, which can then be used to broadly manage scheduling, not just within a facility, but perhaps even across a state.  If the scheduling system also captured insurance information, that could get complex; more realistically, the scheduling system can feed data to vaccination sites, and the sites can follow up with the patient out-of-band regarding insurance using whatever system they already have for that.  That's a more flexible approach.  It might mean a 24-48 hour delay between scheduling and first appointment slot availability, though perhaps not: some providers could set up web sites for patients to register their insurance details, and others might already have the necessary details.  I could use the state system to schedule an appointment, and still go to my regular provider for the vaccination.  Providers participating in this might reserve some capacity for their regular patients.
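The core of that regional scheduling idea can be sketched in a few lines of Python.  The facility names, dates, and capacities below are hypothetical; the point is just that facilities publish per-block capacity and the scheduler hands out the first open block:

```python
# Minimal sketch of advisory regional scheduling: each facility publishes
# remaining capacity per day and 2-3 hour block; the scheduler hands out
# the first open block. All names and numbers are hypothetical.

class RegionalScheduler:
    def __init__(self):
        # capacity[(facility, day, block)] = remaining slots
        self.capacity = {}

    def publish(self, facility, day, block, slots):
        self.capacity[(facility, day, block)] = slots

    def assign(self, preferred_facility=None):
        """Return the first (facility, day, block) with room, or None."""
        for key in sorted(self.capacity):
            facility, day, block = key
            if self.capacity[key] > 0 and (
                    preferred_facility is None
                    or facility == preferred_facility):
                self.capacity[key] -= 1
                return key
        return None

sched = RegionalScheduler()
sched.publish("Clinic A", "2021-02-01", "08:00-11:00", 2)
sched.publish("Clinic A", "2021-02-01", "11:00-14:00", 1)
print(sched.assign("Clinic A"))  # earliest open block at Clinic A
```

The preferred-facility argument is one way a provider could reserve capacity for its regular patients while still participating in the regional pool.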

If other service industries can handle scheduling in a 2-3 hour block range to manage their workloads, maybe we can apply this technique to scheduling for a mass vaccination program.  We don't need a precise schedule, we need a sustained rate and flow.  In any case, it's worth thinking about.

This is all just a thought experiment; I'm not going to say it will work, or even that I've convinced myself that it has some value.  It just seems to address one of the key problems in getting shots into arms, which is getting the arms to where the shots can be placed.


Friday, January 22, 2021

Situational Awareness and Disease Surveillance

There's broad overlap between Disease Surveillance efforts and Situational Awareness reporting.  Looking back to early March, former National Coordinator Farzad Mostashari illustrated on Twitter the use of ILI reporting systems to support COVID-19 Situational Awareness.  Surveillance efforts abound: Biosurveillance, ELR, ECR, ILI, reportable/notifiable conditions, et cetera.

Surveillance efforts can be classified a couple of different ways: at the very least, a) what you are looking for, and b) how you respond to the event.  You are either looking for a known signal (e.g., ILI, or a reportable/notifiable condition), or simply a deviation from a normal signal (e.g., biosurveillance, and to some degree ILI).  You can (especially for known signals) trigger a predefined intervention or response, "investigate", or simply communicate the information for decision making at various levels.  COVID-19 dashboards showing hospital / ventilator capacity are often used to support various kinds of decisions (e.g., support and supply), as well as to communicate risk levels to the public.

If you think about such efforts as reporting on bed capacity or medication usage related to COVID-19, you need to be able to a) check lab results and orders, b) evaluate collections of patient conditions (e.g., to detect suspected COVID-19 based on combinations of symptoms) and c) examine medication utilization patterns.  All of this can also be used to support various kinds of surveillance efforts.
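As a toy illustration of point (b), a rule that flags suspected cases from combinations of symptoms can be as simple as a set intersection.  The symptom set and threshold here are made up for illustration and are not a clinical definition:

```python
# Toy rule for flagging a suspected case from a combination of symptoms.
# Symptom names and the threshold are illustrative, not clinical guidance.

SYMPTOMS_OF_INTEREST = {"fever", "cough", "shortness-of-breath", "anosmia"}

def suspected_case(patient_conditions, threshold=2):
    """Flag a patient when enough symptoms of interest co-occur."""
    matched = SYMPTOMS_OF_INTEREST & set(patient_conditions)
    return len(matched) >= threshold

print(suspected_case(["fever", "anosmia", "headache"]))  # True
print(suspected_case(["headache"]))                      # False
```

Real systems evaluate coded conditions against published value sets rather than strings, but the shape of the computation is the same.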

Surveillance goes somewhat deeper than situational awareness.  The most common case is turning a positive signal for identifying a case into a case report for follow-up investigation, as in Electronic Case Reporting.  This goes beyond basic Situational Awareness reporting.  Case reporting can get rather involved, going deep into the patient record.  Where SA efforts are more aligned is when the initial data needed (e.g., as for an Initial Case Report) is fairly well defined.  For that, we have been defining mechanisms whereby the supplemental data reported in the measure can also be used to support those sorts of efforts.

The overlap between these efforts points to one thing, which is a general need to address the inefficiencies of multiple reporting efforts to public health.  The various reporting silos exist because of a fractured healthcare system, fractured funding, and the various legal and regulatory challenges for consolidating public health data.  There's NO quick fix for this; it will likely take years to get to a consolidation of methods, standards and policies across public health reporting initiatives, but it's something that's worth looking into.

Thursday, January 21, 2021

Normalizing FHIR Queries using FluentQuery

One of the advantages of declarative specifications is that they tell a system what needs to be done without specifying how to accomplish it.  When I created FluentQuery for the SANER IG, it had a lot more to do with making FHIR queries clearer than with providing a declarative form for query specifications, but it effectively does create one (because FHIRPath is a declarative language).

My initial implementation of this form is rather stupid: it just builds up the appropriate FHIR query string to accomplish what has been asked for.  Going through Connectathon last week, we learned more about the differences in endpoints, e.g., a FHIR server vs. an EHR system implementing 21st Century Cures APIs, and I wound up having to "rewrite queries" to support the simpler syntaxes supported by the EHR.  This was completely expected, but what I hadn't realized beforehand was that I could actually automate this process in the FluentQuery implementation itself.  

What I'd accidentally done by creating FluentQuery was to enable interoperability across varying FHIR implementations: a FHIRPath interpreter implementing FluentQuery could allow a user-defined query to be implemented partially or fully by the target FHIR server, with the rest of the filtering handled by the receiver.  Such an implementation would allow a MeasureComputer actor to adapt to the servers being queried based on their CapabilityStatement resources or other factors known within an implementation.

Let's look at some examples, starting with including().  Here's a query using including:

findAll('Encounter',including('subject','diagnosis','reasonReference'), …).onServers(%Base)

My simplistic implementation would write this as something like the query below, and execute it using the resolve() function of FHIRPath.  My implementation of resolve() already does some additional work here, collecting all result pages when there is more than one page of results.

   Encounter?_include=Encounter:subject&_include=Encounter:diagnosis&_include=Encounter:reason-reference

If the server supports includes, but not the specific kinds of includes requested, the query can be written with _include=Encounter:*, and the included resources that aren't needed can be filtered out of the resulting Bundle after all pages have been collected.
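That post-filtering step might look like the following sketch, which treats the search-result Bundle as parsed JSON and keeps only the included resource types that were actually wanted (filtering by resource type alone is a simplification for illustration):

```python
# Post-filter a FHIR search-result Bundle (parsed as JSON) after querying
# with _include=Encounter:*: keep every match, and keep included resources
# only when they're a type we actually asked for.

WANTED_INCLUDE_TYPES = {"Patient", "Condition"}  # from subject, diagnosis

def filter_includes(bundle):
    kept = []
    for entry in bundle.get("entry", []):
        mode = entry.get("search", {}).get("mode")
        rtype = entry.get("resource", {}).get("resourceType")
        if mode != "include" or rtype in WANTED_INCLUDE_TYPES:
            kept.append(entry)
    bundle["entry"] = kept
    return bundle

bundle = {"resourceType": "Bundle", "entry": [
    {"resource": {"resourceType": "Encounter"}, "search": {"mode": "match"}},
    {"resource": {"resourceType": "Patient"}, "search": {"mode": "include"}},
    {"resource": {"resourceType": "Location"}, "search": {"mode": "include"}},
]}
print(len(filter_includes(bundle)["entry"]))  # 2: the Location is dropped
```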

If the server does not support includes at all, then the resulting Bundle can be post-processed to call resolve() on the elements that should have been included, as if the query had been written:

   .select(Encounter | Encounter.subject.resolve() | Encounter.diagnosis.condition.resolve() | Encounter.reasonReference.resolve())

I've got several queries that work with ValueSet resources such as this one below:

findAll('Observation', for('patient', $this.id), with('code').in(%Covid19Labs) )

If the server supports value set queries, the query would be written as something like:

   Observation?patient=[id]&code:in=[value set URL]

But, if it does not support value set queries, yet does support multiple values in a parameter, and the value set is small (e.g., fewer than 10 values), it can be rewritten as something like:

   Observation?patient=[id]&code=[code1],[code2],...

And if it does not support multiple values in code, then it can be rewritten as multiple queries, one per code, and the resulting Bundles can be merged:

   Observation?patient=[id]&code=[code1]
   Observation?patient=[id]&code=[code2]

And finally, if the server does not support the query parameter at all, the results can be post-filtered, as if the query had been written:

findAll('Observation', for('patient', $this.id)).select(Observation.where(code.memberOf(%Covid19Labs)))
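The split-and-merge strategy can be sketched like this (the URL shapes and codes are illustrative, and a real implementation would also page through each result set):

```python
# Sketch of rewriting a value-set query for servers with limited support:
# one query with comma-separated codes when the server supports multiple
# values, otherwise one query per code with the result Bundles merged.
# URL shapes and code values are illustrative.

def rewrite_code_query(base, codes, supports_multiple_values):
    if supports_multiple_values:
        return [base + "&code=" + ",".join(codes)]
    return [base + "&code=" + code for code in codes]

def merge_bundles(bundles):
    merged = {"resourceType": "Bundle", "type": "searchset", "entry": []}
    for b in bundles:
        merged["entry"].extend(b.get("entry", []))
    return merged

queries = rewrite_code_query("Observation?patient=123",
                             ["94500-6", "94309-2"],
                             supports_multiple_values=False)
print(queries)  # two queries, one per code
```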

FluentQuery turns out to be a lot more powerful than I had expected.  A dumb implementation is VERY easy to write, taking about 500 lines of Java.  The findAll and whereExists methods write the query expression by writing out the resource name, a ?, and the remaining arguments to the function separated by & characters.  Each individual function argument writes the appropriate query expression: with() operates by simply writing out the field name, and the various comparison expressions concatenate that with the appropriate comparison operations.  And onServers() simply prepends the server names to the query and calls resolve().
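That naive writer can be approximated in a few lines.  This sketch is in Python rather than the Java of the actual implementation, and the function names merely mirror the FluentQuery ones:

```python
# Toy query-string writer in the spirit of the naive FluentQuery
# implementation: each clause emits one search parameter, find_all
# stitches them together, and on_servers prepends the server base URLs.
# A Python stand-in for the Java original.

def with_param(field, op, value):
    # e.g. with('code').in(%Covid19Labs) -> "code:in=<value set url>"
    suffix = ":" + op if op else ""
    return field + suffix + "=" + value

def find_all(resource, *clauses):
    return resource + "?" + "&".join(clauses)

def on_servers(query, *servers):
    # a real implementation would hand each URL to resolve()
    return [server + "/" + query for server in servers]

q = find_all("Observation",
             with_param("patient", None, "123"),
             with_param("code", "in", "http://example.org/ValueSet/Covid19Labs"))
print(on_servers(q, "https://fhir.example.org"))
```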

To change that to support an adaptive implementation, I would instead write out each comparison expression as a collection of parameters (using the Parameters resource), and then have onServers() parse that parameter list against the CapabilityStatement resource of each server given in its arguments to construct and execute the appropriate FHIRPath expression.

In fact, I expect I might even compile some small FHIRPath expressions in onServers() that are associated with the Parameters resource used to memoize the query operation.  In HAPI, I can probably store the compiled expressions in the Parameters resource as user data.  If I wanted to get fancier, I could even rewrite the compiled expression.

The rest is, as a former colleague of mine liked to say, a simple matter of programming.

Tuesday, January 19, 2021

Automating FHIR IG Builds From Structured Data

This is likely to be a two or three part series.  Over the last year I've worked on four different FHIR Implementation Guides: SANER, the Mobile Health Application Data Exchange Assessment Framework, V2 to FHIR, and the IHE ACDC Profile.  Each of these suffers from the "Lottery Problem": if I win the lottery and decide to retire, I'm going to have to train others how to run the tool chain that builds the content that the IG Publisher uses to generate the IG.

I hate having the lottery problem, because it puts me in the position of being the one who has to both maintain the toolchain and run the builds when something important changes.  I can't get away with not maintaining the toolchain, but what I can do is make it a lot simpler, so that it's obvious to anyone how to run it, and I've been working through that process as I advance the tools I'm using.  These days, my most recent project (SANER) relies only on Saxon's XSLT transformer and PlantUML, which is better than a lot of custom Java code.

But that still leaves me in the Build Meister role, and I'd love to get out of that.  So I spent some time over the holidays and thereafter working towards getting GitHub Actions to take over the work.  I'll work backwards from most recent to least recent (easy to hard), starting with SANER.

SANER runs four processes to complete a build: two XSLT 2.0 based transforms of SANER.xml, a PlantUML transform of outputs produced by those to generate UML images from the images-source folder of the build, and finally the IG Publisher.  I DON'T need to automate the IG Publisher, because build.fhir.org handles that for me using the latest publisher and GitHub webhooks via the AutoBuilder.

But: I DO want to automate the transformations, and I'd also love to have an easy way to run the publisher locally from Eclipse or the command line.

The trick here is to use GitHub Actions.  There are at least three actions I need:

  1. Check out the files that have been pushed on a feature branch.
  2. Run a Maven build to perform the translations.
  3. Check in the updated files.
The Maven build will handle the translations using two plugins:
  1. The Maven XML Plugin
  2. The PlantUML Plugin
Finally, I'll also enable the FHIR IG Publisher to be run using Maven through the Exec Maven Plugin during the "site" phase of the build.
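A sketch of what such a workflow might look like is below.  The action versions, branch filters, and Maven phase are assumptions that would need adjusting to the actual repository:

```yaml
# Hypothetical workflow: check out the feature branch, run the Maven
# translations, and commit any regenerated files back to the branch.
name: transform-ig-sources
on:
  push:
    branches-ignore: [master]
jobs:
  transform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-java@v1
        with:
          java-version: 11
      - name: Run translations
        run: mvn -B generate-resources
      - name: Check in updated files
        run: |
          git config user.name "ig-build-bot"
          git add -A
          git diff --cached --quiet || git commit -m "Regenerate IG sources"
          git push
```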

Wednesday, January 13, 2021

Supplemental Data Part 2

One of the challenges with Supplemental Data is how to communicate it.  One can send it:

  1. in references only, expecting receivers to then retrieve those elements needed,
  2. in references only, having previously communicated the referenced data to the receiver individually,
  3. as contained resources within the MeasureReport,
  4. in a Bundle with the MeasureReport.
There are some challenges with each of these.
The reference-only model requires the sender to provide a back-channel to respond to queries for specific resources, with all of the concomitant security issues that raises.  Sending an HTTP request out from inside your IT infrastructure doesn't require cutting holes through firewalls; receiving requests does.

The references-with-previously-sent-data model runs into problems with resource identity: eventually, two different facilities will produce the same resource identifier.  And if the resources are just forwarded as-is and assigned a new identifier by the receiver, then the measure source has to reconcile those identities.

The collection of supplemental data can lead to large files, which, for some, violates "RESTful" principles.  The idea in FHIR is to use fine-grained resources, but containment of supplemental data could make the MeasureReport rather large.  Let's do some estimating on size here:

One of the largest hospitals in the US is in New York City, and has something like 2000 beds.  Reviewing hospital inpatient capacity metrics (via HHS Protect) shows utilization of inpatient beds ranging from 45% to 85% across the country.  So, if we are building a report on 85% of 2000 possible inpatients, that would be 1700 patients being reported on.  Using a reasonable definition for supplemental data (e.g., what we are using in this Connectathon), let's count the number of contained resources: for each patient, we'd have something like a dozen conditions on average*, one or two encounter resources, the current patient location, maybe a half dozen medications*, two or three lab results, an immunization resource, a DiagnosticReport, and a ServiceRequest.  Call it 30 resources on average for each patient.  Looking at the test resources created for this connectathon, I can see that they average about 3 KB in size; maybe that would be 4 KB in real life.  Compressing this data, I see about a 14:1 compression ratio.

So, 1700 patients * 30 resources * 4 KB / resource ≈ 199 MB
And compressed, this yields about 14 MB.

Over typical (residential) network connections, this might be 6 to 30 seconds of data transfer, plus processing at the other end (which can happen while the data is being transferred).  At commercial speeds (1 Gb/s or higher), this goes to sub-second.  Parse time for 200 MB of data (in XML) is about 7 seconds on my laptop; JSON is about 5% faster and about 70% of the size.

We've got about 6000 hospitals in the US, with an average of 150 beds per hospital, so the average report is going to be around 13 times smaller (and faster), about 15 MB.  I'm more interested in sizes for the top decile (90%).  I can estimate those by looking at a state like New York (which happens to have hospitals at both extremes), where the 90% level is reached at hospitals with ~750 beds.  So 90% of hospitals would be reporting in around 1/3 of the time (or less) faced by the largest facility, sending files of about 4.5 MB or less compressed, or 67 MB uncompressed.  I am unworried by these numbers.  Yes, it's a little big for the typical RESTful exchange, but not at all awful for data being reported on a daily or every 4, 8 or 12 hour basis.
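The arithmetic above is easy to check; this snippet just reproduces the estimates, using the bed counts, occupancy, resource counts, and compression ratio from the text:

```python
# Reproduce the back-of-the-envelope size estimates from the text.
KB, MB = 1024, 1024 * 1024

patients = int(2000 * 0.85)        # 85% occupancy of a 2000-bed hospital
resources_per_patient = 30
resource_size = 4 * KB
compression_ratio = 14

raw = patients * resources_per_patient * resource_size
print(round(raw / MB))                        # ~199 MB uncompressed
print(round(raw / compression_ratio / MB))    # ~14 MB compressed
# An average (~150-bed) hospital scales down about 13x:
print(round(raw / (2000 / 150) / MB))         # ~15 MB uncompressed
```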

All of the data needs to be parsed and stored, and that could be a lot of transactional overhead, leading to long delays between request and response.  But my 200 MB file parsing test shows that to be < 7s on a development laptop running WinDoze.  My own experience is that these speeds can be much better on a Linux based system with less UX and operating system overhead.  

Posting these in a transaction bundle leads to some serious challenges.  You really don't want to have a several second long transaction that has that much influence on system indices.  This is where ACID vs. BASE is important.  We really don't need these to be handled in a transaction.

Honestly, I like keeping the supplemental data in the MeasureReport as contained resources.  The value in this is that a) these resources are snapshots in time of something being evaluated, and b) the identity of those resources has little to no meaning outside the scope of the MeasureReport.  In fact, their identity is virtually meaningless outside of that context.  BUT, if we were to instead put them in a bundle with the MeasureReport (a composition-like approach), it could make sense.  Except that then, one would have to make sense of the resource identities across multiple submitters, and that's just a big mess that you really need not invite into your storage infrastructure.
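For illustration, here's roughly what containment looks like when a MeasureReport carries a supplemental Observation (a simplified sketch: required MeasureReport elements like measure and period are omitted, and the ids and codes are made up).  The contained resource's id is only meaningful inside the report, referenced as "#o1":

```python
# Sketch of supplemental data carried as contained resources. The
# Observation's id ("o1") has no meaning outside this MeasureReport,
# matching the snapshot-in-time argument above.
import json

measure_report = {
    "resourceType": "MeasureReport",
    "status": "complete",
    "type": "summary",
    "contained": [
        {"resourceType": "Observation", "id": "o1",
         "status": "final",
         "code": {"text": "SARS-CoV-2 RNA result (illustrative)"}},
    ],
    # local reference into the contained list
    "evaluatedResource": [{"reference": "#o1"}],
}
print(json.dumps(measure_report)[:60])
```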

In any case, we haven't made a decision yet about how to deal with this, although for this Connectathon, we will be taking the bundle approach.

* These numbers are based on some research I contributed to about 16 years ago based on NLP analysis of a large corpus of data collected over the course of a mid-sized hospital's year treating patients.

Sunday, January 10, 2021

Supplemental Data and Measurement

I promised myself I would write more (here and elsewhere) this year as part of my New Year's resolutions, and sadly, it's taken me at least a week to write my first post.  But what a week, so I think I can be excused.

In any case, one of the things we are going to be testing with SANER this Connectathon is support for Supplemental Data.  Supplemental data in a measure allows the receiver of the measure to obtain additional information that can help them make sense of what is going on in an emergency response scenario.

As you start off in an emergency, especially one related to some sort of infectious disease, there's a lot of information that you simply don't know at the start of the situation.  What is the impact of other diseases (comorbidities) on patient outcomes?  What about patient age, gender, race or ethnicity?  What are other complications of the disease?  One need only look at AIDS or COVID-19 to see all the things we don't know.  Shortly thereafter, one might ask: how can we start to use this information about comorbidity to help assess the degree of strain imposed on an institution, and perhaps even create new measures?

To get there, we need more than just measures, we also need to bring supplemental data to the table, to the place where it can be analyzed in depth.  While disease and rates of infection are the primary sources of strain on the health system, other factors such as risks (geography, age, gender, et cetera) and comorbidity can influence outcomes (complications, length of treatment and death).

Arguably, this level of detail is not needed for every situation, but it can be extremely valuable in cases where the disease outcomes are serious, or the number of infections is great.  In the early stages, it can help emergency responders to develop models to assess the impacts, and in later stages apply those models.

There are several challenges right now in using supplementalData with Measures that I'll be discussing in subsequent posts.