Wednesday, April 29, 2020

Local First .. A SANER Approach

As I think about various models for communicating data to public health agencies, I keep thinking about a federated model, where hospitals push to their local public health network, and the local public health authorities then push data upwards to state and federal agencies.  There's a good reason for this, based on my own experience.  I live fairly close to Boston, and lived even closer in 2013, the year of the Boston Marathon Bombing.

Boston emergency management officials knew immediately, when the bombs first struck, what the state of the EDs in the area was, and were able to mostly route patients appropriately and coordinate efforts.  While one article on the response notes that the number of available operating rooms and ICUs was not known, it also mentions the practice and drills which very likely made it possible for hospitals to quickly clear and prepare operating rooms to treat incoming patients.

I think also about what's happening in the City of Chicago right now, with Rush Medical coordinating efforts to capture data for the City's public health department, and then local public health passing that same data on to federal agencies on the hospitals' behalf, and it just makes sense.  It certainly makes a lot more sense than what I've heard elsewhere, where hospital staff are having to collect data, log into different portals and send data to local or state public health, and then also to two different federal agencies, all the while a slightly different data feed containing similar data is silently being sent to the state department of health from a past program intended to meet the very same need.

I can't and won't argue the point that FEMA and CDC both need the data that is being requested.  But I will say that there should be a local public health network that supports this sort of communication without placing additional burdens on hospital staff.  Let the locals push to the state, and the state to the federal government as needed, and when needed (e.g., in cases of a declared emergency).  Don't make 6000+ hospitals do the same thing twice or thrice (even if with different data sets), when 50-odd state agencies could do it more efficiently and in bulk with better quality control.  Oh, and maybe fund that (or use existing funds that have already been allocated for that very kind of thing).

And when the emergency is over, the state or local public health agencies should still keep getting what they need to address local disaster response, much like what Boston had during the Marathon bombing.  It's too late after the disaster happens to "turn it on", and in fact, the switch might not even be accessible if you wait that long.

Compare the Boston stories to Dirk Stanley's story about being at the epicenter of 9/11, and you'll see that we've come a long way in handling local disasters, but we can still do better.  Even with Boston's amazing response, some of my reading about it notes the lack of information about operating rooms and ICUs.

For me, The SANER Project might have been inspired by COVID-19, and one nurse informaticist's complaint to me about the craziness she was experiencing in trying to get data where it needed to go, but I've spent the last decade and then some looking at the challenges public health has been facing since AHIC first offered ANSI/HITSP what some of us still call "The Bird Flu Use Case", which was preceded by the "Hurricane Katrina" use case, and before that the "Anthrax Use Case".  All of these were about public health and emergency response.  The standards we wanted weren't ready then, but they are now.  And so am I.  Let's get it right this time.


Monday, April 27, 2020

A SANER $convert operation for Converting a FHIR Resource to CSV format

One of the points of SANER is to make it easier for organizations to integrate with a FHIR Server to support measure reporting.  To address this, I introduced the concept of Coping Mechanisms, but I need to take it a step further.  We also need to address some of the other missing interfaces to query for Measure and MeasureReport resources in a few different ways.

A MeasureReport is defined by a Measure, and carries with it a very small set of important data for the receiver in a public health context.  Take, for example, the Measure based on the CDC/NHSN COVID-19 Patient Impact and Hospital Capacity Module.  In CSV form, this needs only a few bits of information to be communicated:

  1. The facility identifier of the reporting facility.
  2. A unique identifier for the report.
  3. The date associated with the reported data.
  4. - 17. The set of reported values
How would one define an API to convert the MeasureReport to CSV form that would:
  1. Fit with the FHIR OperationDefinition concept, and 
  2. Yet be fairly simple for someone who does NOT understand FHIR to use?
The $convert Operation seems to be a good starting point for this.

Obviously, _format (thinking in FHIR) or the Accept header (thinking in REST) should be text/csv for this operation to be acted upon as a CSV converter.

CSV output is well defined, but the conversion process isn't so well defined.  
One could arguably just dump the JSON paths as the column header names, and the leaf node values (e.g., @value or @url) as the cell contents.  In fact, there are several JSON to CSV converters that do just that.  While that would work, it misses requirement #2 by a long shot, and doesn't allow the user to control the column names.

This gets back to allowing the user control over naming things in the CSV header, and mapping each header to values in the report.  There are a number of values in the report that were mentioned above.  I could easily add a composite parameter like header=name$value to the operation, where name is a string giving the header name, and value is an expression giving the "path to the value" in the MeasureReport.
  1. For CDC/NHSN, this is the NHSN identifier.  We have an idea what this looks like, but no real examples.  It's fairly obviously either reporter.identifier or subject.identifier (and more likely the latter for reasons discussed later).  If I were to use FHIR Path, I could say something like: subject.identifier.where(system = 'NHSN Facility Identifier System URL').value and that would give me the right thing.
  2. This is either MeasureReport.id or perhaps better, MeasureReport.identifier.value (but for which identifier).  Since these are NHSN generated values (probably OIDs but we don't really know yet), the FHIR Path is probably something like MeasureReport.identifier.where(system='NHSN Report Identifier System URL').value.
  3. Easy: MeasureReport.date, or better yet, just date.
  4. Each of these is a group.measureValue, or a group.population.count (and note, I've started to drop MeasureReport from the FHIR Path, which is still valid, since MeasureReport is the context for the conversion).  But I have to identify which of these ... so group.where(code.where(coding.where(system = 'ick' and code='blech'))).measureValue or group.population.where(code.where(coding.where(system = 'ick' and code='blech'))).count.  Ick and blech are simply syntactic meta-variables, which I named that way because as I'm writing these expressions, my stomach is churning for the poor non-FHIR-aware end-user.
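Putting those four together, a raw request using the header=name$value composite parameter might look something like the sketch below (the system URLs and the numTotBeds code are placeholders carried over from the discussion above, the line breaks are just for readability, and everything after the ? would still need URL encoding):

MeasureReport/32/$convert?_format=text/csv
    &header=facilityId$subject.identifier.where(system = 'NHSN Facility Identifier System URL').value
    &header=reportId$identifier.where(system = 'NHSN Report Identifier System URL').value
    &header=collectionDate$date
    &header=numTotBeds$group.where(code.where(coding.where(system = 'ick' and code='numTotBeds'))).measureValue

It works, but it should already be obvious why somebody who hasn't drunk the FHIR Kool-Aid isn't going to want to type that.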
So, I like where this is heading, but FHIRPath is still too FHIRy for some.  How can we simplify this?

If MeasureReport is the context, can we relax FHIRPath a little bit so that it would be more friendly?

To start with, I really like Sushi's # notation for codes, so I could say something like group.where(code=ick#blech).measureValue.  And since I honestly don't care about system (since MeasureReport references Measure, and code systems are pretty well defined), I could further simplify to group.where(code=#blech).measureValue, which is getting better.

Could this be further simplified?  The measureValue field only appears in group; could there be another syntax to identify it?  In XPath, I might say //f:measureValue[../f:group/code ...], or *[code=...]/f:measureValue.  FHIRPath doesn't have a .. like traversal.  However, I might define * to mean any field, and thus use *.where(code=#blech) (where * is simply an alternate name for descendants()).

So, now we have *.where(code=#blech).measureValue, but why stop there?  How about making *S mean *.where(S) instead?  Now we have *code=#blech.measureValue.  This is getting better, but still not as good as it can get.  Code is an obvious index in FHIR, as in fact is system, and a few other "slicers".  In fact, the slicing pattern is a fairly common one.

What if * were a "slice" operator, where slices were attempted starting from near and heading to farther away, so that *#blech implicitly meant a thing whose slice is identified uniquely by an identifier or code whose value or code was blech.  And *name# could mean that thing whose slice is identified with an identifier or code system with a system URL of name.

There might be many such possible slices.  Each possible slice would be ordered in size from smallest to largest (from inner to outer slices).  And we could provide a selection operator that worked on picking the right one using the . operator.  So *#blech.measureValue simply means that measureValue whose slicer is a code or identifier with the code or identifier of blech.

Now, I might say something like:
header=*#positive.measureValue$positive,...

Can I go JUST one tiny step further and say that if you are slicing, take the simplest content you can for the slice, so that if the slicer is a code in a field X (e.g., found in X.code, X.category, et cetera), then the sliced value is very likely the next simple value you could report?

This might even be prioritized: 
  1. (.*V|v)alue if there is one, 
  2. (.*C|c)ode if there is one
  3. (.*N|n)ame if there is one
  4. the first primitive type if none of the above match.
And furthermore, since *S always has # in it, I might further simplify to say that if S has #, then * is not needed.

Now I might say something like:
MeasureReport/32/$convert?_format=text/csv&headers=facilityId$NHSNFacilityIdentifierSystem#,reportId$NHSNReportIdentifierSystem#,collectionDate$date,numTotBeds$numTotBeds...

And finally, if X$X, then I need say only X, so:
MeasureReport/32/$convert?_format=text/csv&headers=facilityId$NHSNFacilityIdentifierSystem#,reportId$NHSNReportIdentifierSystem#,collectionDate$date,numTotBeds,numBeds,numBedsOcc...

But # means something special and needs escaping, so just use | like FHIR does in query parameters.

Substituting | for #, the same request becomes:
MeasureReport/32/$convert?_format=text/csv&headers=facilityId$NHSNFacilityIdentifierSystem|,reportId$NHSNReportIdentifierSystem|,collectionDate$date,numTotBeds,numBeds,numBedsOcc...

AND NOW that's an API that has a well-defined meaning, maps back to FHIRPath with some additional rules, and makes some sense to the common user who hasn't seen FHIR yet.
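Just to make that concrete, the CSV coming back from that last request might look something like the following (the values are invented for illustration, the real column list would follow the full CDC/NHSN module, and the facility and report identifiers would be whatever NHSN actually assigns):

facilityId,reportId,collectionDate,numTotBeds,numBeds,numBedsOcc
some-nhsn-facility-id,some-nhsn-report-id,2020-04-27,300,275,250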

Testing this against a few other resources:  If the slicer is a LOINC code, this would get Observation.value.  If the observation is a panel (e.g., a urine test), I could get a CSV of the component values quite cleanly.

If the slicer is a string (for item.linkId in a QuestionnaireResponse), it would get item.text, which is NOT quite what we want.  Let's add answer[X] to the prioritized list, but what happens when there can be more than one answer?  Hmm, have to think about that one.  Perhaps the CSV answer is a comma-separated list of values in the cell?  That's not clear.  It's probably good enough for now, though.

I'll have to work up some grammar for this.
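As a very rough starting point (a sketch of the ideas above, not a worked-out specification), it might begin something like:

  headers  := header ("," header)*
  header   := name ("$" path)?         (if path is omitted, it defaults to name)
  path     := step ("." step)*
  step     := element | slicer
  slicer   := "*"? (system? "|" code? | element "=" value)
  element  := a FHIRPath element name
  name, system, code, value := strings, escaped per the usual FHIR query parameter rules

The escaping details and the implicit value-selection priorities described above still need to be pinned down before any of this is real.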


Sunday, April 12, 2020

Configuring Eclipse to run SUSHI over your Fish Tank to generate FHIR Resources and Profiles

As I write more and more FHIR Shorthand, I just want it to work better in MY development environment.  I principally use Eclipse-based tools (Eclipse and Oxygen XML Editor).

  1. I want a quick launcher.
  2. I want easy navigation to errors.
  3. I want better error messages.
  4. I want syntax highlighting.
Like I sometimes tell my kids: Now you know how it feels to want.

I did manage to create a launch configuration, and you can see the content below.  So, now I've got #1, and you can have it as well.



Save this in a text file with the .launch extension


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<launchConfiguration type="org.eclipse.ui.externaltools.ProgramLaunchConfigurationType">
    <stringAttribute key="org.eclipse.debug.ui.ATTR_CONSOLE_ENCODING" value="UTF-8"/>
    <stringAttribute key="org.eclipse.ui.externaltools.ATTR_LOCATION"
       value="C:\Program Files\nodejs\node.exe"/>
    <stringAttribute key="org.eclipse.ui.externaltools.ATTR_TOOL_ARGUMENTS"
      value="&quot;C:\Users\{YOURUSERNAME}\AppData\Roaming\npm\node_modules\fsh-sushi\dist\app.js&quot; fsh -o ."/>
    <stringAttribute key="org.eclipse.ui.externaltools.ATTR_WORKING_DIRECTORY" value="${workspace_loc:/fhir-saner}"/>
</launchConfiguration>


Edit this file to fix the locations where Node and SUSHI are installed for you, and what fish tank you want it to run on, and then import it using "File | Import | Launch Configuration" into your Eclipse environment.


Now, if you want to run SUSHI, you can easily do it.  I'm sure there's more that could be done to parameterize things like where node.exe is, and who the current user is, and how to run it on the default fish tank folder of the current project.  When I have time to mess with it, I may get around to it.  
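If you want to take a crack at that, Eclipse's built-in string substitution variables should get most of the way there.  Something like the following (untested, and assuming SUSHI was installed with npm install -g, which puts it under the roaming profile on Windows) removes the hard-coded user name and runs against whichever project is currently selected:

<stringAttribute key="org.eclipse.ui.externaltools.ATTR_TOOL_ARGUMENTS"
  value="&quot;${env_var:APPDATA}\npm\node_modules\fsh-sushi\dist\app.js&quot; fsh -o ."/>
<stringAttribute key="org.eclipse.ui.externaltools.ATTR_WORKING_DIRECTORY" value="${project_loc}"/>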


I'm sharing what I've learned though, in the hopes that somebody else who knows how to make 'clipse stuff work might take up the task.




Saturday, April 11, 2020

Measure Participations for The SANER Project

There are so many people and organizations involved in the creation of a computable measure that it's hard to keep it all straight.  That's because there's the "Measure Resource" and the measure it represents, and these are separate things; getting from the latter to the former is a process (executed by the Publisher, so let's start there).

Publisher

For The SANER IG, the publisher is going to be HL7.  So we'll follow the same protocols for the Measure resource as we would for the IG content at this time.  Other attributes determined by the publisher include: contact, copyright, status and experimental, because the attributes are about the Measure Resource, not the originally authored measure.  The status of the measure will certainly impact the status of the Measure Resource, but these describe the Measure Resource artifact, not the original measure content.  If you want to get into metadata about the author's original measure, I suggest you look into DocumentReference, which is intended to do just that.  We aren't going to use that right now due to complexity.

Author

Author is who wrote the content describing the measure, not necessarily who put it into Measure Resource format.  For the COVID-19 Patient Impact and Hospital Capacity Module, that would be CDC, who wrote the form and instructions.  Having established that, the effectivePeriod for the measure becomes whatever the author says it is, which could be before the publication date as a Measure Resource.  That's legit.  The effective time captures the start and end of the period over which the measure is approved to be in use by the author, at least to start.

Editor

The editor is whoever prepared the Measure resource for publication.  That would be The Saner Project.

Reviewer

For measures going through something like a ballot process, the reviewers are the balloters, or the organization (HL7) running that process.  There are other review processes, and I expect we'll eventually evolve those in The Saner Project.

Endorser

Endorsers are organizations that "approve" the measure as fit for a particular purpose -- certifying or regulatory, or similar bodies.  There can obviously be multiple endorsers.  A measure can be published without endorsement, and endorsements of a measure are asynchronous from publication.
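To pull those roles together in one place, here is a small FHIR Shorthand sketch of how they might land on a Measure resource (the instance name, contact names, endorser, and date below are purely illustrative, not what the IG will actually publish):

Instance:   ExampleSanerMeasure
InstanceOf: Measure
Usage:      #example
* status = #draft
* experimental = true
* publisher = "HL7 International"            // publishes the Measure Resource artifact
* author[0].name = "Centers for Disease Control and Prevention"  // wrote the original measure content
* editor[0].name = "The SANER Project"       // prepared the Measure resource for publication
* reviewer[0].name = "HL7 International"     // e.g., the ballot process
* endorser[0].name = "Some Certifying Body"  // fit-for-purpose approval; there can be several
* effectivePeriod.start = 2020-04-01         // whatever the author says it is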


Friday, April 10, 2020

SDOH in COVID19 Measures and The SANER Project

While talking to some healthcare providers in my local region, and a few others, I've heard statements about the apparent impacts of poverty on COVID-19 risk, mostly based on anecdotal evidence.  I honestly don't doubt it exists, and although I don't have the data available to prove it ... others do.

That led to the creation of a measure request based on social determinants of health in The SANER Project.

Not long after we added measure requests for staffing and supplies, CDC added two new COVID-19 modules into their reporting for similar items.  We had already agreed we were not going to spend much time deciding on "experimental measures" for our Connectathon release of the guide.  But I did reference the recently released CDC guidelines because they list the categories CDC thinks are important, and frankly, I don't need to second-guess them.

Having thus concluded that for the purposes of COVID measurement, we'd try to use the CDC as an authority where possible, it occurred to me to look into how CDC was evaluating social determinants of health.  The National Center for Health Statistics publishes an annual report, titled Health, United States, and in it, you will find rather detailed descriptions of how they classify certain categories that impact Social Determinants of Health.

Age, Gender, Race and Ethnicity would likely be covered in an existing measure request, and the single term "Gender" is likely to get into a discussion around gender, sexual preference, and sex at birth.  Given that these are a given in EHR systems certified by ONC, I think we can take it for granted that the data should be available, though perhaps not always readily accessible.  Age gets interesting because current reporting (available to the public) is in 20-year chunks, though I think I've seen some data in 10-year chunks, and one report which pulled out 0-2 for special attention, while other reporting looks at 0-18, 19-44, 45-64, and 65+.  We're rapidly reaching a point where 65 is no longer the age break for Social Security or Medicare benefits, and the justification for 18 as the age of adulthood is perhaps questionable.  I'd stick with what people are using for COVID reporting right now, though (e.g., 10- or 20-year brackets up to 80), because it's relatively simple.

The regional classification associated with the patient (urban, rural, et cetera) is likely a readily available datum for stratification if you can get to the demographics for the patients counted by the measure.  But, as you can see if you clicked the link, there are at least 3 different classification systems that might be used.  Geodata can get to a census tract from an address, hospital counties are readily accessible, and for -ish sorts of things, that might be good enough (though some note that the Grand Canyon is classified as metropolitan, which, if you've ever been, it surely isn't).

That leaves a few other factors to address:

Disability

This one doesn't seem that hard.  In the Health, United States report, it's a simple three-tiered classification: no difficulty; some difficulty; a lot of difficulty or cannot do (where the last two are clustered into one bucket).  The determination is based on the report of one or more categories related to ability to function (see the report for details on how they classify).  If we want to make SDOH data useful, it should be aligned with where existing research has already gone.  I'd stick with the 3-tiered classification.

Education Level

It gets as granular as years of schooling, but the key categories are: no high school diploma; diploma or GED; some college; bachelor's or higher.  Some include an AA degree as a stratum, but it's not much different from "some college" according to this chart.

Insurance

Do you have it?  Yes or No. If Yes, is it private, or is it Medicaid?  These are the important strata used in the report.

Food Security

I'm not really clear here on where to go.  There's not really anything I spotted in the Health, US report, and well, it's late as I write this.  Honestly, I think income and housing are probably about as indicative as food security would be.  But, I also learned, there's a Z-code for food insecurity in ICD-10.

Income Level

This one gets tricky, because it's different based on the size of the family unit, and it changes annually.  The Health, US report covers it as number of 100%-ile units above/below the HHS poverty guidelines generally, rather than as the census covers it with Poverty Thresholds.  If you want to understand the difference, Google it.  What's interesting here is that other research on poverty and health in the CDC uses two breakpoints: 130%, and 350%.  Part of the reason for that is that many Federal guidelines use 130% as a qualification point for certain types of federal assistance, and 350% splits the remaining population into generally equal sized chunks.  I'd go with the latter, because fewer is less work, and the 130% mark would seem to address some confounding challenges around food security.  But then there's a Z-code, and it breaks at 100% and 200%, and that aligns with Gravity work in HL7, and the PRAPARE tool in use by Kaiser Permanente and others.  I think we go with that because it's already accessible to some.

Housing Security

Again, not much to go on, but I'd guess there might be three big strata: Homeless, rent, or own. For homelessness, there's a Z-code in ICD-10 (see previous link under Food Security).

Employment Status

I'm not sure how good an indicator this is given the current rapid rise in unemployment.  I'm sure it's a factor though.  There's a Z-code for this in ICD-10.

My Proposed SDOH Strata for COVID

A lot of the data above is not readily obtainable without additional effort on the hospital side, which is likely something to avoid.  What's likely already known: Insurance, Patient Address (a proxy for homeless vs. rent/own), functional/disability status (though not completely), and possibly employment status.  The ICD-10 Z-codes are also somewhat in alignment with the PRAPARE tool that others have examined, including the Gravity project in HL7.  Z-codes have the benefit of having been around long enough to already be in the EHR system.

So, what I'd go with is the minimum set: Insurance class (none, private, Medicaid), plus the six Z-codes covering employment, education, homelessness, food insecurity, and income level, as individual strata for COVID+/all patients.

This isn't a perfect stratification, and I'm sure we could debate the merits of other formulations.  It's going to be marked experimental (like anything other than the CDC/NHSN or FEMA measures for Connectathon), and I think it's good enough to see what others can do with it.

   Keith

P.S.  It's amazing how analysis paralysis disappears when you need it yesterday and have to work with what's available now (not next month, or after a little more work and research), and that becomes a key attribute in your decision-making criteria.




Thursday, April 9, 2020

How will SANER Cope with Existing HealthIT Infrastructure?

One of the principles of The Saner Project is that it is both a BIG change to the way we do things and a SMALL one.  Which is to say that the software components (which we are now starting to parcel out) will be able to work with existing and available interfaces, and can support better automation.  FHIR is the interlingua in which we might do the heavy compute and automation, but the transmission is much simpler.

The MeasureReport resources that we've created thus far have a simple internal relationship to the data elements as they flow over the wire.  We define the measure precisely so that we can simply extract and transmit the essentials, and effectively reconstruct the report at the other end.

Those small pieces of software which connect those existing interfaces to a FHIR endpoint are "Coping Mechanisms", but that's a mouthful, so I'm calling them Copes.  What's a Cope?  It's something that allows two things to work or fit together.  In carpentry, it's a way to join things together at the bendy bits so that they make for a tight seam that looks appealing.  There are a number of ways to cope a joint; some are more difficult than others, and the way you do it might depend on a number of other factors.  When both ends are somewhat mobile and fungible, a mitre box and a cut at 45° will remove the least waste and provide a nice fit.  When one end is stuck where it is, the other end has to do all the work to make it fit.  We will probably need both.

I have thoughts about a dozen or so Copes:
  1. CSV-to-MeasureReport
    • I'm already using one of these to convert data from The COVID Tracking Project to a Saner MeasureReport (using the FEMA Measure).  It could have been put together in about an uninterrupted day's work, not that I have such at the moment, so it was a night's work.
  2. MeasureReport-to-CSV
    • This is even simpler in some ways.  The critical data in a measure is well identified; just yank it out and put it into an orderly set of rows following a header.
  3. CSV-to-XLSX
  4. XLSX-to-CSV
    • This is really just a hack to deal with the FEMA Spreadsheet.  FWIW: One should never send native Word or Excel documents around if one doesn't want one's name and institution to be known.  It's buried in the metadata in the spreadsheet.  Realistically, sending FEMA data should be done via CSV and not XLSX.  But I know who to talk to about it because they didn't clean the metadata before it was published.
  5. CustomJSON-to-MeasureReport
  6. MeasureReport-to-CustomJSON
  7. CustomXML-to-MeasureReport
  8. MeasureReport-to-CustomXML
  9. ER7-to-MeasureReport
  10. MeasureReport-to-ER7
    • The six above are simply recapitulations from structured to differently structured, where the critical bit is mapping from MeasureReport to data fields in a custom thing, or vice versa.  In the last two, the ER7 acronym refers to the HL7 pipes-and-hats format, and those Copes are really about extracting specific values out of a V2 message to populate a MeasureReport.
  11. Aggregator
    • Collect a bunch of MeasureReport values, add them up, and spit them back out as an aggregate report.  How does one collect?  By time?  Geographic region?  Hierarchical structure of some sort (e.g., city/town, county, state, region, nation)?
  12. Push-Me-Pull-You
    • If A wants to push, and B wants to pull, they cannot talk to each other.  The Push-Me-Pull-You Cope sits between the two, and acts as sort of a store-and-forward channel.  This, BTW, is simply a classic FHIR Server (there's a small sketch of what that looks like after this list), although we will see customizations on the Pull side to support different kinds of search.
  13. Pull-Me-Push-You
    • Similarly, if A is expecting pull, and B is expecting to be pushed to, we need to put the Pull-Me-Push-You in the middle to periodically collect and transmit data from A to B.
  14. V2-to-FHIR
    • I just happen to have one of these lying around, but so do others.  HL7 O&O has been working on this project for the past year and more.  Yes, there are some useful feeds that contain observations that are exactly about Situational Awareness of groups of things.  Mine used to be configured using a ConceptMap.  No human should ever have to write so many angle brackets ... it's crazy-making.  Thanks to Sushi, we don't, and can remain saner.
  15. FHIRtoYAML
  16. YAMLtoFHIR
    • It's about time we did these two.  I don't have any excuse other than it seemed like a good idea at the time (then again, it is 1:13am as I write this).
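To make the Push-Me-Pull-You a little more concrete, the interactions are nothing more than standard FHIR REST, roughly like the sketch below (the host name is made up, and the pull-side search shown is just one of several a receiver might use):

POST https://cope.example.com/fhir/MeasureReport
    (the push side: A sends each new report to the Cope as it is produced)

GET https://cope.example.com/fhir/MeasureReport?_lastUpdated=ge2020-04-08T00:00:00Z
    (the pull side: B periodically asks the Cope for anything new since it last checked)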
Sixteen copes sounds like a lot to write.  I'm hoping that entrants into The Resiliency Challenge can help with these.

Oh, and the one coping mechanism that I don't think we'll spend much time on with Saner?  Fingers on keyboards; that's not really an effective coping mechanism after all.  No, I'm hoping the keyboarding is all about writing code to get data from where it is to where it needs to be, without unnecessary human intervention (which doesn't mean it lacks human oversight).

The point about Saner isn't to replace existing infrastructure immediately.  We need to cope with what we have, and that's what makes Saner, well .. Saner.

Friday, April 3, 2020

What is 6000 * 5 * 4 * 7 * 52 / 60 / 24?

Too damn much, if you ask me.

6000 hospitals
x    5 minutes / form reported
x    4 forms reported / day
x    7 days / week
x   52 weeks / year
--------------------------------------------------
= 43,680,000 hospital minutes / year
    ÷ 60 minutes / hour
    ÷ 24 hours / day
--------------------------------------------------
= 30,333.333 hospital days / year

How fast could a computer report that if we but had the right infrastructure?  Surely 0.333 minutes would be enough.

I do NOT want to hear someone's brilliant idea about how to use the web to solve the COVID surveillance problem one more time if all they can do is implement <FORM>.  Sure, it's FORM on top of a Bootstrap/React UI with a Node.js and NoSQL backend, but the front-end input is still a human at a keyboard, the same as it was when Tim Berners-Lee invented HTML in the early '90s.

I do NOT want to hear about one more front-line medical professional typing stuff into a form.

FIND ANOTHER WAY.

Use a camera, a sensor + Arduino, an ultrasonic tag, or spare printer parts.  Do whatever it takes to enable people in a crisis to operate at the top of their license, not the bottom.

If "all you need is a phone", understand that in that phone is a supercomputer that is 100,000 times more powerful than necessary to land on the moon.  Use it the damn thing.

   Keith