Tuesday, May 24, 2022

How Does CREDS work?

I'm getting back to some standards writing, this time on the CREDS project.  In case you haven't heard of this one, the (b)acronym decodes to Clinical Registry Extraction and Data Submission.  The challenge that CREDS is looking to solve for Clinical Registries is a very common one.  Data needs to be collected from clinical records, packaged up, and then be sent somewhere.

There are three key questions that any implementer has to answer in this kind of effort.

  1. What data?
  2. Where is it sent?
  3. How is it formatted?
In this post, I'm going to focus on the first question.  The next two will vary by registry, but we can generally express the answer to 2 as "To the registry that is asking for it", and in this case, CREDS is a FHIR IG, so the answer to 3 is FHIR, but it's also a little bit more complicated than that.

What Data?

The what question comes in several parts:

  1. Who has it?
  2. In what (standardized we hope) form or forms can it be found?
  3. How do we find it?
Answers to the first question for clinical registry reporters include:
  • My EHR System
  • My Old EHR System
  • My Laboratory Information System
  • My Radiology or Imaging information system (RIS or PACS)
  • An ambulance (not ambulatory) information system.  Yes, you might have to dig through transport records in a standardized format.
  • The patient's primary care or specialty provider (in their EHR, RIS, or PACS)
  • In a third party's information system
  • Through an HIE
  • Through some form of national data exchange network
  • The patient's payer (typically an insurer, or a state or nationally funded program).
The second question can varyingly be answered as:
  • FHIR, and if so, problem solved (nearly, more on this later)
  • CDA
  • HL7 Version 2
  • DICOM
  • NCPDP
  • X12
  • or various others, and finally
  • the worst case: digital paper ... e-mail, PDF, printed text reports -- which I'm going to take a hard pass on for this blog post.
The third question (how do we find it in the data) is not quite so simple, and yet has one answer, at least for CREDS, and is the key to how CREDS works.  You see, CREDS relies not just on FHIR, but also on FHIRPath.  FHIRPath is a declarative programming language that allows one to express a pointer to one or more pieces of data in a data model, or perform a computation based on that data, or answer a yes or no question about it. It originated as a way to point into FHIR Resources to express things like the location of search parameters, or constraints on a refined model (a profile of a FHIR Resource) that must be met for a resource to comply with the profile.  For those that have been around for a while, you can see that FHIRPath looks very much like XPath, and for some constrained versions of FHIRPath, there are some very simple transformations from FHIRPath to XPath expressions.
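To make the "pointer into a data model" idea concrete, here is a toy sketch in Python.  It is NOT a FHIRPath engine: it handles only plain dotted navigation, and the navigate function and the sample patient are made up for illustration.  What it does show is the one behavior that matters most here: every step yields a collection, and repeating elements are flattened along the way.

```python
def navigate(resource, path):
    """Evaluate a dotted path such as 'Patient.name.given' against a
    dict-shaped resource. Toy subset only: no functions, no filters."""
    steps = path.split(".")
    # The first step may name the resource type; if so, consume it.
    if steps and steps[0] == resource.get("resourceType"):
        steps = steps[1:]
    nodes = [resource]
    for step in steps:
        next_nodes = []
        for node in nodes:
            if not isinstance(node, dict) or step not in node:
                continue  # a missing element contributes nothing
            value = node[step]
            if isinstance(value, list):
                next_nodes.extend(value)  # flatten repeating elements
            else:
                next_nodes.append(value)
        nodes = next_nodes
    return nodes


# Hypothetical sample data, not taken from any real record.
patient = {
    "resourceType": "Patient",
    "name": [
        {"family": "Example", "given": ["Pat", "Q"]},
        {"family": "Sample", "given": ["Patricia"]},
    ],
    "birthDate": "1970-01-01",
}

# A pointer to several pieces of data at once.
given_names = navigate(patient, "Patient.name.given")

# A yes or no question: is there a birth date at all?
has_birth_date = len(navigate(patient, "Patient.birthDate")) > 0
```

A real FHIRPath implementation adds functions such as where() and exists(), type handling, and much more, but the collection-in, collection-out shape is the same.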

I've been working with XPath (and XSLT, which relies very heavily on it as a transformation language) since before it was a W3C Standard, while it was being invented, with the inventors working in cubicles or offices right next to my own.  You can do just about anything with XPath that you can in any other programming language, though I'm not quite sure you'd necessarily want to. It takes a twisted brain to think in declarative form.  It takes a bit less twisting in FHIRPath because the creators of that language had similar experiences and already had some ideas about what worked and what didn't.

But declarative has its own value.  The point of a declarative program is not in specifying how to get the answer in stepwise form, but rather to define what you want the outcome to be, and let the system decide how to get there.  The value add of the system is in defining how to perform these operations efficiently.  Programmers use systems like this all of the time.  Consider SQL (also primarily a declarative language, though it has some procedural elements).
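A small sketch of that difference, using Python and its built-in sqlite3 module.  The table, column names, and lab values are invented for the example.  The loop spells out how to find each patient's highest glucose result; the SQL statement only says what is wanted and leaves the how to the database engine.

```python
import sqlite3

# Hypothetical lab results: (patient, test, value).
rows = [
    ("p1", "glucose", 5.4),
    ("p1", "glucose", 7.9),
    ("p2", "glucose", 6.1),
    ("p1", "sodium", 140.0),
]

# Procedural: spell out each step of the computation.
highest = {}
for patient, test, value in rows:
    if test == "glucose":
        if patient not in highest or value > highest[patient]:
            highest[patient] = value

# Declarative: describe the desired result and let the engine plan it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (patient TEXT, test TEXT, value REAL)")
conn.executemany("INSERT INTO result VALUES (?, ?, ?)", rows)
declared = dict(
    conn.execute(
        "SELECT patient, MAX(value) FROM result "
        "WHERE test = 'glucose' GROUP BY patient"
    )
)
conn.close()
```

Both approaches produce the same mapping of patient to highest value; only the second leaves room for the system to choose indexes, ordering, and other optimizations on its own.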

CREDS says that a registry defines a logical model describing its data, and that logical model is made available to users of the CREDS IG via an information system.  This may be nothing more than an Excel spreadsheet with element names and data types, or it might be an entity relationship model expressed in a variety of standard forms.  Since CREDS is using FHIR, CREDS requires that these logical models be expressed in a FHIR format.  For that, we have the FHIR StructureDefinition.  Yes, the same creature that defines the logical model for FHIR Resources and extensions and FHIR Datatypes is also used to define logical models.  If you want to find a logical model for a registry, you have more than two dozen search parameters on StructureDefinition you can use to find it.
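As a rough illustration of that search, the sketch below builds a StructureDefinition query URL in Python.  The server base URL and the model name are placeholders, and the logical_model_search_url helper is hypothetical; kind, name, and status are standard StructureDefinition search parameters.

```python
from urllib.parse import urlencode


def logical_model_search_url(base_url, **params):
    """Build a FHIR search URL for logical-model StructureDefinitions.

    base_url is the (assumed) FHIR endpoint of whoever publishes the
    registry's model; params are additional search parameters.
    """
    query = {"kind": "logical"}  # only logical models, not resources or types
    query.update(params)
    return f"{base_url.rstrip('/')}/StructureDefinition?{urlencode(query)}"


# Placeholder endpoint and model name, for illustration only.
url = logical_model_search_url(
    "https://registry.example.org/fhir",
    name="ExampleRegistryCase",
    status="active",
)
```

An HTTP GET on the resulting URL would return a searchset Bundle of matching StructureDefinition resources from a conformant server.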

FHIRPath, interestingly enough, is also designed to work with logical models.  Although it is principally intended for use with FHIR, it is NOT limited to that model.  I'll show why that's important in just a bit.

Each element in a FHIR logical model (StructureDefinition resource) can be mapped to one or more other standard formats. The language in which this mapping is expressed is left unrestricted in the base FHIR Standard, though it should have a mime type (which, for FHIRPath, is defined by FHIR to be text/fhirpath, and is recognized by IANA as such).  CREDS says that you must use FHIRPath in your mappings.  Why?  Because with those mappings, you can now locate the actual data element in the mapped-to standard that reflects the intent of the author of the definition for the registry data model ... in a computable manner.

Furthermore, to successfully use CREDS, the mapping identifiers used in the StructureDefinition resource for the logical model must also be bound to a separate logical model that describes the mapped-to standard.  And that logical model must be defined using StructureDefinition resources.  This is important because it makes it possible to automate queries to extract the necessary data from the larger assets that contain them (CDA Documents, V2 messages, et cetera).
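Here is a minimal sketch of what such a logical model might look like, written as a Python dictionary in the shape of a StructureDefinition.  The registry URL, the element names, the CDA model URI, and the mapping expressions are all invented for illustration, and using the mapping declaration's uri to point at the mapped-to logical model is an assumption of this sketch rather than something stated above.  The computable_mappings helper is likewise hypothetical; it keeps only the mappings that are written in FHIRPath and whose identity is declared at the top of the model.

```python
FHIRPATH_MIME = "text/fhirpath"

# Hypothetical registry logical model (trimmed to one data element).
registry_model = {
    "resourceType": "StructureDefinition",
    "url": "http://example.org/registry/StructureDefinition/ExampleRegistryCase",
    "name": "ExampleRegistryCase",
    "status": "draft",
    "kind": "logical",
    "abstract": False,
    "type": "http://example.org/registry/StructureDefinition/ExampleRegistryCase",
    # Declared mapping identities; uri names the mapped-to model (assumed).
    "mapping": [
        {"identity": "fhir", "uri": "http://hl7.org/fhir", "name": "FHIR"},
        {"identity": "cda", "uri": "http://example.org/models/cda", "name": "CDA"},
    ],
    "differential": {
        "element": [
            {"id": "ExampleRegistryCase", "path": "ExampleRegistryCase"},
            {
                "id": "ExampleRegistryCase.birthDate",
                "path": "ExampleRegistryCase.birthDate",
                "type": [{"code": "date"}],
                "mapping": [
                    {"identity": "fhir", "language": FHIRPATH_MIME,
                     "map": "Patient.birthDate"},
                    {"identity": "cda", "language": FHIRPATH_MIME,
                     "map": "ClinicalDocument.recordTarget.patientRole.patient.birthTime"},
                    # Not FHIRPath and not declared above, so not computable here.
                    {"identity": "v2", "language": "text/plain", "map": "PID-7"},
                ],
            },
        ]
    },
}


def computable_mappings(structure_definition):
    """Return (element path, mapped-to model URI, FHIRPath expression)
    for every mapping that is in FHIRPath and has a declared identity."""
    declared = {
        m["identity"]: m["uri"]
        for m in structure_definition.get("mapping", [])
        if "uri" in m
    }
    found = []
    for element in structure_definition.get("differential", {}).get("element", []):
        for mapping in element.get("mapping", []):
            if mapping.get("language") != FHIRPATH_MIME:
                continue
            if mapping.get("identity") not in declared:
                continue
            found.append(
                (element["path"], declared[mapping["identity"]], mapping["map"])
            )
    return found


mappings = computable_mappings(registry_model)
```

The result is a flat worklist: for each registry element, which standard to look in and which expression to evaluate once an asset in that standard is in hand.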

So, now to the meat of how CREDS actually works:
  1. Download the Registry's logical model from a FHIR Server by querying for the StructureDefinition resource(s) you need.
  2. For an identified patient, and for each system that might have data, collect the relevant data for that patient from the system.
    1. By first mapping the patient identity you know to the patient identity in the system that you are going to query.
    2. And then query for and extract the relevant data assets (FHIR Resources, CDA documents, or V2 messages).
      1. To query for FHIR Resources in this example, it is enough to request Patient/$everything, and let the FHIRPath mapping perform filtering.  This is a naive implementation; you could be smarter, and CREDS will have some guidance both about writing mappings and how to use those mappings to get to the document you need.  A key assumption in CREDS is that the mapping is explicit: it does not assume that anything other than patient identity and date are available as pre-established contextual cues to extract the data.
      2. To query for CDA documents, well, you have a couple of options, but the one I'd start with is already FHIR enabled, and is basically the equivalent MHD query for all patient documents for a given time range for the patient.  And if you happen to need to query an XDS/XCA based registry / repository (as might be the case for QHINs under TEFCA, or most national networks already today), there's a way to bridge from MHD to XDS/XCA queries. I built one personally in 2016, and I worked with a team that shipped another one to production a few years later.  It's baked into the design.  After all, MHD started out with the XDSEntry resource.  Others have done the same, including Grahame Grieve.
      3. V2?  CDA showed the way.  While there is NO "standard" way to do the same for V2 messages, I've also built (and shipped) a FHIR based query modeled after MHD to an HL7 V2 message repository.  This is not rocket surgery, it's more like brain candy.
      4. Got another format (e.g., NCPDP, X12)?  Same thing, different format.
      5. Finally, for each element in the Registry's logical model, apply the appropriate FHIRPath expressions over the extracted resources to collect the information in the logical model.
  3. Having collected the data to populate the registry's logical model, you could just send it in logical model form and let the registry take it, but that's no longer FHIR, it's just FHIR-like, or perhaps even FHIR-lite.  So ...
  4. A smart registry will define their logical model in (as close to as possible) a FHIR format.  That's a small lift, but certainly worthwhile.  Registries which define their submission models in FHIR format don't need the final CREDS step of transforming the data.
  5. We sort of cheat on the very last step for registries which don't do that work, and say, the submission to the registry goes only so far as FHIR as a standard, so if the registry logical model isn't already written in FHIR (and the various regional standards such as the HL7 FHIR US-Core or USCDI (or USCDI+)), then they must also supply a computable transformation using the FHIR StructureMap to convert information structured in their logical model format to a FHIR Bundle.  CREDS won't go into much detail there, and it really need not do so.  There are implementations that can apply a structure map to perform transformations from one logical model to another.
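The collection part of the steps above (identity mapping, asset retrieval, then applying the mapped expressions) can be sketched roughly as follows.  Everything here is a stand-in: the evaluate function is a toy dotted-path walker rather than a FHIRPath engine, each "system" is a plain dictionary of callables instead of a real client, and the element names and sample records are invented.  In a real implementation fetch_assets might wrap a Patient/$everything request or an MHD document query.

```python
def evaluate(asset, expression):
    """Toy stand-in for a FHIRPath engine: dotted navigation only."""
    steps = expression.split(".")
    if steps[0] != asset.get("resourceType"):
        return []  # expression targets a different kind of asset
    nodes = [asset]
    for step in steps[1:]:
        children = []
        for node in nodes:
            value = node.get(step) if isinstance(node, dict) else None
            if value is None:
                continue
            children.extend(value if isinstance(value, list) else [value])
        nodes = children
    return nodes


def collect_registry_data(mappings, systems, known_patient_id, evaluate):
    """mappings: (registry element path, standard key, expression) tuples.
    systems: dicts with 'standard', 'resolve_patient', and 'fetch_assets'."""
    collected = {path: [] for path, _, _ in mappings}
    for system in systems:
        # Map the identity we know to the identity the system knows.
        local_id = system["resolve_patient"](known_patient_id)
        if local_id is None:
            continue  # this system has nothing for the patient
        # Query for and extract the relevant assets (naively: everything).
        assets = system["fetch_assets"](local_id)
        # Apply each mapped expression over the extracted assets.
        for path, standard, expression in mappings:
            if standard != system["standard"]:
                continue
            for asset in assets:
                collected[path].extend(evaluate(asset, expression))
    return collected


# Hypothetical in-memory "EHR" holding FHIR-shaped records.
ehr_records = {
    "ehr-42": [
        {"resourceType": "Patient", "id": "ehr-42", "birthDate": "1970-01-01"},
        {"resourceType": "Condition", "id": "c1",
         "code": {"text": "Example condition"}},
    ]
}
ehr = {
    "standard": "fhir",
    "resolve_patient": {"mrn-1001": "ehr-42"}.get,
    "fetch_assets": lambda local_id: ehr_records.get(local_id, []),
}

mappings = [
    ("ExampleRegistryCase.birthDate", "fhir", "Patient.birthDate"),
    ("ExampleRegistryCase.diagnosis", "fhir", "Condition.code.text"),
]

case = collect_registry_data(mappings, [ehr], "mrn-1001", evaluate)
```

Adding a CDA or V2 source in this sketch would mean adding another entry to systems with its own standard key and supplying an evaluator that understands that format's logical model; the collection loop itself would not change.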
That's enough on CREDS for tonight and it's NOT what my project manager wanted me to do, but it's really about getting my thoughts down in pencil before I whip out my pen.  

Surely there's more to be written; I have a list.  Here are a few of the topics I plan on taking up shortly:
  1. How to go from a mapping to a queryable repository in FHIR.  In this, I'll answer the question about how a registry can specify mappings in a way that makes it possible for different institutions to query different repositories or networks for assets to collect.
  2. Optimizing queries. The principles of writing FHIRPath expressions in a format that enables users to distinguish between the initial query to send to an external party and the filter to apply after asset retrieval to collect the data, and some thoughts on how to merge queries for the same kind of asset (e.g., a FHIR Resource type, a CDA document or a V2 message).  I suspect that the first implementation I work on in any way will be naive (Patient/$everything), but I'm certain I can do better.  FluentQuery would be a big help here, but at the moment, I've got the only implementation I know of from the SANER Project.
  3. FHIRPath for CDA. If I get to this one in 2022, I will declare the year a grand success, because it means I will be writing an implementation or better yet, that someone else beat me to it.
Personally, I think the CREDS project will result in a visionary IG, but its goal is NOT to boil the Atlantic Ocean, or even Lake Ontario.  I'll settle for Lake Kerr* for now, understanding that if I but turn up the FHIR, I could use it to take on anything.

For those that don't know, the main inspiration for how we are doing things in CREDS is based on the work I did over the last two years with the SANER Project.  For that project, we needed to do two things:
  1. Extract data from patient medical records in various places, and
  2. Count the things that we extracted.
CREDS is just refining how to do the first part, and not just for data that exists in FHIR.  Because lord knows, it's not all in FHIR yet, and it will probably remain that way through my lifetime.

     Keith

*  Lake Kerr is within sight of where I wrote Happy Tears, and I walked by it with my mother last week a few days after her successful cancer surgery.

Thursday, May 19, 2022

Happy Tears

This stream blew up on twitter, so I thought I'd share it here as well (with edits since I don't have to worry about length limits).