Friday, March 31, 2017

The Longitudinal Identity of a CCD

This question comes up from time to time. For a given patient, how is there a unique identifier which uniquely identifies the CCD for the patient as it evolves over time.

The answer is no, but to understand that, we need to talk a little bit about identifiers in CDA and how they were intended to be used.

Every CDA released into the wild has ONE and only ONE unique identifier by which it is uniquely known to the world.  That is found in /ClinicalDocument/id.  During the production of a clinical document, there are some workflow cases where the document has to be amended or corrected.  And so there is a need to identify a "sequence" of clinical documents, and possibly even to assign that sequence an identifier.  The CDA standard supports this, and you can find that in /ClinicalDocument/setId.

BUT... that field need not be used at all.  You can also track backwards through time using /ClinicalDocument/relatedDocument/parentDocument/id to see what previous version was revised.  And the standard requires neither of these fields to be used in any workflow.

So ... couldn't I just use setId to track the CCD for each patient?

Yes, but fundamentially, you'd be doing something that fails to take into account one of the properties of a ClinicalDocument, and that is context.  Context is the who/where/what/when metadata associated with the activity that the clinical document is reporting on.  When setId is the same for two clinical documents, the assumption is that the Context associated with the content is the same, OR, is being corrected, not that it is being changed wholesale.

The episode of care being reported in a CCD is part of its context, as is the when the information was reported.  If you want to report on a different episode of care, it's not just a new document, it's also a new context.  And that's why I suggest that setId should be different.

This is mostly a philosphical debate, rather than one related to what the standard says, but when you think about the history of clinical documents, you might agree that it makes sense.

Clinical Documents aren't "living" documents.  A key definition of a CCD document is a summary of relevant and pertinent data "at a point in time."  It's that "point in time" part of the definition that makes the CCD document a static item.

Thursday, March 30, 2017

Diagnostic Imaging Reports in a CCD

I clearly missed something somewhere, probably because I assumed nobody would try to include a document in an document after having hammered people about it for a decade. My first real blog post was on this topic.

Here’s the challenge:  According to the Meaningful Use test procedures for View Download and Transmit: Diagnostic imaging reports are to be included in CCD content.  The test data blithely suggests this content:
·         Lungs are not clear, cannot rule out Anemia. Other tests are required to determine the presence or absence of Anemia.

I can see where this summary of a full report might appropriately appear in “Results” section of a CCD document, but this isn’t an diagnostic imaging result. Here’s a some sample Diagnostic Imaging Reports:,  I’m reminded of Liora’s “This is a document” slides she uses in her Intro to CDA class, and for good reason.

The content might be stored as a text report, a word document, a PDF, or even worse, a scanned image.  It really depends on what the supplier of the report provides.

The NIST guidance is sub-regulatory, but these are the testing guidelines set forth for the certifying bodies.  However, what I also missed is that the regulation also says that CCD is the standard for imaging reports.  It's in that line of text that reads:

(2) When downloaded according to the standard specified in § 170.205(a)(4) following the CCD document template, the ambulatory summary or inpatient summary must include, at a minimum, the following data (which, for the human readable version, should be in their English representation if they associate with a vocabulary/code set):

(i)Ambulatory setting only. All of the data specified in paragraph (e)(1)(i)(A)(1), (2), (4), and (5) of this section.

(ii)Inpatient setting only. All of the data specified in paragraphs (e)(1)(i)(A)(1), and (3) through (5) of this section.

Clear as mud right?  Here's what (e)(1)(i)(A)(5) says:

(5) Diagnostic image report(s).

Oh damn.

But wait? I can create a DIR, change the document type and header details a bit, and then magically it becomes a CCD.  So, can I create a CCD for each Diagnostic image, and in that way have a "summary" representation of the report.

Nope: Back to the test guide:

3. The tester uses the Validation Report produced by the ETT: Message Validators – C-CDA R2.1 Validator in step 2 to verify the validation report indicates passing without error to confirm that the VDT summary record is conformant to the standard adopted in § 170.205(a)(4) using the CCD document format, including: the presentation of the downloaded data is a valid coded document containing:

  •  all of the required CCDS data elements as specified in sections (e)(1)(i)(A)(1); 
  •  for the ambulatory setting, the Provider’s Name and office contact information as specified in section (e)(1)(i)(A)(2); 
  •  for the inpatient setting, admission and discharge dates and locations, discharge instructions and reason(s) for hospitalization) as specified in section (e)(1)(i)(A)(3);
  •  laboratory report(s) as specified in section (e)(1)(i)(A)(4), when available; and 
  •  diagnostic imaging report(s) as specified in section (e)(1)(i)(A)(5), when available.  
Oh well.  Seems like I need to get my hammer out, this time to fit an entire document into a sentence shaped hole.

Friday, March 17, 2017

Patient Access: It's coming at you faster than you might think

This crossed my desk this morning via POLITICO:

GAO: PEOPLE AREN'T LOOKING AT THEIR ONLINE DATA: The Government Accountability Office took aim at the accessible data requirement in meaningful use in a report released Wednesday. The report, requested by the Reboot group (which include Sens. Lamar Alexander and John Thune), argues that while the vast majority of hospitals and eligible professionals make data available for patient and caregiver consumption, the percentage actually following through isn't high — perhaps 15 to 30 percent, depending on the care setting and data analyzed.
Now, why that's the case is the familiar debate — is it a lack of interest from patients, or perhaps technical difficulties? A GAO analysis suggests the former is definitely at play. The office analyzed the top 10 most popular EHRs and found patient participation rates ranging from 10 to 48 percent.

Ultimately, however, GAO hits ONC for its lack of performance measures for its initiatives — whether, for example, using Blue Button increases patient uptake of data. HHS and ONC concurred with the recommendation.

Here's my take on this.

TLDR; It isn't as bad as the GAO makes it out to be.  The report is based on nearly two year old data, and based on it and prior data, we seem to be within reach of a major milestone: 50% of all patients having accessed their data.

Remember early fax machines? e-mail? These are technology diffusion challenges which took a number of years (even decades) to get over.  We've finally reached a stage where nearly 90% of all providers are capable of offering patients their data in hospital or ambulatory settings, and now people are really getting to use this stuff.

What is the GAO report telling us?  First of all, it is telling us about events in 2015, not 2016 as you might expect.  It takes the government a while to get their act together acting on the data that they gather, and the accompanying study also took some time to set up and execute.  This is in fact why many opponents of the MU deadlines said they were coming at us too fast, because we couldn't even analyze data from prior years to figure out how to course correct before it was time to update the regulations.  We are hopefully past that stage now.

Secondly, we need to look at this in a slightly different context.  Remember I said this was a technology diffusion challenge.  If you are a long time reader of this blog, you might recall an article I wrote about the curve you might expect for technology adoption.  It's a logistic growth curve.

The GAO numbers are saying we are around 30% for ambulatory use, and 15% for hospital use of data access by patients in 2015.  Where are we now?  It's hard to project forward from one data point, because fitting the logistic curve requires estimating two parameters, a time scale, and the inflection point.  The inflection point is at 50%, and is where the rate of adoption reaches its maximum value.

To make something useful out of this data, you have to go back to some similar ONC reports on patient utilization.  You can find the data in ONC Data Brief 30, which includes information from ONC Data Brief 20.  The challenge here is that the GAO report doesn't quite report the same thing, so you have to adjust a bit.  I know from a colleague of mine from early IHE years that some patients get most of their healthcare in the hospital setting (i.e., the ER), while others get their care principally from Ambulatory providers, and others have used both.  That means that some patients have been given access through a hospital, and others through ambulatory providers, and the number of patients who have been given access to their health data is overall, probably greater than the sum of the parts, but we know these are largely overlapping sets.  So, if I simply take the ambulatory data from the GAO report, and compare the number offered access and the number who used it, to similar figures from the previous ONC briefs, I can start to get somewhere.  Here's the basic data.

Year Offered Accessed Total
2013 28% 46% 12.9%
2014 38% 55% 20.9%
2015 87% 30% 26.1%

The number offered access is different in each year, so I have to normalize the results, which I do by multiplying the % of patients offered access to the % offered access who did access data, to get the total % of patients accessing records. That's the number we really care about anyway.

Now I have three points, which is JUST barely enough to estimate the parameters of the logistic growth curve.  How to fit the data?  Well, this paper was good enough to tell me.  Basically, you compute an error function, which in the paper was least squares (a common enough curve fitting function), over the range of possible parameter values.  So I did, and generated a couple of surfaces which show me where I might find the parameters that give the best fit.  Can I tell you how good the fit is?  Not really.  I'm working with three data points, and estimating two parameters.  There's only one degree of freedom remaining.  This is about as back of the napkin hack as it gets. 

Let's look at some pictures to see what we can see here:
First up, we need to look at the error surface to find the minimum.

Error Surface

We can pretty well see it's somewhere in the lower blue region, but we have to get to a much finer level of detail to find it perfectly.  Sometimes tables are better than pictures.  The minimum is somewhere in the middle of this table.  If I expanded data points in the table even further, you'd see they are getting larger all around the area we are focused on.

0.36 0.38 0.40 0.42 0.44 0.46
Jan-2017 1.03% 0.70% 0.44% 0.26% 0.14% 0.07%
Apr-2017 0.54% 0.30% 0.15% 0.06% 0.04% 0.07%
Jul-2017 0.23% 0.09% 0.04% 0.05% 0.12% 0.25%
Oct-2017 0.07% 0.04% 0.08% 0.19% 0.36% 0.58%
Jan-2018 0.05% 0.12% 0.26% 0.47% 0.72% 1.02%

The minimum error for this fit occurs somewhere in 2017, which is also the 50% saturation point.

There's still one more picture to show.  This projects us forward along the boundaries of the fitted range.  As you can clearly see, the projections show we are nearly at the 50% point.  That's a major milestone, and something to crow about.  It also tells me that unless there's another bump to push things ahead faster, we won't get to 90% access until sometime between 2021 and 2024, five to seven years from now.  We have just such a bump (API support) coming ahead in 2018.

It isn't all as dark and gloomy as the GAO report suggests, but it might have been if that report was telling us where we were now, instead of where we were two years ago.

This is rough and ready calculation.  I'm using data that was gathered through different means, and which may not be comparable.  I don't have enough points to make any statements about the reliability of the projects.

It's still good enough for me.  It shows that things aren't as bad as the GAO report might have suggested.  ONC and HHS really need to PLAN ahead for this kind of reporting, so that we can create the dashboards needed to produce this sort of information as it becomes available, instead of 2 years after the fact.

Data reporting for 2016 is just around the corner.  Two numbers: The % of patients offered access, and the % of patients who use it if it is offered will be enough to tell me how good my projection is for now. If those two numbers multiplied together come anywhere between 30 and 40%, I'll feel very comfortable with this projection.

Thursday, March 16, 2017

I got my Data

A while back several of us HIT Geeks and e-Patients were having a discussion about HIPAA, patient data access challenges, et cetera.  Prior to that I had written a post connecting the various dots between HIPAA, the Omnibus rule, MIPS and MACRA, and the Certiufication rule.

In that conversation I accepted an implicit challenge to get my health data via unencrypted e-mail.  I wrote to someone at my healthcare provider organization in early January and then got caught up in various meetings and never followed up.  Two days later I had gotten some resistance.  My healthcare provider has a portal, which I use and can quite easily get my data already, and in fact often do, which was probably another reason for resistance.

When I finally responded in early February with my acknowledgement of the risks and the fact that I understood them, I got my data the very next day.  I'd made my point.  Two emails and I had it.  Any delays in getting it were my own fault for not following up.

Caveats: I have a good relationship with my provider organization, and also know important thought leaders in that organization and they know me.  I was able to make points others might not be able to. But, when breaking trail, it's usually a good idea to put the person most experienced at it out in front, yes.  And that's where I was and what I did.

  -- Keith

Wednesday, March 15, 2017

Principle Driven Design

When you need to get something done quickly, and it's likely to involve a lot of discussion, one of the tactics I sometimes take is to get everyone to agree upon the principles which are important to us, and then to agree that we will apply those principles in our work.

It's quite effective, and can also be helpful when you have a solo project that is confused by lots of little different relationships between things.  If you work to establish what the relationships are in reproducable ways, and connect them, what you wind up with is a design that goes from a set of principles ... or even, simple mathematical relationships.  And the output is a function of the application of those principles.

It works quite well, and when things pop out that are odd, or don't work out, I find they are usually a result of some principle being applied inappropriately, or that your data is telling you about some outlier you haven't considered.  When HL7 completed the C-CDA 2.1 implementation guide in 6 weeks (a new record I think for updating a standard), we applied this tactic.

Having spent quite a few weeks dealing with the implementation details, I can tell you that it seems to have worked quite well.  And my most recent solo foray into obscure implementation details was also completed quite quickly.


Friday, March 10, 2017

Art becomes Engineering when you have a Reproducable Process

The complaint that software engineering isn't really engineering because each thing you make is often its own piece of art is often true.  But the real art of software engineering, isn't making one-offs.

Rather, it is figuring out how to take simple inputs into a decision making process that generates high quality code in a reproducable way.

I often build code that is more complicate then if I just wrote it because I take the time to figure out what data is actually driving the decision making process that the code has to follow.  Then I put that data into a table and let the code do its work on it.

When the process has been effectively tested, adding more data adds more functionality at a greatly reduced code.

This same method is, by the way, how FHIR is built.  FHIR is truly engineered.  The art was in developing the process.


Wednesday, March 8, 2017

Monitor your "improvements"

Sometime last year, to better manage my blog I thought I would try out Google+ Comments on it.

It turned out to be a disaster on three fronts:
1.  I could no longer delete inappropriate comments.
2.  Comments must have a Google+ account, a restriction I find inappropriate on this blog.
3.  I no longer received e-mails about comments on the blog, which has now put me six months behind answering questions I didn't even know where being posted.

All of that because I failed to monitor the impact of what my "intervention" did.  Don't I know better? Yes, I do.

Year before last I recall an presentation by AMIA by Adam Wright, PhD and fellow alum of OHSU on how changes to clinical decision support system resulted in a failure for certain notifications, and thus be acted upon.  While I cannot find the paper, a related poster is here. One of my favorite classes at OHSU was on how to measure the impact of an intervention quantitatively.

I should have been able to detect the low volume of questions, but didn't.  Fortunately in my case, I just failed to get feedback and had reduced capacity to use my blog.  That situation is now corrected.