Monday, May 24, 2021

A Source Code Profiler for FHIRPath

I wrote my first source code profiler in 1992, mostly as an exercise to play with, but also because back then I didn't have the money for a professional one; I wrote about it in Dr. Dobb's Journal.  That was a sampling profiler, which used the timer interrupt on a PC to determine where in the code the application was running.  Since then, I've had to write my own profiler several times professionally, for different reasons and purposes.  Once, I wrote some code to instrument and profile BASIC code that was running a major hardware vendor's web store (I'd been flown onsite to address a critical sporadic outage).  While the profiler didn't show what was causing the outage, it did demonstrate that the application our company had developed wasn't the source of the problem, and I later DID determine what the cause was and we fixed it.  I was given my first handheld computing device as a reward for that save.  I've since been called on to develop profiling tools for other custom languages and application implementations as well.

The focus of this week's FHIR Connectathon for SANER was performance and scaling.  We finished up at noon yesterday, but I was already wired to the rim with caffeine, so I looked for something else interesting to do.  As a result, I instrumented my version of the FHIRPath engine (it's basically a copy of the HAPI engine with some tweaks for use as a SANER Measure computer) to profile the Measure code.

There are three places in SANER where a FHIRPath is evaluated to compute a measure:

  1. Measure.group.population which produces the counts for the population.
  2. Measure.group.stratifier which produces the strata for each element in the population.
  3. Measure.supplementalData which specifies how to collect supplemental data for each element in a population.
Each of these may have an associated FHIRPath expression, and of course, I want to know how the expression performs.

The FHIRPath engine produces a parse tree of ExpressionNode items.  The tree is walked by the FHIRPathEngine.evaluate functions.  What I did was simply add some code to trace the execution times in those functions in my FHIRPath engine code.
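
For orientation, here's roughly how an expression flows through the engine; this is a sketch, and the worker context, resource, and criterion are illustrative placeholders rather than anything from an actual Measure:

    // Sketch: parse an expression into an ExpressionNode tree, then evaluate it
    // against a resource.  workerContext and encounter are assumed to exist.
    FHIRPathEngine engine = new FHIRPathEngine(workerContext);
    ExpressionNode tree = engine.parse("Encounter.where(status = 'in-progress')");
    List<Base> results = engine.evaluate(encounter, tree);  // walks the parse tree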

The core class and methods to support tracing in my engine are below.  Note: this is hacked proof-of-concept code I wrote in about an hour.  Documentation and everything else is coming.

   class Trace {
        String expr;                  // text of the top-level expression being profiled
        long start, end, children;    // start/end timestamps and accumulated child time
        ExpressionNode node;          // parse tree node being evaluated
        int parent;                   // index of the caller's record in traces, or -1 for a root
        int index;                    // this record's index in traces

        private Trace(String expr, int index) {
            this.expr = expr;
            this.index = index;
            this.start = System.nanoTime();
            this.parent = -1;
        }

        private Trace(int parent, int index, ExpressionNode node) {
            this.parent = parent;
            this.expr = (parent == -1 ? null : traces.get(parent).expr);
            this.index = index;
            this.node = node;
            this.start = System.nanoTime();
        }

        public void done() {
            this.end = System.nanoTime();
            long childTime = end - start;
            top = this.parent;  // pop back to the caller's record
            if (top != -1) {
                // Attribute this call's elapsed time to the parent's child-time accumulator.
                traces.get(top).children += childTime;
            }
        }

        public void done(ExpressionNode node) {
            this.node = node;
            done();
        }

        public String toString() {
            StringBuilder b = new StringBuilder();
            if (parent == -1) {
                // Root records emit an extra row carrying the full expression text.
                b.append(",,,,,,\"").append(expr == null ? " " : expr.replace("\"", "\"\"")).append("\"\n");
            }
            b.append("@").append(Integer.toHexString(node == null ? 0 : node.hashCode())).append(",")
                .append(index).append(",").append(parent).append(",").append(start).append(",")
                .append(end).append(",").append(end - start - children)  // self time, excluding children
                .append(",\"").append(node == null ? "" : node.toString().replace("\"", "\"\""))
                .append("\"");
            return b.toString();
        }

        public void setExpression(String expression) {
            this.expr = expression;
        }
    }

    // Supporting engine fields assumed by the code above (reconstructed here for completeness):
    private static final boolean TRACE = true;             // master switch for profiling
    private final List<Trace> traces = new ArrayList<>();  // the trace log; also acts as the call stack
    private int top = -1;                                  // index of the currently executing record
    private final Trace dummy = new Trace(null, -1);       // no-op record handed out when TRACE is off

    public void dump(File f, boolean append) throws IOException {
        try (FileWriter w = new FileWriter(f, StandardCharsets.UTF_8, append);
             PrintWriter pw = new PrintWriter(w)) {
            pw.println("Node,Index,Parent,Start,End,Time,Expression");
            for (Trace t : traces) {
                pw.println(t);
            }
        }
        traces.clear();
    }

    private Trace trace(String path) {
        if (!TRACE) {
            return dummy;
        }
        Trace trace = new Trace(path, traces.size());
        traces.add(trace);
        top = traces.size() - 1;
        return trace;
    }

    private Trace trace(ExpressionNode node) {
        if (!TRACE) {
            return dummy;
        }
        Trace trace = new Trace(top, traces.size(), node);
        traces.add(trace);
        top = traces.size() - 1;
        return trace;
    }

Sprinkled in key places in the functions that actually execute expressions are the profiling statements (marked with comments) in the code below:

    public List<Base> evaluate(Base base, ExpressionNode expressionNode) throws FHIRException {
        Trace trace = trace(expressionNode);   // <-- profiling: start a trace record
        try {
            List<Base> list = new ArrayList<Base>();
            if (base != null)
                list.add(base);
            log = new StringBuilder();
            return execute(new ExecutionContext(null, base != null && base.isResource() ? base : null,
                base != null && base.isResource() ? base : null, base, null, base), 
                list, expressionNode, true);
        } finally {
            trace.done();                      // <-- profiling: stamp the end time, even on exceptions
        }
    }

What goes on is that the call to trace(expressionNode) adds a new timestamped execution record to a stack of records; the new record also points to the record of the calling method (usually the record at the top of the stack).  Then, when the function finishes, trace.done() adds the ending timestamp, accumulates this call's elapsed time into the parent record's child-time accumulator, and returns the stack pointer to the parent's record.
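
To make the accounting concrete, here's a tiny worked example with made-up timestamps:

    // Hypothetical timeline for one evaluation:
    //   Trace A starts at t=0                 (parent = -1)
    //     Trace B starts at t=10, ends t=40   -> elapsed 30 added to A.children
    //     Trace C starts at t=50, ends t=70   -> elapsed 20 added to A.children
    //   Trace A ends at t=100                 -> A's self time = 100 - 0 - (30 + 20) = 50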

When a user of the engine calls the dump() method with a file, the execution data is dumped to that file in CSV format.  Records look something like this:

Node       Index  Parent  Start            End              Time     Expression
@63efb167      1      -1  611441841022000  611451511487500  4711600  findAll('Encounter', ...
@791ea926      4       3  611441845767800  611441845782600    13400  'Encounter'
@791ea926      5       4  611441845771000  611441845772400     1400  'Encounter'
@6344cac3      6       3  611441845783900  611441847333500    12700  including('subject', 'diagnosis', ...
@6344cac3      7       6  611441845785400  611441847322300  1533300  including('subject', 'diagnosis', ...
@27688320      8       7  611441845786100  611441845787600     1200  'subject'
@27688320      9       8  611441845786600  611441845786900      300  'subject'
@39126a23     10       7  611441845788000  611441845789200     1000  'diagnosis'
@39126a23     11      10  611441845788600  611441845788800      200  'diagnosis'
@35c9a3be     12       7  611441845789500  611441845790400      700  'reasonReference'
@35c9a3be     13      12  611441845789900  611441845790100      200  'reasonReference'


The Node column identifies the unique sub-expression (parse tree node) being evaluated.  Index and Parent describe the evaluation tree (which may evaluate the same node multiple times with different inputs).  Start and End are timestamps in nanoseconds from an arbitrary base.  Time is the node's self time, that is, elapsed time minus the time attributed to child evaluations (again in nanoseconds).  Expression gives the text of the evaluated sub-expression.

You can take this data into a spreadsheet, create a pivot table report, and come up with useful reports on those parts of the expression taking the longest, as shown below.

Expression                           Time (ms)  Executions
onServers(%Base).exists()            9145.7723         282
onServers(%Base).where(...             405.8266           2
$total | $this                          47.0435          94
%ReportingPeriod.start - 1 'year'       10.4663         423


I normalized times to milliseconds so that the result would be more meaningful.  This shows that the retrieval of primary data takes about 9.1 seconds, and a secondary retrieval about 0.4 seconds, while everything else is 50ms or less.
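
If you'd rather skip the spreadsheet, a few lines of Java can produce the same rollup directly from the dump file.  This is a quick sketch assuming the seven-column CSV layout shown above; the class name and output format are mine:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class ProfileSummary {
        public static void main(String[] args) throws IOException {
            Map<String, double[]> totals = new LinkedHashMap<>();
            for (String line : Files.readAllLines(Paths.get(args[0]))) {
                // Limit the split so commas inside the quoted expression text survive.
                String[] f = line.split(",", 7);
                if (f.length < 7 || f[0].isEmpty() || f[0].equals("Node")) {
                    continue; // skip the header row and the root expression rows
                }
                double[] t = totals.computeIfAbsent(f[6], k -> new double[2]);
                t[0] += Long.parseLong(f[5]) / 1_000_000.0; // self time, ns -> ms
                t[1]++;                                     // execution count
            }
            totals.entrySet().stream()
                .sorted((a, b) -> Double.compare(b.getValue()[0], a.getValue()[0]))
                .forEach(e -> System.out.printf("%-40s %12.4f %8.0f%n",
                    e.getKey(), e.getValue()[0], e.getValue()[1]));
        }
    }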

The profile also demonstrates some opportunities for optimization in the FHIRPath engine, for example the 10ms taken on the multiple evaluations of the common sub-expression:
  %ReportingPeriod.start - 1 'year'

This is likely more costly than it needs to be.  Since the FHIRPath engine today does no common subexpression elimination, the profile shows the cost associated with re-evaluating these subexpressions throughout the parse tree.  That's probably an opportunity for a whole other post.
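
Just to gesture at what that post might cover: since values like %ReportingPeriod are fixed for the life of an evaluation, one cheap mitigation would be to memoize constant subexpressions by node identity.  A sketch only; executeWithMemo and isConstantExpression are hypothetical, not part of the engine:

    // Hypothetical memo, keyed on parse tree node identity; valid only while the
    // environment variables feeding the expression are unchanged.
    private final Map<ExpressionNode, List<Base>> constantCache = new IdentityHashMap<>();

    private List<Base> executeWithMemo(ExecutionContext ctx, List<Base> focus,
            ExpressionNode node, boolean atEntry) {
        if (isConstantExpression(node)) { // hypothetical: literals, %vars, and pure operators only
            return constantCache.computeIfAbsent(node, n -> execute(ctx, focus, n, atEntry));
        }
        return execute(ctx, focus, node, atEntry);
    }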

    Keith


Tuesday, May 18, 2021

SANER Scales

This week's HL7 Connectathon 27 SANER Track is all about scalability.  For this Connectathon, we brought over 2 million clinical resources worth of test data for 2600+ patients, transported from Synthea's 100K-patient COVID-19 test data set from Massachusetts into Cook County, Illinois, transposed in time to January of 2021 instead of March of 2020, across 15 hospital locations.  Because these patients were transported from the past, they aren't vaccinated, don't have access to Remdesivir, and probably too many are taking HCQ, but it was a good enough set to test the scalability of SANER.

My chief concern was computing MeasureReport resources on a daily basis, overnight, and whether that was going to be a huge data load for hospitals.  As it turns out, I'm able to put those concerns to rest (for me, at least).

We computed 465 MeasureReport resources, one for each of 15 hospitals over the 31 days of January, using realistic hospital loads drawn from current statistics reported by the Illinois Department of Health.

Each measure report communicated around 240 supplemental data elements (FHIR Resources) on average, providing additional data to support stratification and analytics; that's about 40 times what would actually be needed if just communicating metrics.

All told, this represented about 465MB of uncompressed, pretty-printed FHIR Resources in XML format, or about 23MB of data compressed using GZIP.

Best yet, I was able to collect data from the cloud, compute measures, store them locally, and transmit all the data for all days for all hospitals to a server in the cloud in about 11 minutes on a pretty high-end Dell laptop (6 cores, 3.6GHz burst, 32GB of RAM).

I've still got some bugs to look into which might slow things down once fixed (mostly on stratification), but with 12 virtual processors running, this load barely touched my machine.  Overall, CPU utilization was at a pretty steady 20%, and network bandwidth was also nowhere near saturated.  My home office connection gets about 150-200Mbps down and 20Mbps up; I barely touched it.

I can process the data for a single hospital day in 10-20 seconds, depending on the number of patients.  It's realistic to assume that more frequent, semi-real-time* situational awareness measure evaluation and reporting is not only feasible, but also practical.

Most of the measures we have examined are written in a form that supports daily computation.  We'll probably have to experiment with measures designed for more frequent evaluation.

   Keith

* We keep hemming and hawing about near-real-time measures, and I've finally decided to call them semi-real-time, to clarify that they could be several minutes out of date, but still orders of magnitude better than daily.  With enough concentration, semi-real-time could in fact become near-real-time (so long as the data sources themselves are frequently updated).


After doing some more tweaking, I'm actually:
  1. Overwhelming my server so hard it requires a restart to come back to life.  I really need to get my server set up in a production-ready way.
  2. Running a bit slower but getting more data (so now it's taking about 28 seconds per hospital on average).

Tuesday, May 11, 2021

Tracking Supplies via SANER

One of the outstanding challenges yet to be demonstrated at a Connectathon is the ability of SANER to report on supplies.

Some common examples include N95 respirators, surgical masks, ventilator supplies (e.g., tubing and connectors), gloves, gowns, and cleaning supplies.

How would you identify these materials in ways that would enable an inventory system to track them?

Honestly, this is an area that I've spent very little time with, but it is one focus area I plan to study for the upcoming May HL7 Connectathon.

What I already know:

  1. Consumer products, including many medical supplies, carry UPC numbers on their packaging.
  2. Medical devices and supplies are categorized by GMDN (Global Medical Device Nomenclature) codes.
  3. The FDA's GUDID (Global Unique Device Identification Database) ties unique device identifiers to descriptive device data, including GMDN terms.

All of this knowledge is captured, registered, and publicly available.  Some possibly for purchase, some freely accessible, some even with an API.

So, if you have UPC numbers, GMDN codes, or GUDID data, you might actually be able to create a value set that you can map to from the item codes used to describe supplies in hospital inventory control systems.
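
For instance, with HAPI FHIR's R4 model classes, a value set for one category of supplies might be sketched out like this; the URL, code system URI, and code below are all placeholders, not real GMDN content:

    import org.hl7.fhir.r4.model.ValueSet;

    // Sketch: an extensional value set for N95 respirators (placeholder codes).
    ValueSet vs = new ValueSet();
    vs.setUrl("http://example.org/fhir/ValueSet/n95-respirators"); // hypothetical canonical URL
    ValueSet.ConceptSetComponent include = vs.getCompose().addInclude();
    include.setSystem("urn:example:gmdn");  // placeholder for the real GMDN system URI
    include.addConcept()
        .setCode("00000")                   // placeholder code
        .setDisplay("N95 filtering facepiece respirator");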