I wrote my first source code profiler in 1992, mostly as an exercise to play with but also because back then I didn't have the money for a professional one, and
wrote about it in Dr. Dobb's Journal. That was a sampling profiler, and used the timer interrupt on a PC to determine where in the code the application was running. Since then, I've had to write my own profiler several times professionally for different reasons and purposes. Once I wrote some code to instrument some BASIC code and profile it. That code was running a web site for a major hardware vendor's web store (having been flown onsite to address a critical sporadic outage). While the profiler didn't show what was causing the outage, it did demonstrate that the application our company had developed wasn't the source of the problem, and I later DID determine what the cause was and we fixed it. I was given my first handheld computing device as a reward for that save. And I've been called to develop profiling tools for other custom languages and application implementations as well.
The focus of this week's FHIR Connectathon for SANER was performance and scaling. We finished up at noon yesterday, but I was already wire to the rim with caffeine, so I looked for something else interesting to do. As a result, I instrumented my version of the FHIR Path engine (it's basically a copy of the HAPI engine with some tweaks for use as a SANER Measure Computer) to profile the Measure code.
There are three places in SANER where a FHIRPath is evaluated to compute a measure:
- Measure.group.population which produces the counts for the population.
- Measure.group.stratifer which produces the strata for each element in the population.
- Measure.supplementalData which specifies how to collect supplemental data for each element in a population.
Each of these may have an associated FHIRPath expression, and of course, I want to know how the expression performs.
The FHIRPath engine produces a parse tree of
ExpressionNode items. The tree is walked by
FHIRPathEngine.evaluate functions. What I did was simply added some code to trace the execution times in those functions in my FHIRPath engine code.
The core class and methods to support tracing in my engine is below. Note, this is hacked Proof-of-Concept code I wrote in about an hour. Documentation and everything else is coming.
class Trace {
String expr;
long start, end, children;
ExpressionNode node;
int parent;
int index;
private Trace(String expr, int index) {
this.expr = expr;
this.index = index;
this.start = System.nanoTime();
this.parent = -1;
Object o = new Object();
o.toString();
}
private Trace(int parent, int index, ExpressionNode node) {
this.parent = parent;
this.expr = (parent == -1 ? null : traces.get(parent).expr);
this.index = index;
this.node = node;
this.start = System.nanoTime();
}
public void done() {
this.end = System.nanoTime();
long childTime = end - start;
top = this.parent;
if (top != -1) {
traces.get(top).children += childTime;
}
}
public void done(ExpressionNode node) {
this.node = node;
done();
}
public String toString() {
StringBuilder b = new StringBuilder();
if (parent == -1) {
b.append(",,,,,,,\"").append(expr == null ? " " : expr.toString().replace("\"", "\"\"")).append("\"\n");
}
b.append("@").append(Integer.toHexString(expr.hashCode())).append(",").append(index).append(",")
.append(parent).append(",").append(start).append(",").append(end).append(",")
.append(end - start - children).append(",\"").append(node.toString().replace("\"", "\"\""))
.append("\"");
return b.toString();
}
public void setExpression(String expression) {
this.expr = expression;
}
}
public void dump(File f, boolean append) throws IOException {
try (FileWriter w = new FileWriter(f, StandardCharsets.UTF_8, append); PrintWriter pw = new PrintWriter(w);) {
pw.println("Node,Expr,Index,Parent,Start,End,Time,Expression");
for (Trace t : traces) {
pw.println(t);
}
}
traces.clear();
}
private Trace trace(String path) {
if (!TRACE) {
return dummy;
}
Trace trace = new Trace(path, traces.size());
traces.add(trace);
top = traces.size() - 1;
return trace;
}
private Trace trace(ExpressionNode node) {
if (!TRACE) {
return dummy;
}
Trace trace = new Trace(top, traces.size(), node);
traces.add(trace);
top = traces.size() - 1;
return trace;
}
Sprinkled in key places in functions that actually execute the profiling code are the statements (in a slightly larger font) in the code below:
public List<Base> evaluate(Base base, ExpressionNode expressionNode) throws FHIRException {
Trace trace = trace(expressionNode);
try {
List<Base> list = new ArrayList<Base>();
if (base != null)
list.add(base);
log = new StringBuilder();
return execute(new ExecutionContext(null, base != null && base.isResource() ? base : null,
base != null && base.isResource() ? base : null, base, null, base),
list, expressionNode, true);
} finally {
trace.done();
}
}
What goes on is that the call to trace(expressionNode) creates a new timestamped execution record to a stack of records which also points to the record of the calling method [often at the top of the stack]. Then, when the function is finished, trace.done() simply adds the ending time stamp, accumulates the time of this call to an accumulator for child times in the parent record, and returns the stack pointer to the parent's record.
When a user of the engine calls the dump() method with a file, the execution data is dumped to that file in CSV format. Records look something like this:
Expr Index Parent Start End Time Expression
@63efb167 1 -1 611441841022000 611451511487500 4711600 findAll('Encounter', ...
@791ea926 4 3 611441845767800 611441845782600 13400 'Encounter'
@791ea926 5 4 611441845771000 611441845772400 1400 'Encounter'
@6344cac3 6 3 611441845783900 611441847333500 12700 including('subject', 'diagnosis', ...
@6344cac3 7 6 611441845785400 611441847322300 1533300 including('subject', 'diagnosis', ...
@27688320 8 7 611441845786100 611441845787600 1200 'subject'
@27688320 9 8 611441845786600 611441845786900 300 'subject'
@39126a23 10 7 611441845788000 611441845789200 1000 'diagnosis'
@39126a23 11 10 611441845788600 611441845788800 200 'diagnosis'
@35c9a3be 12 7 611441845789500 611441845790400 700 'reasonReference'
@35c9a3be 13 12 611441845789900 611441845790100 200 'reasonReference'
The expression identifies the unique expression from the parse tree being evaluated. The index and parent describe the evaluation tree (which may evaluate the same expression multiple times with different inputs). The start and end time reflect the timestamps (in nanoseconds from an arbitrary base). Time represents the elapsed time (again in nanoseconds). The expression gives the full text of the evaluated expression.
You can take this data into a spreadsheet, create a pivot table report, and come up with useful reports on those parts of the expression taking the longest, as shown below.
Expression Time (ms) Executions
onServers(%Base).exists() 9145.7723 282
onServers(%Base).where(... 405.8266 2
$total | $this 47.0435 94
%ReportingPeriod.start - 1 'year' 10.4663 423
I normalized times to milliseconds so that the result would be more meaningful. This shows that the retrieval of primary data takes about 9.1 seconds, and a secondary retrieval about 0.4 seeconds, while everything else is 50ms or less.
It also demonstrates some opportunities for optimization in the FHIRPath engine. For example the 10ms taken on the the multiple evaluations of the common sub-expression:
%ReportingPeriod.start - 1 'year'
This is likely more costly than it needs to be. Since the FHIRPath engine today does nothing with common subexpression elimination, the profile shows the cost associated with re-evaluating these in the parse. That's probably an opportunity for a whole other post.
Keith