Saturday, April 20, 2019

Why Software Engineering is still an art

Software engineering isn't yet a science.  In science, you have a bunch of experimental procedures that one can describe, and processes that one can follow, and hopefully two people can reproduce the same result (unless of course we are talking about medical research experiments ;-( ).

Today, I wanted to add some processes to my build.  I'm using Maven (3.6.0), with Open JDK 11.0.2.  I wanted to run some tests over my code to evaluate quality.  Three hours later, and I'm still dealing with all the weirdness.


  1. rest-assured (a testing framework) uses an older version of JAXB because it doesn't want to force people to move to JDK 8 or later.
  2.  JAXB 2.22 isn't compatible with some of the tools I'm using (AOP and related) in Spring-Boot and elsewhere.
  3. I have an extra spring-boot starter dependency I can get rid of because I don't need it, and won't ever use it.  It got there because I was following someone else's template (it's gone now).
  4. FindBugs was replaced with SpotBugs (gotta check the dates on my references), so I wasted an hour on a tool that's no longer supported.
  5. To generate my code quality reports, I have to go clean up some javadoc in code I'm still refactoring.  I could probably just figure out how to run the quality reports in standalone, but I actually want the whole reporting pipeline to work in CI/CD (which BTW, is Linux based, even though I develop on Windoze).
  6. The maven javadoc plugin with JDK 11 doesn't work on some versions, but if I upgrade to the latest, maybe it will work, because a bug fix was backported to JDK 11.0.3
  7. And even then, the modules change still needs a couple of workarounds.
In the summers during college, I worked in construction with my father.  Imagine, if in building the forms for the fountain in the center of the lobby (pictured to the right), I could only get rebar from one particular suppler that would work with the holes in the forms.  And to drill the holes, I had to go to the hardware store to purchase a special brand of drill.  Which I would then buy an adapter for, and take part of it apart in a way that was documented by one guy somebody on the job-site knew, so that I could install the adapter to use the special drill bit.  And then we had to order our concrete in a special mix from someone who had lime that was recently mined from one particular site, because the previous batch had some weird contaminates that would only affect our job site.

Yeah, that's not what I had to do, and it came out great.

Yet, that's basically exactly what I feel like I'm doing some days when I'm NOT writing code.  We've got tools to run tools to build tools to build components to build systems that can be combined in ways that can do astonishing stuff.  But, building it isn't yet a science.

Why is this so hard?  Why can't we apply the same techniques that were used in manufacturing (Toyota was cited)?  As a friend of mine once said.  In software, there's simply more moving parts (more than a billion).  That's about a handful of magnitudes more.


   Keith

Tuesday, April 16, 2019

Juggling FHIR Versions in HAPI

It happens every time.  You target one version of FHIR, and it turns out that someone needs to work with a newer or older (but definately different) version.  It's only about 35 changes that you have to make, but through thousands of lines of code.  What if you could automate this?

Well, I've actually done something like that using some Java static analysis tools, but I have a quicker way to handle that for now.

Here's what I did instead:

I'm using the Spring Boot launcher with some customizations.  I added three filter beans to my launcher.  Let's just assume that my server handles the path /fhir/* (it's actually configurable).

  1. A filter registration bean which registers a filter for /fhir/dstu2/* and effectively forwards content from it converted from DSTU2 (HL7) to the servers version, and converts the servers response back to DSTU2.
  2. Another filter registration bean which registers a filter for /fhir/stu3/* and effectively forwards content from it converted from STU3 to the servers version, and converts the servers response back to STU3.
  3. Another filter registration bean which registers a filter for /fhir/r4/* and effectively forwards content from it converted from R4 to the servers version, and converts the servers response back to R4.
These are J2EE Servlet Filters rather than HAPI FHIR Interceptors, b/c they really need to be right now. HAPI servers aren't really all that happy about being multi-version compliant, although I'd kinda prefer it if I could get HAPI to let me intercept a bit better so that I could convert them in Java rather than pay the serialization costs in and out.

In addition to converting content, the filters also handle certain HttpServlet APIs a little bit differently.  There are two key places where you need to adjust:

  1. When Content-Type is read from the request or set on the response, you have to translate fhir+xml or fhir+json to xml+fhir or json+fhir and vice versa for certain version pairs.  DSTU2 used the "broken" xml+fhir, json+fhir mime types, and this was fixed in STU3 and later.
  2. You need to turn off gzip compression performed by HAPI, unless you are happy writing a GZip decoder for the output stream (it's simple enough, but more work than you want to take on at first).
Your input stream converter should probably be smart and not try to read on HEAD, GET, OPTIONS or DELETE methods (because they have no body), and there won't be anything to translate.  However, for PUT, POST, and PATCH, it should.

Binary could a be a bit weird, I don't have anything that handles creates on Binary resources, and they WOULD almost certainly require special handling, I simply don't know if HAPI has that special handling built in.  It certainly does for output, which has made my life a lot easier for some custom APIs (I simply return a parameter of type Binary, with mimetype of application/json to get an arbitrary non-FHIR formatted API output), but as I said, I've not looked into the input side.

This is going to make my HL7 V2 Converter FHIR Connectathon testing a lot easier in a couple of weeks, because O&O (and I) are eventually targeting R4, but when I first started on this project, R4 wasn't yet available, so I started in DSTU2, and like I said, it might be 35 changes, but against thousands of lines of code?  I'm not ready for that all-nighter at the moment.

It's cheap but not free.  These filters cost in serialization time in and out (adding about 300ms of time just for the conformance resource), but it is surely a lot quicker way towards handling a new (or old) version for FHIR for which there are already HAPI FHIR Converters, and it at least gets you to a point where you can do integration tests with code that need it while you make the conversion.  This took about a day and a half to code up and test.  I'd probably still be at a DSTU2 to R4 conversion for the rest of the week on the 5K lines or so that I need to handle V2 to FHIR conversion.

   Keith



Friday, April 12, 2019

Multiplatform Builds

I'm writing this down so I won't ever again forget, in using a Dev/Build environment pair that is Windows/Unix, I remember to plan for the following:

  1. Unix likes \n, Windows \r\n for line endings.  Any file comparisons should ignore differences in line endings.
  2. Unix cares about case in filenames, Windows not so much.  Use lowercase filenames for everything if you can.
  3. Also, if you are generating timestamps and not using UTC when you output them, be sure that your development code runs tests in the same time zone as your build machine.
I'm sure there's more, but these are the key ones to remember.

   Keith

P.S.  I think this is a start of a new series, on Duh moments.


Wednesday, April 10, 2019

V2-to-FHIR on GitHub

The tooling subgroup of the V2 to FHIR project met a minor milestone today, creating the code repository for V2 to FHIR tools in HL7's Github.  If you want to become a contributor to this team, let me know, either here or via e-mail (see the send-me an email link to the right).

Our first call for contributions are sample messages including ADT, MDM, ORU, SIU and VXU messages from any HL7 V2 version in either ER7 (pipes and hats) or XML formats.  We'll be using these samples to test various tools to run the V2 to FHIR conversion processes in different tools in the May V2-to-FHIR tooling track.  There will be more information about that track provided on the O&O V2-to-FHIR tooling call on Wednesday April 24th at 3pm EDT

We are looking for real world testing data, rather than simple sample messages, with the kind of variation we'd expect to see in the wild.  If you have messages that you've used for testing your Version 2 interfaces, test messages for validating interfaces, et cetera, and want to contribute, we'd appreciate your sending them along.  You can either become a contributor, or send me a link to your zip file, or send me an e-mail with your sample messages, and I'll work on getting them into the repo.

No PHI please.  Yes, we are looking for real world data, but no, we don't want real world patient identities in here.  I know you know the reasons why, but I probably should say it anyway.

In contributing this data, you will be granting HL7 the rights to use this data for the V2 to FHIR project, just as you would with any other contribution you make to an HL7 project.


EHRs are ACID, HIEs are BASE

Phenolphthalein in FlaskI was talking about clinical data repositories, HIEs and EHRs with a colleague this morning.  One of the observations that I made was that in the EHR world, and some of the CDR world, folks are still operating in a transactional model, whereas most HIEs use data in an analytic fashion (although still often with transactional requirements).  There are differences in the way you manage and use transactional and analytical data.


Think about this.  When you ask an HIE for a document for a patient, are you trying to make a business (in this case, a health related care) decision?  Yep.  Is your use of this information part of the day-to-day operations where you need transactional controls?  Probably not, though you might think you want up to the minute data.

Arguably, HIEs aren't in the business of providing "up to the minute data".  Instead, they are in the business of providing "most recent" data within a certain reasonable time frame.  So, if the data is basically available, and eventually consistent within say, an hour of being changed, that's probably sufficient.  This is BASE: Basically Available, Soft (possibly changing) state, with Eventual consistency.

On the other hand, when you use an EHR, do you need transactional controls?  Probably, because at the very least you want two people who are looking at the same record in a care setting to be aware of the most current state of the data.  In this case, you need Atomic, Consistent, isolated as well as Durably persisted changes.  This is ACID.

BASE scales well in the cloud with NoSQL architectures. ACID not so much.  There are a lot of good articles on the web describing the differences between ACID and BASE (this is a pretty basic one), but you can find many more.  If you haven't spent any time in this space, it's worth digging around.

   Keith



Friday, April 5, 2019

Find your Informatics mentor at IHE or HL7

I was interviewed yesterday by a college student as part of one of her student projects.  One of the questions I was asked was: What would be your one piece of advice for a graduating student entering your field?

I told her that it would depend (isn't that always the answer?), and that my answer for her would be different than my general answer (because she's already doing what I would have advised others).

My general answer is to find an group external to school or work related to her profession to volunteer in, either  a profession association or a body like IHE or HL7.  I explained that these organizations already attract the best talent from industry (because company's usually send their top tier people to these organizations).  So, by spending time with them, she'll get insight from the best people in the industry.

Organizations like this also have another characteristic, which is that they are already geared up to adopt and mentor new members.  I think this mostly might be a result of the fact that they already have more work than they can reasonably accomplish, and having a new victim member to help them is something that they are naturally supportive of, and as a result, also naturally supportive of the new member.  It's an environment that's just set up to provide mentoring.

There are days when I'm actually quite jealous of people who get to do this earlier in their career than I did.  Participating in IHE and HL7 has given me, and many others quite a boost in our careers, and the earlier that acceleration kicks in, the longer it has to effect your career velocity.  In her case, I'm especially jealous, as she's been working in this space since middle school!

In any case, if you are a "newly" minted informaticist, health IT software engineer, or just a late starter like me, and want to give your career a boost, you can't go wrong by participating in organizations like IHE, HL7, AMIA or other professional society or organization.

   Keith

connectathon

Tuesday, April 2, 2019

How does Interfacing Work

This post is part of an ongoing series of posts dealing with V2 (and other models) to FHIR (and other models) which I'm labeling V2toFHIR (because that's what I'm using these ideas for right now).

I've had to think a lot about how interfaces are created and developed as I work on HL7 Version 2 to FHIR Conversion.  The process of creating an interface that builds on the existence of one or more existing interfaces is a mapping process.

What you are mapping are concepts in one space (the source interface or interfaces) to concepts in a second space, the target interface.  Each of these spaces is represents an information model.  The concepts in these interface models are described in some syntax that has meaning in the respective model space.  For example, one could use XPath syntax to represent concepts in an XML Model, FHIR Path in FHIR models, and for V2, something like HAPI V2's Terser location spec.

Declaratively, types in the source model map to one or more types in the destination model, and the way that they map depends in part on context.  Sure, ST in V2 maps to String in FHIR, but so does ID in certain cases, except when it actual maps to Code.

So, if I've already narrowed my problem down to how to map from CWE to Coding, I really don't need to worry much about those cases where I'd want to map an ST to some sort of FHIR id type, because, well, it's just not part of my current scope or context.

Thinking about mapping this way makes it easier to make declarative mappings, which is extremely valuable.  Declarations are the data that you need to get something done, rather than the process by which you actually do it, which means that you can have multiple implementation mechanism.  Want to translate your mapping into FHIR Mapping Language?  The declarations enable you to do that.

But first you have to have a model to operationalize the mappings.  Here's the model I'm working with right now:

Prerequisites


  • An object to transform (e.g., a message or document instance, or a portion thereof).
  • A source model for that object that has a location syntax that can unique identify elements in the model, from any type in that model (in other words, some form of relative location syntax or path).
  • A target model that you want to transform to.
  • A location syntax for the target model.
  • A set of mappings M (source -> target) which may for each mapping have dependencies (preconditions which must be true), and additional products of the mapping (other things that it can produce but which aren't the primary concepts of interest).


Dependencies

Dependencies let you do things like make a mapping conditional on some structure or substructure in the source model.  For example, a problem I commonly encounter is that OBR is often missing relevant dates associated with a report (even though it should be present in an ORU_R01, the reality is that it often is not).  My choices are to not map that message, or come up somehow with a value that is close enough to reality.  So, when OBR-7 or OBR-8 is missing, my goto field is often MSH-7.  So, how would I express this mapping?

What I'd say in this case is that MSH-7 maps to DiagnosticReport.issued, when OBR-7 is missing and OBR-8 is missing.  So, this mapping is dependent on values of OBR-7 and/or OBR-8.

Products

Products let you add information to the mapping that is either based on knowledge or configuration.  HL7 V2 messages use a lot of tables, but the system URLs required by FHIR aren't contained anywhere at all in the message (even though they are going to be known beforehand).  So, when I want to map OBR-24 (Diagnostic Service Section Identifier) to DiagnosticReport.category, I can supply the mapping by saying OBR-24 -> DiagnosticReport.category.coding.code and that it also produces DiagnosticReport.category.coding.system with a value of http://hl7.org/fhir/v2/0074.


Mapping Process Model

So now that you understand those little details, how does mapping actually work?  Well, you navigate through the source model by traversing the hierarchy in order.  At each hierarchical level, you express your current location as a hierarchical path.  Then you look at the path and see if you have any mapping rules that match it, starting first with the whole path, and then on to right subpaths.

ALL matching rules are fired (unlike XSLT, which prioritizes matches via conflict resolution rules).  I haven't found a case where I need to address conflict resolution yet, and if I do, I'd rather that the resolution be explicit (in fact, you can already do explicit resolution using dependencies).

If there's a match, then the right hand side of the rule says what concept should be produced.  There can only be a match when the position in the source model matches the concept that you are mapping from, and there exists an equivalent target concept in the model that you are mapping to.  In my particular example: Presuming that DiagnosticReport was already in context (possibly because I said to create it in the current context on seeing an ORU_R01 message type), then DiagnosticReport.category would be created.

At some point, you reach an atomic level with some very basic data types (string, date, and number) in both the source and target models.  For this, there are some built-in rules that handle copying values.

Let's look at our OBR-24 example a bit deeper.  OBR-24 is basically the ID type.  So, moving down the hierarchy, you'll reach ID.  In my own mappings, I have another mapping that says ID -> Coding.code.value.  This rule would get triggered a lot, except that for it to be triggered, there needs to be Coding.code already in my mapping context.  In this particular case, there is, because it was just created previously in the rule that handled OBR-24.  But if there wasn't, this mapping rule wouldn't be triggered.

When I've finished travering OBR-24 and move on to OBR-25, I "pop" context, and now that coding is no longer relevant, and I can start dealing with DiagnosticReport.status.

The basic representation of the mappings are FHIR Concept maps (as I've mentioned in previous posts in this series).



Clarified bullet point one above thanks to Ed VanBaak.