Monday, September 14, 2015

Code Generators: A love-hate relationship

Some of my favorite projects have involved code generators.  I once wrote an LL(1) parser generator which processed SGML DTDs, with some additions to produce great-looking output from SGML.  Another time I took the XHTML from the IHE Wiki and ran it through a transform which then generated code to output conforming IHE PCC sections according to the PCC Technical Framework.

My present work involves transforming from CDA and CCDA to FHIR and back.  To do that, I'm annotating the output from Trifolia with statements that express how to do the mapping.  Then my code generator essentially writes the transform from the CCDA entries to the appropriate FHIR output.  (I wrote the document, section, and narrative transforms by hand.  These are the scaffolding upon which my code generator operates, and I didn't need to have that part of the build automated.)
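As a rough illustration of the idea (not my actual generator, and not Trifolia's export format), a mapping-driven generator can be sketched in a few lines of Python.  The MAPPINGS table, the field names, and generate_transform are all hypothetical, and the input is a plain dictionary standing in for a CCDA entry rather than real XML.

```python
# Hypothetical mapping annotations: (source field in a simplified entry,
# target field in a simplified FHIR-like dict). Real annotations would be
# attached to template constraints, not kept in a flat list like this.
MAPPINGS = [
    ("code", "code"),
    ("effectiveTime", "effectiveDateTime"),
    ("value", "valueString"),
]


def generate_transform(name, resource_type, mappings):
    """Return Python source for a function that applies the mappings."""
    lines = [
        f"def {name}(entry):",
        f"    resource = {{'resourceType': {resource_type!r}}}",
    ]
    for source_field, target_field in mappings:
        # repr() quotes the names so they become valid string literals
        # in the generated code.
        lines.append(f"    if {source_field!r} in entry:")
        lines.append(
            f"        resource[{target_field!r}] = entry[{source_field!r}]"
        )
    lines.append("    return resource")
    return "\n".join(lines) + "\n"


source = generate_transform("entry_to_observation", "Observation", MAPPINGS)
print(source)

# Load the generated code and try it on a toy entry.
namespace = {}
exec(source, namespace)
result = namespace["entry_to_observation"](
    {"code": "example-code", "value": "72"}
)
print(result)
```

The point of the sketch is only that the mapping lives in data, and the repetitive transform code is written by the machine from that data.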

What I love about code generators is that once you get them right, the code they produce is inevitably correct.  And since the code generator can produce a LOT of code from a large input, this can be incredibly valuable from a software development perspective.

What I hate about code generators is that "ONCE you get them right" part.  Doing that is tricky, and a small change can be very damaging.  Refactoring a code generator to get yourself out of a design dead-end is sometimes like threading a maze blindfolded.  And the wrong fix for a tiny bug can break TONS of code.  However, once you finally get it right, the code is often nearly rock solid.  After all, what the computer does really well is repetition.

Code generators are also especially difficult when the language you are writing your code generator in is the same as the one it will be producing.  Going from XSLT to JavaScript (or vice versa) is a heck of a lot easier than XSLT to XSLT or JavaScript to JavaScript.  One challenge is that the levels of escaping you have to go through to ensure the correct output syntax are a pain.  Another is that you have to keep two different execution contexts in your head: the one you are writing your code generator in, and the one that it is writing code in.  I find myself wondering why a variable I clearly declared in one context doesn't exist in the other until I start looking in the right place.  That's not a problem when I have execution contexts in two different languages (var x = 0; is so much different from <variable name="x">0</variable>).
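Both problems show up even in a toy same-language generator.  Here is a minimal sketch in Python generating Python (rather than XSLT or JavaScript); generate_greeter and the sample message are made up for illustration.  It shows a naive string paste producing broken output, repr() handling the escaping, and a variable that exists in the generator's context but not in the generated code's.

```python
def generate_greeter(message):
    """Return (naive, safe) Python source for a function returning message."""
    # Naive version: pasting the text between quotes breaks as soon as
    # the message itself contains a quote or a backslash.
    naive = 'def greet():\n    return "' + message + '"\n'

    # Safer version: repr() produces a valid Python literal, so the
    # escaping for the generated code is handled in one place.
    safe = "def greet():\n    return " + repr(message) + "\n"
    return naive, safe


message = 'She said "hi"'  # exists only in the generator's context
naive_src, safe_src = generate_greeter(message)

# The naive output is not even valid syntax.
try:
    compile(naive_src, "<generated>", "exec")
    naive_ok = True
except SyntaxError:
    naive_ok = False

# The generated code runs in its own namespace, separate from ours.
generated = {}
exec(safe_src, generated)

print(naive_ok)                # False
print(generated["greet"]())    # She said "hi"
print("message" in generated)  # False
```

The last line is the "why doesn't my variable exist?" moment: message was declared in the generator, and the generated code only ever saw the literal that was written into its source.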

What I also love about code generators, though, is the challenge they provide, and the satisfaction of knowing that they do a tremendous amount of work that I couldn't have done by coding manually.

One of the great things about FHIR is its almost recursive relationship with itself (sort of like writing an XSLT to generate an XSLT, something only a true Geek could enjoy).  The fact that Conformance, OperationDefinition, and SearchParameter resources exist to define how a FHIR server works is very comforting to me.  It means that I can stay within the same context when trying to do several things at once (as is often the case with Interoperability).  However, I think my two favorite "recursive" resources in FHIR are ImplementationGuide and TestScript.

Years ago, Windows NT developers at Microsoft used to talk about "eating your own dog food", referring to the fact that they had to use the OS they were building to build that OS, as they built it.  Well, FHIR is doing the same thing, and starting to develop the necessary resources to build the resources that will build the standard.

It's something that only a Geek could love from an aesthetic viewpoint.  From an outcomes viewpoint, I think the implementers of FHIR and the users of systems that implement it will love it too.  Because once you get it RIGHT, the code rocks.
