Friday, November 12, 2010

Frustrated by Lack of Standards Support in IE...Again

Recently I posted on this blog about Self-Displaying CDA.  I'll be writing more about that next week.  An equal challenge exists for Level 1 wrapped content using the CDA standard.  Numerous people have tried to create a simple XSLT transform that would allow a CDA document containing a nonXMLBody to be displayed.  Unless that body happens to be text/plain, it's been almost impossible to do, especially if the content is Base 64 encoded.

I did manage to deal with base 64 encoded text/plain pretty easily in an XSLT stylesheet that called out to Java using Xalan.  The trick is to call out to a Base 64 decoding function, passing in the content of the ‹cda:text› element.  The same technique should also work in .Net with any XSLT processor that can call out to a .Net object that decodes base64 content.  You should also be able to do it with an XSLT processor that can support JavaScript (see here for a decoder function).

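‹!--
  Given textual data, Base64 decode it.  This template is used to
  Base 64 decode information found in XDS-SD (text) format into
  a string.  It assumes the stylesheet binds the java prefix to
  Xalan's Java extension namespace (http://xml.apache.org/xalan/java).
--›
‹xsl:template name="BASE64Decode"›
  ‹!-- The data to base 64 decode. --›
  ‹xsl:param name="data"/›
  ‹!-- Force it to be a string (just in case) --›
  ‹xsl:variable name="theData" select="string($data)"/›
  ‹!-- Create a new Base 64 decoder.  The original expression did not
       survive in this copy; java.util.Base64 (Java 8+) is an assumed stand-in. --›
  ‹xsl:variable name="decoder" select="java:java.util.Base64.getMimeDecoder()"/›
  ‹!-- Get the decoded bytes --›
  ‹xsl:variable name="theBytes" select="java:decode($decoder, $theData)"/›
  ‹!-- Turn it into a String (using default character set).  This
       expression is also a reconstruction, not the original. --›
  ‹xsl:variable name="result" select="java:java.lang.String.new($theBytes)"/›
  ‹!-- Return the string --›
  ‹xsl:value-of select='$result'/›
‹/xsl:template›
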
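
To make the decoding step concrete, here is a minimal Java sketch of what the call-out has to do.  The class name XdsSdText is hypothetical, and it assumes Java 8 or later (for java.util.Base64) and UTF-8 text rather than the platform default character set.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the original stylesheet.
final class XdsSdText {
    /** Decodes Base64 text into a String, assuming the bytes are UTF-8. */
    static String decode(String base64Text) {
        // The MIME decoder skips line breaks, which Base64 content
        // embedded in an XML document often contains.
        byte[] bytes = Base64.getMimeDecoder().decode(base64Text);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A static method like this could presumably be reached through the same Xalan Java extension mechanism the template relies on, though I have not tried that with this exact class.
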
But base-64 encoded PDF in XSLT eluded me for quite some time.  The key problem was converting the content into something the browser could read.  With plain HTML it just cannot be done, because the image or object data has to come from somewhere else: HTML doesn't support embedding the content directly, which is what I'd have to do with an XSLT.  I knew that I could create a protocol interpreter for IE that would solve the problem by including the data in a URL, but never had the time to build it.  Of course, the best URL protocol to support would be the data: URI specified in RFC 2397.

IE 8 added data: URI support for images and objects, but only for image formats, with a 32K limit on the size of the URI content.  I thought I'd try it out this evening after my discussion today, especially since I'd loaded up a number of browsers on my machine for the self-displaying CDA project.
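
For reference, RFC 2397 lays a data: URI out as data:, then the media type, an optional ;base64 marker, a comma, and the data.  The Java sketch below builds one and checks it against the IE 8 limit.  The class name DataUri is hypothetical, and reading "32K" as 32 * 1024 characters for the whole URI is my assumption.

```java
// Hypothetical helper for building and size-checking data: URIs.
final class DataUri {
    // Assumption: the IE 8 limit is 32 * 1024 characters for the whole URI.
    static final int IE8_MAX_LENGTH = 32 * 1024;

    /** Builds data:<mediaType>;base64,<data> from already-encoded content. */
    static String build(String mediaType, String base64Data) {
        // Remove line breaks and spaces that the source document may contain.
        String compact = base64Data.replaceAll("\\s+", "");
        return "data:" + mediaType + ";base64," + compact;
    }

    /** Reports whether the URI stays within the assumed IE 8 limit. */
    static boolean fitsIe8(String uri) {
        return uri.length() <= IE8_MAX_LENGTH;
    }
}
```

Since Base64 expands content by about a third, a limit like this leaves room for roughly 24K of original data, which is small for most PDF documents.
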

Here's the template for the stylesheet that should work:

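‹xsl:template match="cda:nonXMLBody"›
  ‹!-- Build data:[mediaType][;base64],[content] as laid out in RFC 2397 --›
  ‹xsl:variable name='url'›
    ‹xsl:text›data:‹/xsl:text›
    ‹xsl:value-of select='cda:text/@mediaType'/›
    ‹xsl:if test='cda:text/@representation="B64"'›;base64‹/xsl:if›
    ‹xsl:text›,‹/xsl:text›
    ‹xsl:value-of select='cda:text'/›
  ‹/xsl:variable›
  ‹object width='600px' height='800px' data='{$url}'›‹/object›
‹/xsl:template›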

I've been able to verify that it does in fact work on Windows in Opera and Safari, but fails in Firefox, Chrome (Chrome doesn't even run the XSL transform!) and Internet Explorer.  The really frustrating part of this is that Microsoft crippled the support for the data: URL in IE 8.0.  It only works for image formats, and of course, PDF is not an image format, so it doesn't work.  I can understand why Microsoft did this, because the data: URL is a security risk.  The key problem is that the base 64 encoded data can contain scripts (even in PDF) that might pass through security filters that don't check the content of the URL, and that can lead to a number of new attacks on the browser.

Internet Explorer 9.0 is currently in Beta, and Microsoft has increased the size of the URL to address other issues, and allowed its use in other objects (including SCRIPT), but doesn't seem to have any plans to support its use for non-image content in the OBJECT tag.  I hope that will change.  Allowing data: URL for the SCRIPT tag is at least as big a security risk as allowing application/pdf content in an OBJECT tag.  From a technical perspective, they've already got most of the code to support it in the product, they just need to find a way to address the security risks.  An adequate solution might be to A) allow users / system administrators to determine the content types that could be transmitted using the data: URI format, and B) invoke anti-virus/spyware scanners on the data: URI content before allowing it to be used, somewhat the same way that they can be invoked before accessing a PDF document over the web.
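
Option A could be as simple as an allow-list check on the media type at the front of the URI.  Here is a rough Java sketch of the idea.  The class and method names are hypothetical, and this is not how any browser actually implements its policy.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical policy check: which media types may arrive in a data: URI.
final class DataUriPolicy {
    /** Extracts the media type, e.g. "application/pdf", from a data: URI. */
    static String mediaTypeOf(String uri) {
        if (!uri.regionMatches(true, 0, "data:", 0, 5)) {
            throw new IllegalArgumentException("not a data: URI");
        }
        int comma = uri.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("missing comma before the data");
        }
        // Everything between "data:" and the comma: type plus parameters.
        String header = uri.substring(5, comma);
        int semi = header.indexOf(';');
        String type = semi < 0 ? header : header.substring(0, semi);
        // RFC 2397 treats an omitted media type as text/plain.
        return type.isEmpty() ? "text/plain" : type.toLowerCase(Locale.ROOT);
    }

    /** True when the URI's media type is on the administrator's allow-list. */
    static boolean isAllowed(String uri, Set<String> allowedTypes) {
        return allowedTypes.contains(mediaTypeOf(uri));
    }
}
```

A check like this only looks at the declared type.  It does nothing about option B, scanning the decoded content itself.
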

Now if someone brilliant happened to write a pluggable protocol handler for IE using the data URI format, you should be able to make it work for versions of IE that don't already support it, and you should be able to hack it (by changing the URL scheme to something like x-data: ) to work in IE 8.  It's a pretty straightforward engineering job.  My .Net chops aren't up to it yet (I like Java).  A good protocol interpreter should be able to run in IE versions from about 6 on.
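
The heart of such a handler is small: strip the scheme, split at the comma, and decode the payload.  Since I like Java, here is that core sketched in Java rather than as an actual IE handler.  The class name is hypothetical, x-data: is the made-up scheme from above, and only base64 payloads are handled.

```java
import java.util.Base64;

// Hypothetical core of a data: / x-data: protocol handler.
final class DataUriPayload {
    /** Returns the decoded bytes carried by a base64 data: or x-data: URI. */
    static byte[] decode(String uri) {
        String rest;
        if (uri.startsWith("data:")) {
            rest = uri.substring("data:".length());
        } else if (uri.startsWith("x-data:")) {
            rest = uri.substring("x-data:".length());
        } else {
            throw new IllegalArgumentException("unsupported scheme");
        }
        int comma = rest.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("missing comma before the data");
        }
        String header = rest.substring(0, comma);
        if (!header.endsWith(";base64")) {
            // Percent-encoded payloads are left out of this sketch.
            throw new UnsupportedOperationException("only base64 payloads are handled");
        }
        // The MIME decoder tolerates embedded line breaks.
        return Base64.getMimeDecoder().decode(rest.substring(comma + 1));
    }
}
```

A real handler would also have to report the media type back to the browser and plug into IE's own interfaces, which is where the actual engineering work lies.
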

If the browser vendors had implemented the standards and specifications that were available more than a decade ago (the data URI was defined in 98, XSLT in 99) we'd not even have to worry about it.  But they haven't and so we have to worry about it. At least now you know how to make it work, and have most of the pieces you need, including free code.  I'll leave it up to someone else to make it happen, because I have another project I'm still struggling with right now.

It seems that IE 8 still lacks support for applying CSS table display formats to XML, even though they claim to support it for HTML (which I haven't verified), and so I'm looking for ways to hack around it.  Like I said, I'll have more to report on that project next week.  It was approved by the SSD steering division in HL7 this week, and I may just finish the technical details before it gets TSC approval in the coming weeks.  Wouldn't that be a hoot.