Thursday, June 16, 2022

My Comments on the SANER Rule

Thank you for the opportunity to comment on the CMS 2023 Prospective Payment Rule.  I won't spend a lot of time on introductions.  I'm a highly trained Health IT subject matter expert, and have been employed in the Health IT field for multiple decades.  If you want to learn more, simply go back to the source for these comments and learn more about me. Please note, these are my personal comments on the IPPS rule, not those of my employer or any other organization you might presume I represent.

I am also publicly sharing these comments for anyone else who may have forgotten that these are due tomorrow.

My comments are focused on section M. Condition of Participation (CoP) Requirements for Hospitals and CAHs To Report Data Elements To Address Any Future Pandemics and Epidemics as Determined by the Secretary.

  1. In summary, CMS proposes "to require hospitals and CAHs to report specific data elements to the CDC's National Health Safety Network (NHSN), or other CDC-supported surveillance systems, as determined by the Secretary."

    I strongly support having the NHSN continue this work.  NHSN has been at the forefront of surveillance and reporting efforts for decades.  That expertise should not be lost in addressing the needs of this sort of reporting, nor should the already established infrastructure supporting tens of thousands of existing connections to reporting facilities, or the staff and processes necessary to support and maintain those connections.  If the work were to go elsewhere within CDC, I would strongly recommend a restructuring based on the existing core NHSN expertise.  However, given NHSN's leadership in looking for and adopting new ways to communicate this information (see the link below to NHSNLink), that seems unnecessary.  It was already very disruptive to transition this reporting away from NHSN in the first COVID-19 summer under the previous administration.

  2. Section 3. Summary of Costs and Benefits addresses the costs of manually reporting the collected data.  Many facilities will want to automate this process if they have the technical capacity to do so, which has a different cost structure.  The rationale is that nursing staff time is not only a cost to the facility; it also represents anywhere between two weeks (for weekly reporting) and half a year of nursing time devoted to non-patient care activities, and thus lost revenue.  We already know that during a pandemic emergency, qualified staff are in short supply, so it is not just a matter of making up for the lost staff time.  This represents an opportunity cost that is hard to quantify, but should be a consideration.  Facilities recognize this, and thus many opt to automate these reporting solutions.

    Consider that many hospitals operate at very thin margins.  Published statistics show the average cost of a hospital bed day to be around $2400.  Considering nurse staffing norms of between 3 and 5 beds to a nurse, that's between 3 (for weekly reporting) and 42 (for daily reporting) days of lost nursing time.  In dollars, that is ~$10,000 for weekly or ~$400,000 for daily reporting in revenue lost because the nurse is not able to care for patients.

    The data being collected for automating this reporting would not generally come from a single system.  Automation requires data collected from between 3 and 5 different hospital systems.  Clinical data might come from the EHR, bed capacity from a bed management system, PPE from inventory systems, and medication and vaccination inventory from pharmacy information systems.  Some facilities might use a combination of manual and automated solutions because they may not have (or need) all of these different systems.

    Automated solutions are essentially interfaces (a term of art in Health IT) between different health information systems and NHSN or another CDC entity, but since all reporting is consolidated, there must also be a central system that consolidates the data into a single report.  The interfaces themselves require IT staff effort (or may be purchased from IT vendors), and also require deployment and maintenance on either existing or new hardware.  There are several published estimates for the INITIAL costs for the development of various interfaces, ranging from $3000 to $25,000 depending on complexity and features, as well as for the maintenance costs.  Some portion of the interface cost may need to go into additional software licenses and infrastructure (e.g., computers) to run those interfaces, as well as IT staff to manage and support them.  There are from 4 to 6 possible interfaces that may be necessary to automate this reporting (thus returning nursing time to the care of patients).  The cost of development for the interfaces (including the first year of operations) could be as much as $250K depending on the approaches used by the facility and, again, features and complexity.  Maintenance costs for deployed software can generally be estimated as being somewhere between 10 and 25% of the initial software cost (and some vendors actually compute their annual maintenance prices based on the cost of the initial interface).

    Assuming a 20% rate for maintenance (a popular number) with minimal associated facility staff time which might be absorbed into routine interface maintenance efforts, this brings the total cost of the interface to around $300K for a two year period, or $500K for a five year period, an amortized cost of around $100K per year over five years.  This effort, by the way, is also an opportunity loss for the facility with respect to investing in other interoperability initiatives, such as promoting interoperability.  Just as for any other hospital based professional, there are only so many competent interface people to go around to work on these efforts.  Some of these costs may be offset by the ability of the IT staff to repurpose existing automated solutions, which might reduce these costs to $50K per year (a ballpark back of the napkin estimate) when amortized over a five year period.  That would bring the two year cost down to around $150K (slightly lower than the cost projected by CMS for manual reporting, but note this response argues that CMS has underrepresented the actual financial impact of the manual cost on hospitals).  This may only be feasible in larger hospitals with more advanced IT staff and capabilities, but does provide a range for the estimation.
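    The arithmetic behind those totals can be checked with a few lines of Python.  This is only a sketch under the assumptions stated above: the $250K initial figure covers the first year of operations, and maintenance is 20% of the initial cost for each later year.  The function name is mine.  The exact five year figure comes out a little under the rounded $500K and $100K per year used in the text.

```python
def cumulative_cost(initial, maintenance_pct, years):
    """Total cost when `initial` covers year one and every later year
    adds maintenance at `maintenance_pct` percent of the initial cost."""
    annual_maintenance = initial * maintenance_pct // 100
    return initial + annual_maintenance * (years - 1)


initial_cost = 250_000   # upper-end development estimate, first year included
maintenance_pct = 20     # assumed maintenance rate, in percent

two_year_total = cumulative_cost(initial_cost, maintenance_pct, 2)
five_year_total = cumulative_cost(initial_cost, maintenance_pct, 5)
amortized_per_year = five_year_total / 5

print(two_year_total, five_year_total, amortized_per_year)
```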

    These arguments serve to show that the projected costs of reporting are likely underestimated, either based on the lost revenue due to inability to staff beds when nursing resources are used for non-direct care activities, or due to the somewhat higher costs to automate such a solution.

  3. The rule states: "For purposes of burden estimates, we do not differentiate among hospitals and CAHs as they all would complete the same data collection."

    Many CAHs have a much lower capacity to support IT innovation, as they provide essential and low margin hospital services in needed areas, and are not able to fund extensive IT departments.  Some CAHs have "a guy who comes in once a week" to work on the IT systems.  Their ability to automate is marginal, and furthermore, it's not just an RN, but probably the most senior nurse at the facility who is doing the reporting, because they are likely one of the few people with access to all of the information.  The reporting in these facilities is a team effort: 2 or 3 people gather data, and the reporting nurse consolidates, validates and reports, which also increases the time it can take for reporting.  CMS should strongly consider that the burden of this reporting on CAH facilities is going to be much higher just based on the nature of the ways that these facilities operate.

  4. The preamble further says: "Furthermore, we note that this estimate likely overestimates the costs associated with reporting because it assumes that all hospitals and CAHs will report manually."

    Respectfully, I disagree.  As noted above, a nurse is a revenue generating employee in a hospital, and the CMS estimates did not account for revenue lost due to engagement in a non-direct care activity.  Just replacing the nurse with another paid at scale does not account for the financial burden of having that person perform non-revenue generating activities.

  5. CMS notes immediately thereafter that "Efforts are underway to automate hospital & CAH reporting that have the potential to significantly decrease reporting burden and improve reliability".  References to such efforts would be very helpful to understand the potential impacts.  Some suggested references follow:

    * NHSNLink developed for NHSN and described at https://www.cdc.gov/csels/phio/exchanging-data-efficiently.html
    * Helios, an HL7 FHIR Accelerator supported by CDC and ONC at https://blog.hl7.org/hl7-launches-helios-fhir-accelerator-for-public-health
    * The HL7 FHIR Situational Awareness for Novel Epidemic Response (SANER) Implementation Guide at https://hl7.org/fhir/uv/saner/.

  6. Also note the two recommendations from the HITAC on SANER (used in all of the above listed references) found here and quoted (and with my emphasis) below:

    a. We recommend that ONC list Situational Awareness interoperability priorities in the ISA and should catalog SANER as well as related standards and IGs; ONC should via work with stakeholders on pilots and early implementation, evaluate and mature standards towards broader adoption.
    b. We recommend that ONC work with stakeholders at HHS to create aligned policy and funding mechanisms to harmonize adoption of a combined situational awareness standard that maximizes readiness and minimizes state-by-state divergence.

    I see the conditions of participation in this rule as being one piece of the aligned policy and would recommend that CMS be prepared to coordinate with other Federal agencies on other pieces (e.g., the Helios effort listed above, USCDI and USCDI+, CDC grants in aid to the states).

  7. I applaud the efforts that CMS has made to generalize the reporting requirements and to learn the lessons from COVID-19 in ways that are appropriate to the variety of causes of a pandemic Public Health emergency.  It's good, but honestly more work will be needed to develop the necessary standardized templates that can be used to quickly create measures specific to each PHE, and CMS should collaborate with other existing Federal efforts (e.g., the Helios FHIR Accelerator project, and specifically the Aggregate Data project within Helios) and with ONC/USCDI and USCDI+ to develop these templates, measures and value sets as national standards.  Also, please do not forget that a pandemic need not be related to a respiratory condition.  Consider AIDS/HIV, or monkeypox as examples.

  8. With "that are captured with interoperable data standards and elements", it's clear that CMS understands the need to adopt such standards and data elements in national standards such as those referenced by USCDI.

    The list following this paragraph identifies the kinds of things that can be captured in a terminology Value Set, maintained readily in VSAC.  Such value sets must be developed and used in measures, and should be adopted from standard vocabularies, but I would recommend that the efforts by the Secretary to standardize these value sets should focus more on the vocabularies from which they are selected and the standard formats in which they are delivered.  The development of these value sets is more correctly delegated to professionals with relevant background and training, with appropriate governance to enable others outside of the maintaining group to submit new terms for inclusion (much as how vocabularies themselves are developed).  I would encourage CMS to ensure that there are appropriate funding mechanisms to develop and curate such value sets, and investments in VSAC to support the rapid update and easy distribution of them through interoperable solutions.

    * Disease
    * Treatment Devices
    * Vaccine
    * Therapeutics (meds or biologics)
    * Co-morbidities

  9. It's unclear whether this text: "The proposed requirements of this section would apply to local, state, and national PHEs as declared by the Secretary." applies to regions under tribal jurisdiction or not.  Given the complexity of jurisdictional governance, I would have appreciated some clarity on this point.

  10. "to include medical record identifier".  Get to work on a National Patient ID, please.  In the meantime, investigate existing CDC efforts in the application of privacy preserving record linkage.  It's about time we do away with the dream of longitudinal records through patient matching and move on to the reality of having a unique identifier.  Many of the challenges reported earlier with COVID reporting in public health stem from the inability to walk longitudinally across state, jurisdiction or organizational boundaries.  If you want interoperable data that can provide a national picture, you need a way nationally to identify patients.

  11. On Burden:
    "For purposes of this burden collection, we acknowledge the unknown and the ongoing burdens that may exist even if CMS is not collecting information outside of a declared PHE. We recognize that considerations such as building and maintaining the infrastructure to support readiness are necessary to ensure compliance with this requirement. Therefore, we are soliciting comment on the burden associated with these proposed requirements given the intended flexibility provided in reducing or limiting the scope and frequency of reporting based on the state of the PHE and ongoing circumstances. We are specifically asking for comment on the potential burden associated with the proposed reporting requirements as they might relate to any differences in the public health response to one specific pathogen or infectious disease versus another that would be directly related to the declared PHE. We are also interested in public comments addressing burden estimates (and the potential differences in those estimates) for variations in the required reporting response for a local PHE versus a regional PHE versus a national PHE that might be declared by the Secretary based on the specific circumstances at the time of the declaration."

    Building and maintaining the infrastructure has burdens on Health IT Vendors, and upon their customers (Hospitals and CAHs), as well as on the supporting organizations.  The IT Vendor burden is in reality a shifting of focus from vendor determined priorities to national interoperability priorities established by public policy, as their financial burden otherwise flows through and is borne by their customers as a cost of goods and services.  The financial burden on hospitals and CAHs is one which must be borne, and better borne before the next PHE rather than during.  We already well understand the challenge of building the plane while flying it given recent experiences under COVID-19, and the cost of performing such efforts during a PHE surely outweighs that of performing them outside one.

    Some of this burden can be reduced by relying on natural aggregators of health information at the local, state, territorial and tribal levels.  CMS should consider giving health information networks (as defined in ONC rules), including Health Information Exchanges, regional public health entities and supporting organizations (e.g., hospital associations and others), the opportunity to provide services to hospitals and CAHs to support reporting of the requested data to NHSN on behalf of those facilities who want to use them.  This will help by aggregating some of the common infrastructure functions under a single entity, e.g., maintaining the compute, storage, network and interoperability infrastructure necessary to automate reporting, enabling reuse of secured connections to those HINs, and reducing the effort needed to support national reporting.

    I also note that CMS considers only pandemic public health emergencies, but would also remind CMS that other emergencies may also benefit from such a reporting infrastructure.  Forest fires, rolling blackouts, hurricanes and tornados, freezing weather in parts of the country that see it only once a decade or century: all of these emergencies can create a strain on our health infrastructure that an appropriately designed system for reporting on situational awareness can address.

    As we've learned, focusing on just beds isn't enough, and reporting multiple separate measures to local, state and national authorities just trebles the burden.  We need to find, dare I say it, a SANER way to accomplish this.

Tuesday, June 7, 2022

Digital Transformation is not computerizing paper forms

This is very related to what we are trying to accomplish with CREDS, and so is sort of a followup to my previous post on CREDS, but is more broadly scoped.

It starts here with this tweet, but not really.  It actually starts with all of the work that people have had to do to mediate between the data in the EHR, and measurement of what that data actually means.

It boils down simply to how one asks questions in order to perform a measurement.

In the early days of quality measurement and/or registry reporting, it was not uncommon to see questions like this:

  • Does this patient qualify for XYZ treatment?
  • Has the patient been prescribed/given XYZ treatment?
  • Is XYZ treatment contraindicated and/or refused?
And the reporting formats would be structured in Yes/No form in some sort of XML or other format.

It's gotten to the point that some of these questions and their possible answers have been encoded in LOINC.  Now, when used for a survey or assessment instrument, that use is fine.  

But for most registry reporting or quality measurement activities, this should really be handled in a different fashion.  This is a start, and can be written in Clinical Quality Language format, or even more simply in a FHIRPath Expression.

  • Does this patient have any of the [XYZ Indications ValueSet]?
  • Has this patient been prescribed/given [XYZ Treatment ValueSet]?
  • Does this patient have [XYZ Treatment Contraindications ValueSet]?

The theme in this restructuring is: Ask the provider what they know and did, and define the logic to compute it.  But even better is to ask the provider what they know and did (and recorded), and have the quality measure reviewer actually do the compute on the quality measure.
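Here's a minimal sketch, in Python, of what "define the logic to compute it" can look like for the three questions above.  Everything in it is an assumption for illustration: the record structure, the function names, and the codes, which are placeholders rather than real terminology codes.  In practice the value sets would come from a terminology service, and the logic would more likely be written in CQL or FHIRPath.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """What the provider already recorded in the normal workflow."""
    condition_codes: set = field(default_factory=set)
    medication_codes: set = field(default_factory=set)


def has_any(recorded_codes, value_set):
    """True when at least one recorded code is a member of the value set."""
    return not recorded_codes.isdisjoint(value_set)


def evaluate_measure(record, indications, treatments, contraindications):
    """Compute the three answers from recorded evidence instead of asking for them."""
    return {
        "indicated": has_any(record.condition_codes, indications),
        "treated": has_any(record.medication_codes, treatments),
        "contraindicated": has_any(record.condition_codes, contraindications),
    }


# Placeholder value sets and a placeholder patient record.
xyz_indications = {"indication-a", "indication-b"}
xyz_treatments = {"treatment-a"}
xyz_contraindications = {"contraindication-a"}

record = PatientRecord(
    condition_codes={"indication-b", "unrelated-condition"},
    medication_codes={"treatment-a"},
)
answers = evaluate_measure(
    record, xyz_indications, xyz_treatments, xyz_contraindications
)
print(answers)
```

The provider never answers a Yes/No question here; the measure logic derives the answers from what was recorded, so every system running the same value sets and logic gets the same result.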

Apply normal workflows to keep track of what is learned and done; these shouldn't be interrupted.  I can recall a case where normal workflow added a checkbox to an EHR screen just to get a clinician to acknowledge that they had reviewed and/or reconciled the medication list.

Building these value sets and the logic to evaluate them is hard.  Doing it so that it is interoperable across systems is also hard.  But honestly, the cost per provider is so much less when this is done once and done well.  Having hundreds or thousands of systems each do it separately is much more costly, and is likely to introduce differences in the compute and variability in the reported values.

Stop asking for the answers, start asking for the existing evidence to get the answers you need consistently.  And if you cannot get the existing evidence, ask yourself why before asking that it be added to the normal workflow.

   Keith







Tuesday, May 24, 2022

How Does CREDS work?

I'm getting back to some standards writing, this time on the CREDS project.  In case you haven't heard of this one, the (b)acronym decodes to Clinical Registry Extraction and Data Submission.  The challenge that CREDS is looking to solve for Clinical Registries is a very common one.  Data needs to be collected from clinical records, packaged up, and then be sent somewhere.

There are three key questions that any implementer has to answer in this kind of effort.

  1. What data?
  2. Where is it sent?
  3. How is it formatted?
In this post, I'm going to be focused on the first question.  The next two will vary by registry, but we can generally express the answers to 2 as "To the registry that is asking for it", and in this case, CREDS is a FHIR IG, so the answer to 3 is FHIR, but it's also a little bit more complicated than that.

What Data?

The what question comes in several parts:

  1. Who has it?
  2. In what (standardized we hope) form or forms can it be found?
  3. How do we find it?
Answers to the first question for clinical registry reporters include:
  • My EHR System
  • My Old EHR System
  • My Laboratory Information System
  • My Radiology or Imaging information system (RIS or PACS)
  • An ambulance (not ambulatory) information system.  Yes, you might have to dig through transport records in a standardized format.
  • The patient's primary care or specialty provider (in their EHR, RIS, or PACS)
  • In a third party's information system
  • Through an HIE
  • Through some form of national data exchange network
  • The patient's payer (typically an insurer, or a state or nationally funded program).
The second question can varyingly be answered as:
  • FHIR, and if so, problem solved (nearly, more on this later)
  • CDA
  • HL7 Version 2
  • DICOM
  • NCPDP
  • X12
  • or various others, and finally
  • the worst case: digital paper ... e-mail, PDF, printed text reports -- which I'm going to take a hard pass on for this blog post.
The third question (how do we find it in the data) is not quite so simple, and yet has one answer, at least for CREDS, and is the key to how CREDS works.  You see, CREDS relies not just on FHIR, but also FHIRPath.  FHIRPath is a declarative programming language that allows one to express a pointer to one or more pieces of data in a data model, or perform a computation based on that data, or answer a yes or no question about it.  It originated as a way to point into FHIR Resources to express things like the location of search parameters, or constraints on a refined model (a profile of a FHIR Resource) that must be met for a resource to comply with the profile.  For those that have been around for a while, you can see that FHIRPath looks very much like XPath, and for some constrained versions of FHIRPath, there are some very simple transformations from FHIRPath to XPath expressions.
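To make the "pointer into a data model" idea concrete, here's a toy sketch in Python.  It is NOT a FHIRPath implementation: the navigate function is my own invention and handles only simple dotted paths, with no functions, filters or operators.  What it does imitate is the way path navigation returns a collection, flattening repeating elements along the way and yielding an empty collection when nothing is found.  The patient data is made up.

```python
def navigate(resource, path):
    """Follow a simple dotted path through a resource held as nested dicts.

    Always returns a list: repeating elements are flattened into it, and a
    missing element produces an empty list rather than an error.
    """
    parts = path.split(".")
    # An optional leading type name (e.g. "Patient") matches the resource type.
    if parts[0] == resource.get("resourceType"):
        parts = parts[1:]
    current = [resource]
    for name in parts:
        next_items = []
        for item in current:
            if not isinstance(item, dict) or name not in item:
                continue
            value = item[name]
            if isinstance(value, list):
                next_items.extend(value)
            else:
                next_items.append(value)
        current = next_items
    return current


patient = {
    "resourceType": "Patient",
    "name": [
        {"family": "Example", "given": ["Pat", "Q"]},
        {"family": "Sample", "given": ["P"]},
    ],
    "birthDate": "1970-01-01",
}

given_names = navigate(patient, "Patient.name.given")
birth_dates = navigate(patient, "Patient.birthDate")
missing = navigate(patient, "Patient.deceasedBoolean")
print(given_names, birth_dates, missing)
```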

I've been working with XPath (and XSLT, which relies very heavily on it as a transformation language) since before it was a W3C Standard, while it was being invented, with the inventors working in cubicles or offices right next to my own.  You can do just about anything with XPath that you can in any other programming language, though I'm not quite sure you'd necessarily want to.  It takes a twisted brain to think in declarative form.  It takes a bit less twisting in FHIRPath because the creators of that language had similar experiences and already had some ideas about what worked and what doesn't.

But declarative has its own value.  The point of a declarative program is not in specifying how to get the answer in stepwise form, but rather to define what you want the outcome to be, and let the system decide how to get there.  The value add of the system is in defining how to perform these operations efficiently.  Programmers use systems like this all of the time.  Consider SQL (also primarily a declarative language, though it has some procedural elements).
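As a small illustration of the difference, here's a sketch using Python's built-in sqlite3 module.  The table, column names and bed data are made up for the example.  The loop spells out how to get the count step by step; the SQL statement only says what is wanted and leaves the how to the database engine.

```python
import sqlite3

beds = [("bed-1", "occupied"), ("bed-2", "free"), ("bed-3", "occupied")]

# Procedural: walk the rows and keep the tally ourselves.
occupied_procedural = 0
for _bed_id, status in beds:
    if status == "occupied":
        occupied_procedural += 1

# Declarative: describe the result and let the engine decide how to compute it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE beds (id TEXT, status TEXT)")
conn.executemany("INSERT INTO beds VALUES (?, ?)", beds)
(occupied_declarative,) = conn.execute(
    "SELECT COUNT(*) FROM beds WHERE status = 'occupied'"
).fetchone()
conn.close()

print(occupied_procedural, occupied_declarative)
```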

CREDS says that a registry defines a logical model describing its data, and that logical model is made available to users of the CREDS IG via an information system.  This may be nothing more than an Excel spreadsheet with element names and data types, or it might be an entity relationship model expressed in a variety of standard forms.  Since CREDS is using FHIR, CREDS requires that these logical models be expressed in a FHIR format.  For that, we have the FHIR StructureDefinition.  Yes, the same creature that defines the logical model for FHIR Resources and extensions and FHIR Datatypes is also used to define logical models.  If you want to find a logical model for a registry, you have more than 2 dozen search parameters on StructureDefinition you can use to find it.

FHIRPath, interestingly enough, is also designed to work with logical models.  Although it is principally intended for use with FHIR, it is NOT limited to that model.  I'll show why that's important in just a bit.

Each element in a FHIR logical model (StructureDefinition resource) can be mapped to one or more other standard formats.  The language in which this mapping is expressed is left unrestricted in the base FHIR Standard, though it should have a mime type (which is defined by FHIR to be text/fhirpath, and is recognized by IANA as such).  CREDS says that you must use FHIRPath in your mappings.  Why?  Because with those mappings, you can now locate the actual data element in the mapped-to standard that reflects the intent of the author of the definition for the registry data model ... in a computable manner.
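Here's roughly what such a mapping looks like, shown as a Python dictionary holding a trimmed-down StructureDefinition.  The model name, URL and element are hypothetical, and this is a fragment for illustration rather than a complete, validated resource.  The identity, language and map fields on each element mapping are the standard ElementDefinition.mapping fields; the helper function that collects the FHIRPath expressions is my own.

```python
# A trimmed, hypothetical logical model; example.org names are placeholders.
logical_model = {
    "resourceType": "StructureDefinition",
    "url": "http://example.org/fhir/StructureDefinition/ExampleRegistryModel",
    "name": "ExampleRegistryModel",
    "status": "draft",
    "kind": "logical",
    "abstract": False,
    "type": "ExampleRegistryModel",
    # Declares the mapping identity that the elements below refer to.
    "mapping": [{"identity": "fhir", "name": "Mapping to FHIR resources"}],
    "differential": {
        "element": [
            {"id": "ExampleRegistryModel", "path": "ExampleRegistryModel"},
            {
                "id": "ExampleRegistryModel.birthDate",
                "path": "ExampleRegistryModel.birthDate",
                "mapping": [
                    {
                        "identity": "fhir",
                        "language": "text/fhirpath",
                        "map": "Patient.birthDate",
                    }
                ],
            },
        ]
    },
}


def fhirpath_mappings(model, identity):
    """Collect {element path: FHIRPath expression} for one mapping identity."""
    found = {}
    for element in model["differential"]["element"]:
        for mapping in element.get("mapping", []):
            if (
                mapping.get("identity") == identity
                and mapping.get("language") == "text/fhirpath"
            ):
                found[element["path"]] = mapping["map"]
    return found


mappings = fhirpath_mappings(logical_model, "fhir")
print(mappings)
```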

Furthermore, to successfully use CREDS, the mapping identifiers used in the StructureDefinition resource for the logical model must also be bound to a separate logical model that describes the mapped-to standard.  And that logical model must be defined using StructureDefinition resources.  This is important because it makes it possible to automate queries to extract the necessary data from the larger assets that contain them (CDA Documents, V2 messages, et cetera).

So, now to the meat of how CREDS actually works:
  1. Download the Registry's logical model from a FHIR Server by querying for the StructureDefinition resource(s) you need.
  2. For an identified patient, and for each system that might have data, collect the relevant data for that patient from the system.
    1. By first mapping the patient identity you know to the patient identity in the system that you are going to query.
    2. And then query for and extract the relevant data assets (FHIR Resources, CDA documents, or V2 messages).
      1. To query for FHIR Resources in this example, it is enough to request Patient/$everything, and let the FHIRPath mapping perform filtering.  This is a naive implementation, you could be smarter, and CREDS will have some guidance both about writing mappings and how to use those mappings to get to the document you need.  A key assumption in CREDS is that the mapping is explicit, it does not assume that anything other than patient identity and date are available as pre-established contextual cues to extract the data.
      2. To query for CDA documents, well, you have a couple of options, but the one I'd start with is already FHIR enabled, and is basically the equivalent MHD query for all patient documents for a given time range for the patient.  And if you happen to need to query to an XDS/XCA based registry / repository (as might be the case for QHINs under TEFCA, or most national networks already today), there's a way to bridge from MHD to XDS/XCA queries.  I built one personally in 2016, and I worked with a team that shipped another one to production a few years later.  It's baked into the design.  After all, MHD started out with the XDSEntry resource.  Others have done the same, including Grahame Grieve.
      3. V2?  CDA showed the way.  While there is NO "standard" way to do the same for V2 messages, I've also built (and shipped) a FHIR based query modeled after MHD to an HL7 V2 message repository.  This is not rocket surgery, it's more like brain candy.
      4. Got another format (e.g., NCPDP, X12)?  Same thing, different format.
      5. Finally, for each element in the Registry's logical model, apply the appropriate FHIRPath expressions over the extracted resources to collect the information in the logical model.
  3. Having collected the data to populate the registry's logical model, you could just send it in logical model form and let the registry take it, but that's no longer FHIR, it's just FHIR-like, or perhaps even FHIR-lite.  So ...
  4. A smart registry will define their logical model in (as close to as possible) a FHIR format.  That's a small lift, but certainly worthwhile.  Registries which define their submission models in FHIR format don't need the final CREDS step of transforming the data.
  5. We sort of cheat on the very last step for registries which don't do that work, and say that the submission to the registry goes only so far as FHIR as a standard.  So if the registry logical model isn't already written in FHIR (and the various regional standards such as the HL7 FHIR US-Core or USCDI (or USCDI+)), then they must also supply a computable transformation using the FHIR StructureMap to convert information structured in their logical model format to the FHIR Bundle.  CREDS won't go into much detail there, and it really need not do so.  There are implementations that can apply a structure map to perform transformations from one logical model to another.
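The collection part of the steps above (steps 1 and 2) can be sketched in a few lines of Python.  This is my own naive illustration, not anything defined by CREDS: the function names, the shape of the systems dictionary, and the stand-in evaluator are all assumptions.  A real implementation would use a proper FHIRPath engine and real queries in place of the stubs.

```python
def collect_registry_data(mappings, systems, patient_ids, evaluate):
    """Naive collection loop.

    mappings:    {logical model element path: expression}
    systems:     {system name: function(local patient id) -> list of assets}
    patient_ids: {system name: that system's identifier for the patient}
    evaluate:    function(asset, expression) -> list of extracted values
    """
    collected = {path: [] for path in mappings}
    for system_name, fetch_assets in systems.items():
        local_id = patient_ids.get(system_name)
        if local_id is None:
            continue  # The patient is not known to this system.
        for asset in fetch_assets(local_id):
            for path, expression in mappings.items():
                collected[path].extend(evaluate(asset, expression))
    return collected


def simple_evaluate(asset, expression):
    """Stand-in evaluator that only understands 'ResourceType.field'."""
    resource_type, _, field_name = expression.partition(".")
    if asset.get("resourceType") != resource_type or field_name not in asset:
        return []
    return [asset[field_name]]


def fake_ehr(local_id):
    """Stub for a Patient/$everything style query against one system."""
    return [
        {"resourceType": "Patient", "id": local_id, "birthDate": "1970-01-01"},
        {"resourceType": "Observation", "status": "final"},
    ]


registry_mappings = {
    "ExampleRegistryModel.birthDate": "Patient.birthDate",
    "ExampleRegistryModel.observationStatus": "Observation.status",
}
result = collect_registry_data(
    registry_mappings,
    systems={"ehr": fake_ehr, "lab": fake_ehr},
    patient_ids={"ehr": "local-123"},  # No identifier known for "lab".
    evaluate=simple_evaluate,
)
print(result)
```

Passing the evaluator in as a function keeps the loop independent of the asset format, which is the point of mapping everything through logical models: the same loop could walk CDA documents or V2 messages given an evaluator that understands them.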
That's enough on CREDS for tonight and it's NOT what my project manager wanted me to do, but it's really about getting my thoughts down in pencil before I whip out my pen.  

Surely there's more to be written; I have a list.  Here are a few of the topics I plan on taking up shortly:
  1. How to go from a mapping to a queryable repository in FHIR.  In this, I'll answer the question about how a registry can specify mappings in a way that makes it possible for different institutions to query different repositories or networks for assets to collect.
  2. Optimizing queries. The principles of writing FHIRPath expressions in a format that enables users to distinguish between the initial query to send to an external party and the filter to apply after asset retrieval to collect the data, and some thoughts on how to merge queries for the same kind of asset (e.g., a FHIR Resource type, a CDA document or a V2 message).  I suspect that the first implementation I work on in any way will be naive (Patient/$everything), but I'm certain I can do better.  FluentQuery would be a big help here, but at the moment, I've got the only implementation I know of from the SANER Project.
  3. FHIRPath for CDA. If I get to this one in 2022, I will declare the year a grand success, because it means I will be writing an implementation or better yet, that someone else beat me to it.
Personally, I think the CREDS project will result in a visionary IG, but its goal is NOT to boil the Atlantic Ocean, or even Lake Ontario.  I'll settle for Lake Kerr* for now, understanding that if I but turn up the FHIR, I could use it to take on anything.

For those that don't know, the main inspiration for how we are doing things in CREDS is based on the work I did over the last two years with the SANER Project.  For that project, we needed to do two things:
  1. Extract data from patient medical records in various places, and
  2. Count the things that we extracted.
CREDS is just refining how to do the first part, and not just for data that exists in FHIR.  Because lord knows, it's not all in FHIR yet, and it will probably remain that way through my lifetime.

     Keith

*  Lake Kerr is within sight of where I wrote Happy Tears, and I walked by it with my mother last week a few days after her successful cancer surgery.

Thursday, May 19, 2022

Happy Tears

This stream blew up on twitter, so I thought I'd share it here as well (with edits since I don't have to worry about length limits).


Wednesday, April 20, 2022

Spoiled Rotten and Not Knowing It

When I was young and naive (about 20 years or so ago), I was responsible for leading up part of a development team who worked on an XML database product that ran on the early web and which could serve up XML content or HTML pages in response to requests.  One of the people on our staff had a master's (in mathematics or computer science, I cannot recall which), and was also working on his doctorate in the same.

Every Monday I would get a report he wrote about the product performance, and the changes in performance between this week and the last.  It was filled with estimates on the number N of requests needed of each type to verify performance improvements, and the mathematics behind it, and then the design of the performance tests ... experiments he called them.  And they were carried out with the kind of experimental accuracy you'd expect from a scientist.  The results, the relative improvements (or dis-improvements), error bars, the expected P values and all the rest were in the summary report, about 10 pages, every week for weeks in a row, on my desk, Monday morning from the past week's performance run.

He set my expectations for performance testing in a way that would leave me almost forever disappointed for the rest of my career.  I was spoiled rotten for months and never knew it.  I miss him, and I wonder where he went.  He was only with us for a short period of time, I think on an internship.  If I was team captain, he was my Scotty, not just an engineer, but a true scientist.

That company was my introduction to standards development (in W3C), and gracefully self-disrupted at the outset of the .com bust in 2001, which led me to my career in healthcare, and eventually Health IT standards.





Thursday, March 24, 2022

SMEs know the right Keywords

Schmee
Somewhere in the course of my career, I spent a good bit of time working on "Information Retrieval" projects.  I think my first one was a search enhancer while working for a linguistic software company, but in reality, much of my earlier work experience was also related to building searchable repositories for information.  My Capstone project was also an Information Retrieval project.

And even before that, I was an expert at finding things in books because I could remember WHICH book had the answer, and how to find the right page using the index.

A recent Facebook post I ran across asks: "Badly explain your day job".  I should have responded with the image above.

I suppose part of it is translating questions into Google and Stacktrace queries, but it's not just "JFGI".  A lot of the time it's translating the querent's request into terms that Subject Matter Experts (SMEs) in a particular art would use to answer the question.  I'm an expert not so much in what I know as in what I know how to find out.  Yes, I have encyclopedic memory of the CDA R-MIM, and a few other things, but I don't keep that at the top of my memory; instead, I know where to find what I need.

The CDA Book came about so that I'd have a list of those things that I need for my work in print form by my desk (and yours).  This blog came about for some of the same reasons.  I know I wrote something about that; where the heck did I put it?

If Web 2.0 is the Semantic web, and Web 3.0 is some blockchain related thing, Web 4.0 should be the translator of simple questions into expert answers, and if we get there, we will also have solved the Turing problem.  Because to do what I do as a significant part of my day job is to get INSIDE the head of the querent, figure out what they are really trying to do or understand, and then get the expert answer and translate it back into something that they'll understand ... the Freshman lecture version if possible.



Friday, February 25, 2022

Debugging Network Communications - For the Rest of the World


In one of my very early forays into IHE standards implementation, I had to implement TLS version 1.0 shortly after it became an IETF RFC, in Java, using Tomcat.  Thus began the ATNA FAQ.  That document is now 15 years old, and is still being used (at least by me).  I started writing these kinds of things down so I don't forget them, which eventually resulted in this blog.

I'm still debugging TLS connections, and I have to admit that it's become a particular art form, not so much in finding out what went wrong and figuring out the solution, but rather in explaining it to someone who is NOT versed in the ingredients of TLS connection sausages so that they can a) find the right person to fix it, and b) communicate what needs to be fixed without c) further intervention from me.  That's high art.

I have a highly evolved tool set including:

  • DNS, Ping and TraceRt
  • Wireshark
  • OpenSSL
  • JVM debugging arguments
Here are the problems these tools help me identify.  Each requires appropriate communication.
  1. My computer cannot get a good TCP/IP address for the supplied hostname.
    1. Either DNS is down somewhere, or the new DNS record is still propagating, or it simply does not exist.
  2. My computer cannot connect to the TCP/IP address that DNS gave it.
    1. The DNS change for your server may still be propagating
    2. That IP address is not connected to the Internet, or is not within my reachable networks.
      1. The sending computer sends a SYN, but no ACK is ever returned.
        1. Ping returns no response to that IP (may be blocked by a firewall).
    3. The firewall doesn't like the computer it is coming from.
      1. Same symptoms as above.  Firewalls and application gateways (Firewalls on steroids) and other interfacing technologies (e.g., AWS Lambdas or Azure Functions, gateways that have been weaned from Steroids) can mimic just about anything, but most often, they just hide your computer from mine.
    4. The computer is actively NOT listening to connections from that address.
      1. SYN goes out, but a RST comes back.  In other words, I'm listening to communications, just not to THAT port.
  3. The computers can connect (three-way TCP handshake of SYN, SYN-ACK, ACK completed), but we cannot make a TLS connection.
    1. My client sent your computer a TLS version it didn't like. Your server needs to enable support for TLS version (1.2 or higher).
    2. My client sent your computer a set of cipher suites it really doesn't care for.  Your server needs to enable the following cipher suites ...  (RSA or ECDHE with Galois/Counter Mode [GCM]).  Cipher Block Chaining (CBC) suites are no longer considered to be secure.
    3. Your server sent my system a certificate it didn't like, it was:
      1. Expired, please renew it
      2. Revoked,  please replace it with a new certificate
      3. Not secure enough ... please use a certificate with a key strength of at least (2048 bits).
      4. Not signed by an entity trusted on my end
        1. Our system doesn't accept self-signed certificates, please get your certificate signed by (a commonly trusted CA, or a CA of my choosing)
      5. Prepared using an algorithm my server doesn't support.  Please use a certificate signed using (the RSA algorithm).
      6. For a server going by a different name (this one is rare).  Hostname matching is important in many circumstances, but in cases where the server that is supplying the certificate is three layers deep behind other network infrastructure, it has no clue what name clients know it by, and so hostname matching is often disabled.
    4. Your server required a client certificate from my system that it doesn't have.
      1. Here's our certificate chain, please trust someone in this chain - OR -
      2. Please supply me with the necessary client certificate and the instructions for obtaining one (the administrative hoops that I have to jump through to get it).
The problems with this are:
  1. The tools are complicated and require either good training or documentation for folks who haven't needed to learn to use them.
  2. You actually have to understand how the Internet standards work to determine what tool to use and when.
  3. I don't scale.  We need me in a box (or more specifically, in a service).
So, I've resolved to work on a tool to do this sort of testing for me, one that reports what it finds about the problem in very user-friendly language.  It's an interesting bit of interface engineering.
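
To make the idea concrete, here is a minimal sketch of what the core of such a tool might look like, in Python using only the standard library.  It walks the same three stages as the list above (DNS, TCP, TLS) and stops at the first failure.  The `diagnose` function and the `CERT_ADVICE` table are hypothetical names I made up for illustration, not part of any existing tool; the numeric keys are OpenSSL's certificate verification error codes, and the wording of the messages is only a placeholder.

```python
import socket
import ssl

# OpenSSL certificate verification codes mapped to plain-language advice.
# (Hypothetical table for illustration; only a handful of codes are covered.)
CERT_ADVICE = {
    10: "The server certificate has expired; please renew it.",
    18: "The server certificate is self-signed; please have it signed by a trusted CA.",
    19: "The certificate chain contains a self-signed certificate we do not trust.",
    20: "The certificate was not signed by an entity trusted on our end.",
    23: "The server certificate was revoked; please replace it.",
    62: "The certificate was issued for a different hostname.",
}


def diagnose(host, port=443, timeout=5.0):
    """Return (stage, message) for the first problem found, or ("ok", ...)."""
    # Stage 1: can we get an address for the supplied hostname?
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return "dns", f"No address found for {host}: {exc}"

    # Stage 2: can we complete a TCP connection to that address?
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # A RST came back: the machine is there, but not on that port.
        return "tcp", f"{host} is reachable, but nothing is listening on port {port}."
    except socket.timeout:
        # The SYN went out and nothing came back.
        return "tcp", f"No response from {host}:{port}; a firewall may be dropping the traffic."
    except OSError as exc:
        return "tcp", f"Could not connect to {host}:{port}: {exc}"

    # Stage 3: can we negotiate TLS 1.2 or higher with a trusted certificate?
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    try:
        with sock, context.wrap_socket(sock, server_hostname=host) as tls:
            return "ok", f"Connected using {tls.version()} with cipher {tls.cipher()[0]}."
    except ssl.SSLCertVerificationError as exc:
        advice = CERT_ADVICE.get(exc.verify_code, exc.verify_message)
        return "tls", f"Certificate problem: {advice}"
    except ssl.SSLError as exc:
        # Typically a protocol version or cipher suite mismatch.
        return "tls", f"TLS handshake failed: {exc}"
    except OSError as exc:
        return "tls", f"Connection dropped during the TLS handshake: {exc}"


if __name__ == "__main__":
    stage, message = diagnose("example.org")
    print(f"[{stage}] {message}")
```

This does not cover client certificates, and it cannot tell a firewall from a machine that is simply switched off, so it is a starting point rather than me in a box.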