Wednesday, November 10, 2010

A best practice for data element descriptions

Once again someone has given me a spreadsheet that uses two- and three-word phrases to explain a set of data elements (more than 400 this time), with no explanation of each item in lay terms.  I can understand them in a general way, but the challenge comes when the terms have specific meanings that depend on the way care is practiced at the institution.  It takes a great deal of back and forth to refine these definitions into something implementers will understand.

The C154 HITSP Data Dictionary document demonstrates what I consider to be best practice in this arena.  Each row in the data dictionary includes four items, which are described below¹:
    1. Each data element has an identifier that uniquely identifies it. The first part of the identifier is assigned based upon the module where it is found. The second part of the identifier uniquely identifies the element within the module. As new data elements are created, they are added to the end of the data module. The data element identifiers are persistent and will not be changed or reused between versions of HITSP specifications.
    2. Each data element has a name that briefly describes the content and purpose of the data element. Data element names may be changed between versions of HITSP specifications to better describe the content and purpose.
    3. Each HITSP data element has a definition that is intended to precisely describe the purpose and structure of the data element independent from the standards that it may be mapped to. This independence allows HITSP data elements to be mapped to data elements using a variety of standards. The concise definition and mapping to the standards data element also supports harmonization of data across exchanges using different standards. The definition should describe the data element with sufficient enough detail to clearly indicate the purpose and content of the data element.
    4. In some cases, the data element will have additional restrictions limiting the values that can be communicated within it. HITSP may apply restrictions to a data element when it is communicated. These restrictions could be with regard to its precision, the units, and the range of legal values that may be transmitted or other restrictions as necessary. These will be described in or referred to by this column.

       This column defines universal constraints that apply to data elements regardless of the Base Standard (i.e. HL7 CDA, HL7 V2 messages, NCPDP, etc.) allowing for harmonized constraints across the various Base Standards. Additional data element constraints may also be defined in the CDA-specific sections of this document (2.2.x), or in HITSP Components, Transactions, Transaction Packages or Interoperability Specifications.
I'm sending the spreadsheet back to ask for column 3 (the definition) and, where necessary, column 4 (the constraints) to be completed.  I hope that eventually this becomes a recognized practice in the industry.
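
As a rough sketch, the four columns described above can be modelled as a small record, with a check that flags rows that are not ready for implementers. Everything here is illustrative: the class and function names, the "<module>.<sequence>" identifier pattern, the eight-word threshold, and the sample rows are assumptions made for the example, not anything taken from C154.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed identifier shape: "<module>.<sequence>", e.g. "1.02".
# This pattern is a guess for illustration, not the C154 rule.
ID_PATTERN = re.compile(r"^\d+\.\d+$")


@dataclass(frozen=True)
class DataElement:
    identifier: str                    # column 1: persistent, never reused
    name: str                          # column 2: short label, may change
    definition: str = ""               # column 3: lay-terms description
    constraints: Optional[str] = None  # column 4: only where needed


def problems(element: DataElement, min_definition_words: int = 8) -> List[str]:
    """Return the reasons a row would be sent back (empty list if none)."""
    issues = []
    if not ID_PATTERN.match(element.identifier):
        issues.append("identifier is not of the form <module>.<sequence>")
    if not element.name.strip():
        issues.append("missing name")
    words = element.definition.split()
    if not words:
        issues.append("missing definition")
    elif len(words) < min_definition_words:
        issues.append("definition is too short to stand on its own")
    return issues


# Hypothetical rows standing in for lines of the spreadsheet.
rows = [
    DataElement(
        "1.01",
        "Example element A",
        "The date and time at which the hypothetical record "
        "was created by the sending system.",
    ),
    DataElement("1.02", "Example element B"),
]

for row in rows:
    for issue in problems(row):
        print(f"{row.identifier} {row.name}: {issue}")
```

Column 4 is left optional in the sketch because constraints are only needed for some elements; a word count is of course a crude stand-in for a human judging whether a definition is clear.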

  -- Keith

1 © 2010 ANSI. This material may be copied without permission from ANSI only if and to the extent that the text is not altered in any fashion and ANSI's copyright is clearly noted.


  1. One thing to watch out for is that the semantics of concepts in controlled terminologies depend heavily on the assumptions about the problem space made by the terminology developer. Thus LOINC shoehorns everything into its six axes, while SNOMED has its concept hierarchies.

    In your case, the concepts are driven by the specific structural context - the particular module. In fact, the semantics of the data element are only formally valid in that context, as recognized by the ID being tied to the module. This poses two challenges.

    1 - The mapping of the data element to a reference vocabulary concept (e.g., from SNOMED or LOINC) may not be precise. For instance, we might map a data element value for the site of a measurement to a SNOMED term for a disease, with the intended meaning being the site of the disease (e.g., "the volume of the [cardiac effusion]"). In the context of a particular module construct we often do these 'type castings', and in fact that is part of the local semantics of the data element.

    2 - Because each data element is only locally defined, it is hard to connect it to similar or identical data elements that happen to be defined in other modules, except through the fact that they both map to the same reference vocabulary concept (maybe - see item 1).

  2. Hi,

    Can you help me understand DHIS 2 in a hospital mode? We have an Oracle database server; how would we link DHIS 2 directly to the database?


    Chandrakant B