Monthly Archives: December 2014

Question about storing/processing Coded values in CDA document

Question
If a CDA code is represented with both a code and a translation, which of the following should be imported as the stored code into the CDR (clinical data repository):
  1. a normalised version of the Translation element (using the clinical terminology service)
  2. the Code element exactly as it is in the cda document
The argument for option 1 is that since the Clinical Terminology Service is the authoritative source of code translations, we do not need to pay any attention to a ‘code’ field, even though the authoring system has performed the translation itself (which may or may not be correct). The argument for option 2 is that clinicians sign off on the code and translation fields provided in a document. Ignoring the code field could potentially modify the intended meaning of the data being provided.
Answer
First, to clarify several assumptions:
“Clinicians sign off on the code and translation fields provided in a document”
Clinicians actually sign off on the narrative, not the data. Exactly how a code is represented in the data – translations or whatever – is not as important as maintaining the narrative. Obviously there should be some relationship between the two in the document, but exactly what that is is not obvious. And exactly what needs to be done with the document afterwards is even less obvious.
The second assumption concerns which is the ‘original’ code, and which is the ‘translation’. There are actually a number of options:
  • The user picked code A, and some terminology server performed a translation to code B
    • The user picked code A, and some terminology server in a middle ware engine performed a translation to code B
  • The user picked code X, and some terminology server performed translations to both code A and B
  • The user was in some workflow, and this was manually associated with codes A and B in the source system configuration
  • The user used some words, and a language processor determined codes A and B
    • The user used some words, and two different language processors determined codes A and B

OK, the last is getting a bit fanciful – I doubt there’s one workable language processor out there, but there are definitely a bunch being evaluated. Anyway, the point is, the relationship between code A and code B isn’t automatically that one is translated from the other. The language in the data types specification is a little loose:

A CD represents any kind of concept usually by giving a code defined in a code system. A CD can contain the original text or phrase that served as the basis of the coding and one or more translations into different coding systems

It’s loose because it’s not exactly clear what the translations are of:

  • “a code defined in a code system”
  • “the original text”
  • the concept

The correct answer is the last – each code, and the text, are all representations of the concept. So the different codes may capture different nuances, and it may not be possible to prove that the translation between the two codes is valid.

Finally, either code A or code B might be the root, and the other the translation. The specification says 2 different things about which is root: the original one (if you know which it is), or the one that meets the conformance rule (e.g. if the IG says you have to use SNOMED CT, then you put that in the root, and put the other in the translation, irrespective of the relationship between them).

Actually, which one people use depends on what their trading partners do. One major system that runs several important CDRs ignores the translations altogether…

Turning to the actual question: what should a CDR do?

I think that depends on who’s going to be consuming / processing the data. If the CDR is an analysis end point – data comes in, and analysis reports come out – and the use cases are closed, then it could be safe to mine the CD looking for the code your terminology server understands, and just store that as a reference.

But if the use cases aren’t closed, and it later turns out that a particular analysis would be better performed against a different code system, then storing just the one understood reference would prove rather costly. A great example is lab data that is coded in both LOINC and SNOMED CT – each serves different purposes.

The same applies if other systems are expected to access the data to do their own analysis – they’ll be hamstrung without the full source codes from the original document.

So unless your CDR is a closed and sealed box – and I don’t believe such a thing exists at design time – it’s really rather a good idea to store the Code element exactly as it is in the CDA document (and if it references the narrative text as its originalText, make sure you store that too).
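To make that concrete, here’s a minimal Python sketch of capturing a CD exactly as authored – root code, all translations, and the originalText reference into the narrative. The element and attribute names are the standard CDA R2 ones, but the sample codes and the shape of the stored record are my own illustration:

```python
import xml.etree.ElementTree as ET

NS = {"v3": "urn:hl7-org:v3"}  # the CDA R2 namespace

def extract_cd(cd_el):
    """Capture a CD exactly as authored: root code, all translations,
    and any originalText reference pointing into the narrative."""
    entry = {
        "code": cd_el.get("code"),
        "codeSystem": cd_el.get("codeSystem"),
        "displayName": cd_el.get("displayName"),
        "translations": [
            {"code": t.get("code"), "codeSystem": t.get("codeSystem")}
            for t in cd_el.findall("v3:translation", NS)
        ],
    }
    ref = cd_el.find("v3:originalText/v3:reference", NS)
    if ref is not None:
        # the narrative text this points at needs to be stored too
        entry["originalTextRef"] = ref.get("value")
    return entry

# invented example: SNOMED CT root code with an ICD-10 translation
xml = """<code xmlns="urn:hl7-org:v3" code="73211009"
  codeSystem="2.16.840.1.113883.6.96" displayName="Diabetes mellitus">
  <originalText><reference value="#problem1"/></originalText>
  <translation code="E14.9" codeSystem="2.16.840.1.113883.6.3"/>
</code>"""

stored = extract_cd(ET.fromstring(xml))
```

The point of the sketch is that nothing is normalised away at import time – the terminology service can still be consulted later, against the full authored data.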


A JSON representation for HL7 v2?

Several weeks ago, I was in Amsterdam for the Furore FHIR DevDays. While there, Nikolay from Health Samurai showed off a neat JavaScript-based framework for sharing scripts that convert from HL7 v2 to FHIR.

Sharing these scripts, however, requires a standard JSON representation for HL7 v2 messages, and that turns out to have its challenges. Let’s start with what looks like a nice simple representation:

{
 "MSH" : ["|", null, "HC", "HC1456", "ATI", "ATI1001", "200209100000",
    null, null, [ "ACK", "ACK" ], "11037", "P", "2.4"],
 "MSA" : [ "AA", "345345" ]
}

This is a pretty natural way to represent a version 2 message in JSON, but it has a number of deficiencies. The first is that a message can contain more than one segment of the same type, and JSON property names must be unique (actually, JSON doesn’t explicitly say this, but Tim Bray’s clarification does). So the first thing we need to do is make the segments an array:

{
 "v2" : [
  [ "MSH", "|", null, "HC", "HC1456", "ATI", "ATI1001", "200209100000",
     null, null, [ "ACK", "ACK" ], "11037", "P", "2.4"],
  [ "MSA", "AA", "345345" ]
 ]
}

This format – where the segment code is item 0 in the array of values that represent the segment – has the useful property that field “1” in the HL7 definitions becomes item 1 in the array.
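To make the mapping concrete, here’s a minimal Python sketch (my own illustration, not part of any standard) that parses a pipe-delimited message into this array form. It deliberately ignores repeats (~) and escape sequences, and makes empty fields null:

```python
def parse_segment(line):
    """Parse one pipe-delimited segment into the array form above:
    item 0 is the segment code, item n is field n (matching the HL7
    field numbering). Fields containing '^' become component arrays;
    empty fields become None (JSON null)."""
    name = line[:3]  # segment codes are three characters
    if name == "MSH":
        # MSH-1 is the field separator itself, and MSH-2 (the encoding
        # characters) must not be split on those same characters
        enc, _, rest = line[4:].partition("|")
        fields = [name, "|", enc]
    else:
        fields = [name]
        rest = line[4:]
    for raw in rest.split("|"):
        if raw == "":
            fields.append(None)
        elif "^" in raw:
            fields.append(raw.split("^"))
        else:
            fields.append(raw)
    return fields

msg = ("MSH|^~\\&|HC|HC1456|ATI|ATI1001|200209100000|||ACK^ACK|11037|P|2.4\r"
       "MSA|AA|345345")
v2 = {"v2": [parse_segment(s) for s in msg.split("\r")]}
```

With this, `v2["v2"][0][10]` is `["ACK", "ACK"]` – the message type field, at the index the HL7 definitions give it.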

Btw, alert readers will note that the { “v2”: } part is pure syntax, and could potentially be dropped, but my experience is that many JSON parsers can only accept an object, not an array (arrays must be properties of objects), so we really should have an object wrapper. At the DevDays, we discussed pulling out some data from the MSH, and making it explicit:

{
 "event" : "ACK",
 "msg" : "ACK",
 "structure" : "ACK",
 "segments" : [
   ...
 ]
}

I’m not sure whether that’s justified or not. The information is in the MSH segment, so it’s straight duplication.

Problems

However, this nice, simple-to-grasp format turns out to be relatively unstable – the actual way that an item is represented depends on the values around it, and so scripts won’t be shareable across different implementations. As an example, take the representation of MSH-3, of type HD (ST from v2.1 to v2.2). In the example above, it’s represented as “HC” – just a simple string, corresponding to |HC|. If, however, the source message uses one of the other components of the HD data type – e.g. |^HC^L| – then it would change to a JSON representation such as:

 { "Universal ID" : "HC", "Universal ID Type" : "L" }

So the first problem is that whether or not subsequent components appear changes the representation of the first component. Note that this is an ambiguity built into v2 itself, and it is handled in various different ways by the many existing HL7 v2 libraries. The second problem with this particular format is that the names given to the fields have varied across the different versions of HL7 v2, as they have never been regarded as significant – “Universal ID”, for example, is only known by that name from v2.3 to v2.4, and other fields have much more variation than that. So it’s better to avoid names altogether, especially since implementers regularly use additional components that are not yet defined:

 { "2" : "HC", "3" : "L" }

but if all we’re going to do is have index values, then let’s just use an array:

 [ null, "HC", "L" ]

Though this does have the problem that component 2 is element 1 in the array. We could fix that with this representation:

 [ "HD", null, "HC", "L" ]

where the first item in the array has its type; this would be variable across versions, and could be omitted (e.g. replaced with null) – I’m not sure whether that’s a valuable addition or not. Below, I’m not going to add the type to offset the items in the array, but it’s still an option.

The general structure for a version 2 message (or batch) is:

  • A list of segments.
  • Each segment has a code, and a number of data elements
  • Each data element can occur more than once

then:

  • Each Data element has a type, which is either a simple text value, or one or more optional components
  • Each component has a type, which is either a simple text value, or one or more optional sub-components
  • Each subcomponent has a text value of some type

or:

  • Each Data element has one or more components
  • Each component has one or more subcomponents
  • Each subcomponent has a text value of some type

Aside: where’s the abstract message syntax? Well, we tried to introduce it into the wire format in v2.xml – this was problematic for several reasons (names vary, people don’t follow the structure, the structures are ambiguous in some commonly used versions, and most of all, injecting the names into the wire format was hard), and it didn’t actually give you much validation, which was the original intent, since people don’t always follow them. That’s why it’s called “abstract message syntax”. Here, we’re dealing with concrete message syntax.

The first list is what the specification describes, but the wire format hides the difference between the various forms, and you can only tell them apart if you have access to the definitions. The problem is, often you don’t, since the formats are often extended informally or formally, and implementers make a mess of this across versions – a practice fostered by the way the HL7 committees change things. I’ve found, after much experimentation, that the best way to handle this is to hide the difference behind an API – then it doesn’t matter. But we don’t have an API to hide our JSON representation behind, and therefore we have to decide.

That gives us a poisoned chalice. We can choose a more rigorous format that follows the second list: the conversion scripts written against the wire format become more complicated, but much more re-usable. Or we can choose a less rigorous format that’s easier to work with and follows the v2 definitions more naturally, but is less robust and less re-usable.

Option #1: Rigor

In this option, there’s an array for every level, and following the second list:

  1. Array for segments
  2. Array for Data Elements
  3. Array for repeats
  4. Array for components
  5. Array for sub-components

And our example message looks like this:

{
 "v2" : [
  [ [[["MSH"]]], [[["|"]]], null, [[["HC"]]], [[["HC1456"]]], [[["ATI"]]], 
    [[["ATI1001"]]], [[["200209100000"]]], null, null, [[["ACK"], ["ACK"]]], 
    [[["11037"]]], [[["P"]]], [[["2.4"]]] ],
  [ [[["MSA"]]], [[["AA"]]], [[["345345"]]] ]
 ]
}

This doesn’t look nice, and writing accessors for data values means accessing at the sub-component level always, which would be a chore, but it would be very robust across implementations and versions. I’m not sure how to evaluate whether that’s worthwhile – mostly, but not always, it’s safe to ignore additional components that are added across versions, or in informal extensions.
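Accessor functions can hide most of that chore. Here’s an illustrative Python sketch (my own, not part of any proposal) that reads one value out of the fully nested form, using the 1-based HL7 numbering:

```python
def item(node, i):
    """Safe index into a possibly-None, possibly-short array."""
    if node is None or i >= len(node):
        return None
    return node[i]

def value(segment, field, rep=1, comp=1, sub=1):
    """Read one sub-component from the fully nested (rigorous) form.
    field, rep, comp and sub are all 1-based, matching the HL7 numbers."""
    return item(item(item(item(segment, field), rep - 1), comp - 1), sub - 1)

# the MSH segment from the rigorous example above
msh = [ [[["MSH"]]], [[["|"]]], None, [[["HC"]]], [[["HC1456"]]], [[["ATI"]]],
        [[["ATI1001"]]], [[["200209100000"]]], None, None, [[["ACK"], ["ACK"]]],
        [[["11037"]]], [[["P"]]], [[["2.4"]]] ]

value(msh, 3)           # → "HC"  (MSH-3, first component, first sub-component)
value(msh, 10, comp=2)  # → "ACK" (MSH-10, second component)
```

Because missing items simply come back as None, scripts written against this accessor don’t care which optional components the source message happened to populate.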

Option #2: Simplicity

In this option, there’s a choice of string or array:

  1. Array for segments
  2. Array for Data Elements
  3. Array for repeats
  4. String or Array for components
  5. String or Array for sub-components

And our example message looks like this:

{
 "v2" : [
  [ "MSH", ["|"], null, ["HC"], ["HC1456"], ["ATI"], ["ATI1001"], ["200209100000"],
     null, null, [[ "ACK", "ACK" ]], ["11037"], ["P"], ["2.4"]],
  [ "MSA", ["AA"], ["345345"] ]
 ]
}

The annoying thing here is that we haven’t achieved the simplicity that we really wanted (what we had at the top) because of repeating fields. I can’t figure out a way to remove that layer without introducing an object (more complexity), or introducing ambiguity.
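The cost of the simpler form shows up in the accessor code: every script has to branch on string-vs-array at the component and sub-component levels. A sketch of what a component accessor looks like (illustrative only):

```python
def component(repeat, comp):
    """Read a 1-based component from one field repeat in the simple form,
    where the repeat may be a bare string (a single component) or an
    array of components."""
    if repeat is None:
        return None
    if isinstance(repeat, str):
        # a bare string is component 1; there are no others
        return repeat if comp == 1 else None
    return repeat[comp - 1] if comp <= len(repeat) else None

# |HC| and |^HC^L| carry the same HD field, but arrive in different shapes:
first = component("HC", 1)             # simple string form
second = component([None, "HC", "L"], 2)  # component array form
```

Both calls return "HC" – but only because the accessor papers over the two shapes; scripts that index the JSON directly would break when the source system starts populating extra components.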

Summary

Which is better? That depends, and I don’t know how to choose. For the FHIR purpose, I think that the robust format is probably better, because it would allow for more sharing of conversion scripts. But for other users, the simpler format might be more appropriate.

p.s. Nikolay watched the discussion between James Agnew and myself on this with growing consternation, and decided to cater for multiple JSON formats. That’s probably a worse outcome, but I could understand his thinking.


Question: Using FHIR for systems integration

Question:

Is there information (i.e., FHIR standards, whitepapers, etc.) discussing FHIR from a systems integration perspective? For example, have approaches been discussed on how to implement FHIR to consolidate and integrate information from multiple backend legacy (i.e., non-FHIR) systems, then forward the information bundles as FHIR REST services? Have any middleware approaches (e.g., ESB, message buses, data tools) been discussed? The integration may also have “governance” ramifications, because the integration would want to prevent direct access to backend systems.

Answer:

Well, this is certainly a core use case for FHIR, and we’ve had various participants at many connectathons trying these kinds of scenarios out.

There’s some information published. In the specification itself, there’s Managing Resource Identity, and Push vs Pull. Here’s a list of blog links I could find:

There’s also some good information about this in the HL7 help desk (under “FHIR Architecture”). Note that this material is for HL7 members only.
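As a concrete illustration of the facade pattern discussed above – this is my own sketch, not anything HL7 has published, and the legacy column names are invented – the middleware’s main job is mapping legacy records onto FHIR resources, keeping the source-system keys as identifiers so consumers never touch the backend directly:

```python
def legacy_to_patient(row):
    """Map one legacy-system row (hypothetical column names) onto a
    minimal FHIR Patient dict, roughly following DSTU-era shapes."""
    return {
        "resourceType": "Patient",
        # keep the legacy key as an identifier, so the facade – not the
        # consumer – owns the mapping back to the source system
        "identifier": [{"system": "urn:example:legacy-mrn",
                        "value": row["mrn"]}],
        "name": [{"family": [row["surname"]], "given": [row["first_name"]]}],
        "birthDate": row["dob"],
    }

# a row as it might come back from the legacy database
patient = legacy_to_patient({"mrn": "12345", "surname": "Smith",
                             "first_name": "Jan", "dob": "1970-01-01"})
```

A REST layer in front of mappings like this one is what serves the FHIR endpoints; the backend systems stay invisible, which addresses the governance concern in the question.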


#FHIR Updates

A round of links and news about progress with FHIR.

Draft For Comment Posted

I have just finished posting the final version on which the current draft for comment is based: http://hl7.org/implement/standards/FHIR-Develop/.

This version of FHIR is our first serious look at what we plan to release as DSTU2 for FHIR. From here, this candidate will undergo a round of comment and testing, including the HL7 “draft for comment”, where HL7 members can comment on the content of the ballot, and also will be tested through several connectathons and other implementation projects. Following that, we will gather all the feedback, and prepare the second candidate, which will be published around the start of April. This will get another cycle of testing, and then we’ll make further changes in response. We’re planning to publish the final version of DSTU 2 around the end of June.

DSTU 2 is clearly going to be a landmark release for FHIR; it will be the first full version that has relatively complete coverage of the healthcare space, and I know that a number of large vendor consortiums, national programs and standards projects are planning on using it for real. Our current plan is that the next version after that will be a full normative ballot. Given the amount of interest FHIR has attracted, and the size of the implementation pool FHIR will have, we expect getting full consensus for the first normative version to be a slow process.

So what I’m saying is that any work people put into reviewing this version of FHIR will be time well invested.

Btw, there are 3 main versions of FHIR posted now:

  1. http://hl7.org/implement/standards/fhir/ – DSTU1, the current version of the DSTU
  2. http://hl7.org/implement/standards/FHIR-Develop/ – the draft for comment source, which is also the stable version for connectathons from Jan – March
  3. http://hl7-fhir.github.io/ – the rolling current version; it runs about 20 minutes behind version control

Project Argonaut

If you’re interested in FHIR but have been living under a rock, you may not have heard about Project Argonaut. The original press release caused a great deal of excitement. My own comments, written for the HL7 worker community, got posted to the public at least here and here. I’ll post more information here or elsewhere as it becomes available.

Project Argonaut represents a significant validation of the FHIR project, and I thank the leaders of that group (particularly John Halamka, Aneesh Chopra, Arien Malec, Micky Tripathi, and most of all, Dave McCallie) for their willingness to stick their neck out and also – we definitely appreciate this bit – contribute real resources to our project. This kind of validation has certainly made everyone sit up and take notice, and it seems likely that the FHIR community will grow some more in both breadth and depth in the next few months.

FHIR for executives

Rene Spronk has prepared a very useful high level summary of FHIR for executives. (now all we need to do is complete the “FHIR for Clinicians” document – it’s in progress).


FHIR and Healthcare Informatics Education

One of the interesting things about FHIR is how it offers new prospects for real practical hands-on education.

This is about much more than the fact that it’s easier and more accessible than other health informatics standards. These are the reasons why:

  • the technology base of the implementation is much more open (browsers, etc)
  • there’s a great abundance of open source tools
  • the community’s focus on examples means that there’s already lots of examples
  • the general focus on patient access to data will mean that students are much more easily able to get access to real data (their own, and others – by permission, of course)

But so far, this has remained just a prospect.

Well, not any more.

Last week, at Furore DevDays, I met Simone Heckmann, from Heilbronn, who’s doing a bunch of real interesting stuff with her students. Ewout described it here:

Simone uses FHIR to teach her students about interoperability and show them the caveats and real-life problems involved in building connected systems. And that’s only part of her teaching curriculum; in addition to having them map one type of message to another standard, she also asks her students to select any of the available open-source FHIR clients and servers, play with them for about a month, and extend them. And this is just the prelude to the final part of the teaching program: she then organizes a hackathon at the Hochschule where the students bring the pet projects they have been working on and test them against each other.

This is really cool – my immediate response was the same as Ewout’s: “I want to be a student at Heilbronn“. This is really teaching students something – real experience at getting information to flow between systems.

Simone told me that she’s planning to post her course notes etc on the web. I’ll be sure to post a link to them when she does. I’m impressed – this is exactly the kind of can-do practical work that FHIR is all about.

And if that’s not enough, she’s got a FHIR blog too (it’s in German – this is the English translation).

Welcome, Simone, to the FHIR community.


Question: Clinical Documents in FHIR

Question:

Nice to see so many articles on FHIR helping people understand this new technology for interoperability. I have a very basic and silly question.
I would like to transfer a whole Clinical Document from one client to another using FHIR. Since in FHIR everything is referred to as a Resource, I am not able to find out the relation among them so that I can store the document as a single record. And what is the mechanism to update individual Resources inside the record? If possible, please share a sample and case study.
Thanks a ton in advance.

Answer:

There’s multiple things that “transfer” could mean. If you mean “transfer” in the XDS sense, of establishing a common registry/repository of documents, then you want DocumentReference (see the XDS Profile). This is the technical basis of the forthcoming MHD profile from IHE.

If you mean some kind of “message”, in a v2 push-based sense, there isn’t a predefined approach, since no one has been asking for this.