FHIR DSTU2 is published

The FHIR team is pleased to announce that FHIR DSTU2 is now published at http://hl7.org/fhir. The 2nd DSTU is an extensive rewrite of all parts of the specification. Some highlights of this version:

  • Simplifies the RESTful API
  • Extends search and versioning significantly
  • Increases the power and reach of the conformance resources and tools
  • Defines a terminology service
  • Broadens functionality to cover new clinical, administrative and financial areas
  • Incorporates thousands of changes in existing areas in response to trial use

As part of publishing this version, we have invested heavily in the quality of the process and the specification, and the overall consistency is much improved. A full list of changes to the FHIR standard can be found at http://hl7.org/fhir/history.html#history.

In addition, DSTU2 is published along with several US-realm specific implementations developed in association with the ONC: DAF, SDC, and QICore.

This release has involved an astounding amount of work from the editorial team, which, in addition to me, includes:

  • Lloyd McKenzie
  • Brian Postlethwaite
  • Eric Haas
  • Jason Matthews
  • Mark Kramer
  • Paul Knapp
  • Brett Marquard
  • Ewout Kramer
  • Richard Ettema
  • Claude Nanjo
  • James Agnew
  • Josh Mandel
  • John Moehrke
  • Nagesh (Dragon) Bashyam
  • Alexander Henket
  • Chris Moesel
  • Marc Hadley
  • Rob Hausam
  • Bryn Rhodes
  • Nathan Davis
  • Jason Walonoski
  • Rik Smithies
  • Molly Ullman-Cullere
  • Chris Nickerson
  • Jean Duteau
  • Chi Tran
  • David Hay
  • Tom Lukasik
  • Hugh Glover
  • Chris Millet
  • Fabien Laurent
  • Marla Albitz
  • Richard Kavanagh
  • Brad Arndt
  • Brett Esler
  • Chris White
  • Jay Lyle
  • Eric Larson
  • Lorraine Constable
  • Ken Rubin

In addition to this, the HL7 leadership, the wider HL7 community and the wider FHIR Adoption community have all made significant contributions. Additional contributors are recognised in the specification.

Note: there is still much to be done; this is the first full DSTU release, and it will get a lot of use. Over the next couple of weeks, I’ll make a series of follow-up posts describing some of the significant aspects of this release and our overall plans going forward.


What is the state of CDA R3?


We have seen references to new extension methodologies being proposed in CDA R3; however, I can’t seem to find what the current state of CDA R3 is. Most searches return old results. The most recent document I can find relates to CDA R3 using FHIR instead of the RIM. What is the current state of CDA R3, and where can I find more information? The HL7 pages seem to be pretty old.


The Structured Documents work group at HL7 (the community that maintains the CDA standard) is currently focused on publishing a backwards compatible update to CDA R2 called CDA R2.1. CDA R3 work has been deferred to the future, both in order to allow the community to focus on R2.1, and to allow FHIR time to mature.

There is general informal agreement that CDA R3 will be FHIR based, but I wouldn’t regard that as formal or final; FHIR has to demonstrate value in some areas that it hasn’t yet done before a final decision could be made. I expect that we’ll have more discussions about this at the HL7 working meeting in Atlanta in October.

Question: #FHIR and patient generated data


With the increase in device usage and general consumer-centric health sites (e.g. myfitnesspal, Healthvault, Sharecare) coupled with the adoption of FHIR, it seems like it is becoming more and more common for a consumer to be able to provide the ability to share their data with a health system. The question I have lies in the intersection of self-reported data and the clinical record.

How are health systems and vendors handling the exchange (really, ingest) of self-reported data?

An easy example is something along the lines of: I report my height as 5’10” and my weight as 175 in MyFitnessPal, and I now want to share all my diet and bio data with my provider. What happens to the height and weight? Does it get stored as a note? As some other data point? Obviously, with FHIR, the standard for transferring becomes easier; however, I’m curious what it looks like on the receiving end. A more complicated example might be codifying an intake form. How would I take a data value like “do you smoke” and incorporate that into the EHR? Does it get stored in the actual clinical record or, again, as a note? If not in the clinical system, how do I report (a la MU) on this data point?


Well, FHIR enables this kind of exchange, but as you say, what’s actually happening in this regard? It’s more a policy / procedure question, so I have no idea (though the draft MU Stage 3 rule would give credit for this as data from so-called “non-clinical” sources). But what I can do is ask the experts – leads on both the vendor and provider side. So that’s what I did, and here are some of their comments.

From a major vendor integration lead:

For us, at least, the simplest answer is the always-satisfying “it’s complicated.”

Data generally falls into one of the following buckets:

  1. Data that requires no validation: data that is subjective. PHQ-2/9.
  2. Data that requires no validation (2): data that comes directly from devices/Healthkit/Google Fit.
  3. Data that requires minimal validation: data that is mostly subjective but that a clinician might want to validate that the patient understood the scope of the question – ADLs, pain score, family history, HPI, etc.
  4. Data that requires validation: typically, allergies/problems/meds/immunizations; that is, data that contributes to decision support and/or the physician-authored medical record.
  5. Data that is purely informational and that is not stored discretely.

Depending on what we are capturing, there are different confirmation paths.

Something like height and weight would likely file into bucket 5. Height and weight are (a) already captured as a part of typical OP flow and (b) crucially important to patient safety (weight-based dosing), so it’s unlikely that a physician would favor a patient-reported height/weight over a clinic-recorded value.

That said, a patient with CHF who reports a weight gain > 2lb overnight will likely trigger an alert, and the trend remains important. But the patient-reported value is unlikely to replace a clinic-recorded value.

John Halamka contributed this BIDMC Patient Data Recommendation, along with a presentation explaining it. Here’s a very brief extract:

Purpose: To define and provide a process to incorporate Patient Generated Health Data into clinical practice

Clinicians may use PGHD to guide treatment decisions, similarly to how they would use data collected and recorded in traditional clinical settings. Judgment should be exercised when electronic data from consumer health technologies are discordant with other data.

Thanks John – this is exactly the kind of information that is good to share widely.

Question: Solutions for synchronization between multiple HL7-repositories?


In the area of using HL7 for patient record storage, there are use cases to involve various sources of patient information who are involved in the care for one patient. For these people, we need to be able to offer a synchronization between multiple HL7-repositories. Are there any implementations of a synchronization engine between HL7 repositories?


There is no single product that provides a solution like this. Typically, a working solution involves a great deal of custom business logic, and such solutions are usually built using a mixture of interface engines, scripting, and bespoke code and services developed in some programming language of choice. See Why use an interface engine?

This is a common problem that has been solved more than once in a variety of ways with a myriad of products.

Here’s an overview of the challenge:

If by synchronization we mean just “replication” from A to B, then A needs to be able to send and B needs to receive messages or service calls. If by synchronization we mean two-way “symmetric” synchronization, then you have to add logic to prevent “rattling” (where the same event gets triggered back and forth). An integration engine can provide the transformations between DB records and messages, but in general the concept codes and identifiers must still be reconciled between the systems.
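One common way to stop rattling is to give every change an event id and have each repository ignore events it has already applied. The sketch below illustrates that idea only; the `Repository` class, its in-memory storage and the event counter are all hypothetical, not part of any HL7 product or standard.

```python
import itertools

# Hypothetical global source of unique event ids for this sketch.
_event_ids = itertools.count(1)


class Repository:
    """Toy in-memory repository that replicates every change to its peers."""

    def __init__(self, name):
        self.name = name
        self.records = {}
        self.peers = []
        self.seen_events = set()
        self.applied = 0  # number of changes actually applied

    def local_update(self, record_id, value):
        """A change made by a local user: mint a new event and apply it."""
        self.receive(next(_event_ids), record_id, value)

    def receive(self, event_id, record_id, value):
        """Apply a change once, then pass it on to every peer."""
        if event_id in self.seen_events:
            return  # already applied here, so do not echo it again
        self.seen_events.add(event_id)
        self.records[record_id] = value
        self.applied += 1
        for peer in self.peers:
            peer.receive(event_id, record_id, value)


a, b = Repository("A"), Repository("B")
a.peers.append(b)
b.peers.append(a)  # symmetric, two-way synchronization

a.local_update("patient-1", {"weight_lb": 175})
print(a.records == b.records, a.applied, b.applied)  # True 1 1
```

Without the `seen_events` check, the update from A would be sent back by B, and then forwarded again by A, indefinitely.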

For codes, an “interlingua” like SNOMED, LOINC, etc. is helpful if one or both of the systems uses local codes. The participants may implement translations (lookups) to map to the other participant or to the interlingua (it acts as the mediating correlator). The interface engine can call services, or perform the needed lookups. “Semantic” mapping incorporates extra logic for mapping concepts that are divided into their aspects (like LOINC: body system, substance, property, units, etc.). Naturally, if all participants actually support the interlingua natively, the problem goes away. For identifiers, a correlating EMPI at each end can find-or-register patients based on matching rules. If a simplistic matching rule is sufficient and the receiving repository is just a database, then the integration engine alone could map the incoming demographic profile to a query against the patients table and look up the target patient – and add one if it’s new.
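The simplistic find-or-register case can be sketched in a few lines. This assumes an exact match on family name, given name and birth date against a hypothetical `patients` table in SQLite; a real deployment would choose its own schema and matching rule.

```python
import sqlite3


def find_or_register(conn, family, given, birth_date):
    """Return the local patient id, adding a new row if nothing matches exactly."""
    row = conn.execute(
        "SELECT id FROM patients WHERE family = ? AND given = ? AND birth_date = ?",
        (family, given, birth_date),
    ).fetchone()
    if row:
        return row[0]
    cursor = conn.execute(
        "INSERT INTO patients (family, given, birth_date) VALUES (?, ?, ?)",
        (family, given, birth_date),
    )
    conn.commit()
    return cursor.lastrowid


# Hypothetical schema, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients ("
    "id INTEGER PRIMARY KEY, family TEXT, given TEXT, birth_date TEXT)"
)

first = find_or_register(conn, "Smith", "Joe", "1975-01-01")
again = find_or_register(conn, "Smith", "Joe", "1975-01-01")
other = find_or_register(conn, "Jones", "Ann", "1980-05-05")
print(first == again, first != other)  # True True
```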

But if the target repository has numerous patients, with probabilistic matching rules (to maximize the rate of unattended matches, i.e. not bringing a human registrar into the loop to do merges), then the receiving system should implement a service of some kind (using HL7/OMG IXS standard, OMG PIDS (ref?), or FHIR), and the integration engine can translate the incoming demographic into a find-or-register call to that service. Such a project will of course require some analysis and configuration, but with most interface engines, there will be no need for conventional programming. Rather, you have (or make) trees that describe the message segments, tables, or service calls, and then you map (drag/drop) the corresponding elements from sources to targets.

An MDM or EMPI product worth its salt will implement a probabilistic matching engine and implement a web-callable interface (SOAP or REST) as described. If the participants are organizationally inside the same larger entity (a provider health system), then the larger organization may implement a mediating correlator just like the interlingua for terminology. The “correlating” EMPI assigns master identifiers in response to incoming feeds (carrying local ids) from source systems; then that EMPI can service “get corresponding ids” requests to support the scenario you describe. An even tighter integration results if one or both participants actually uses that “master” id domain as its patient identifiers.
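As a rough illustration of the correlating role, here is a toy in-memory version. The `CorrelatingEmpi` class and its method names are hypothetical, and an exact-match key stands in for the probabilistic matching engine a real product would provide.

```python
import itertools
from collections import defaultdict


class CorrelatingEmpi:
    """Toy correlator: assigns master ids and answers 'get corresponding ids'."""

    def __init__(self):
        self._next_master = itertools.count(1)
        self._master_by_key = {}    # matching key -> master id
        self._master_by_local = {}  # (domain, local id) -> master id
        self._locals_by_master = defaultdict(dict)  # master id -> {domain: local id}

    def feed(self, domain, local_id, matching_key):
        """Handle an incoming feed from a source system carrying a local id."""
        master = self._master_by_key.get(matching_key)
        if master is None:
            master = next(self._next_master)
            self._master_by_key[matching_key] = master
        self._master_by_local[(domain, local_id)] = master
        self._locals_by_master[master][domain] = local_id
        return master

    def corresponding_ids(self, domain, local_id):
        """Return the ids every known domain uses for the same patient."""
        master = self._master_by_local[(domain, local_id)]
        return dict(self._locals_by_master[master])


empi = CorrelatingEmpi()
key = ("smith", "joe", "1975-01-01")  # stand-in for real matching rules
empi.feed("lab", "L-17", key)
empi.feed("ehr", "E-902", key)
print(empi.corresponding_ids("lab", "L-17"))  # {'lab': 'L-17', 'ehr': 'E-902'}
```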

Here are some example projects along these lines:

  • dbMotion created a solution that would allow a clinical workstation to access information about a common patient from multiple independent EMRs. It accomplished this by placing an adapter on top of each EHR that exposed its data content in a common format (based upon the RIM), which their workstation application was able to query in order to merge the patient data from all the EMRs into a single desktop view. The actual data in the source EHRs were never modified in any way. This was implemented in Israel and then replicated in the US one RHIO at a time. (Note: dbMotion has since been acquired by Allscripts)
  • California State Immunization created a solution that facilitated synchronization of patient immunization history across the nine different immunization registries operating within the state. The solution was based upon a family of HL7 v2 messages that enabled each registry to request patient detail from another and use the query result to update its own record. This solution was eventually replaced by converting all the registries to a common technical platform and then creating a central instance of the system that served all of the regional registries in common (so synchronization was no longer an issue now that there was a single database of record, which is much simpler to maintain).
  • LA County IDR is an architecture put in place in Los Angeles County to integrate data from the 19+ public health information systems, both as a means of creating a master database that could be used for synchronization and as a single source to feed data analytics. The Integrated Data Repository was built using a design that was first envisioned as part of the CDC PHIN project. The IDR is a component of the CDC’s National Electronic Disease Surveillance System (NEDSS) implemented in at least 16 state health departments.

The following people helped with this answer: Dave Shaver, Abdul Malik Shakir, Jon Farmer

Profiles and Exceptions to the Rules

One of the key constructs in FHIR is a “profile”. A profile is a statement of how FHIR resources are used for a particular solution – or, how they should be used. The FHIR resources are a general purpose construct, and you can do kind of general purpose things with them, such as store the data in a PHR, and do generally useful display of a clinical record etc.

But if you’re going to do something more specific, then you need to be specific about the contents. Perhaps, for instance, you’re going to write a decision support module that takes in ongoing glucose and HBA1c measurements, and keeps the patient informed about how well they are controlling their diabetes. In order for a patient or an institution to use that decision support module well, the author of the module is going to have to be clear about what are acceptable input measurements – and it’s very likely, unfortunately, that the answer is ‘not all of them’. Conversely, if the clinical record system is going to allow its users to hook up decision support modules like this, it’s going to have to be clear about what kind of glucose measurements it might feed to the decision support system.

If both the decision support system and the clinical records system produce profiles, a system administrator might even be able to get an automated comparison to see whether they’re compatible. At least, that’s where we’d like to end up.

For now, however, let’s just consider the rules themselves. A clinical record system might find itself in this situation:

  • We can provide a stream of glucose measurements to the decision support system
  • They’ll come from several sources – labs, point of care testing devices, inpatient monitoring systems, and wearables
  • There’s usually one or more intermediary systems between the actual glucose measurement, and the clinical record system (diagnostic systems, bedside care systems, home health systems – this is a rapidly changing space)
  • Each measurement will have one of a few LOINC codes (say, 39480-9: Glucose [Moles/volume] in Venous blood, 41652-9: Glucose [Mass/volume] in Venous blood,
    14743-9: Glucose [Moles/volume] in Capillary blood by Glucometer)
  • the units of measure will be mg/dL or mmol/L
  • there’ll be a numerical value, perhaps with a greater than or less than comparator (e.g. >45mmol/L)

So you can prepare a FHIR profile that says this one way or another. And then a decision support engine can have a feel for what kind of data it might get, and make sure it can handle it all appropriately.
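To make the rules in that list concrete, here is a minimal sketch of them as a plain check. It is not a FHIR profile and does not use any FHIR library; the `GlucoseMeasurement` class is a hypothetical, flattened stand-in for an Observation, and only the LOINC codes and units come from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional

# LOINC codes and units taken from the list above.
ALLOWED_LOINC = {"39480-9", "41652-9", "14743-9"}
ALLOWED_UNITS = {"mg/dL", "mmol/L"}
ALLOWED_COMPARATORS = {None, "<", ">"}


@dataclass
class GlucoseMeasurement:
    """Hypothetical, flattened stand-in for a FHIR Observation."""
    loinc_code: str
    unit: Optional[str] = None
    value: Optional[float] = None
    comparator: Optional[str] = None


def profile_violations(m: GlucoseMeasurement) -> List[str]:
    """Return a description of each rule this measurement breaks."""
    problems = []
    if m.loinc_code not in ALLOWED_LOINC:
        problems.append(f"unexpected LOINC code {m.loinc_code}")
    if m.value is None:
        problems.append("no numerical value")
    if m.unit not in ALLOWED_UNITS:
        problems.append(f"unexpected unit {m.unit}")
    if m.comparator not in ALLOWED_COMPARATORS:
        problems.append(f"unexpected comparator {m.comparator}")
    return problems


ok = GlucoseMeasurement("14743-9", unit="mmol/L", value=45.0, comparator=">")
bad_unit = GlucoseMeasurement("41652-9", unit="g/L", value=1.1)
print(profile_violations(ok))        # []
print(profile_violations(bad_unit))  # ['unexpected unit g/L']
```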

So that’s all fine. But…

Eventually, the integration engineers that actually bring the data into the system discover – by looking at rejected messages (usually) – that 1 in a million inbound glucose measurements from the lab contain a text message instead of a numerical value. The message might be “Glucose value too high to determine”. Now what? From a clinical safety perspective, it’s almost certain that the integration engineers won’t replace “too high to determine” with a “>N” where N is some arbitrarily chosen number – there’s no number they can choose that isn’t wrong. And they won’t be able to get the source system to change their interface either – that would have other knock-on effects for other customers / partners of the source system. Nor can they drop the data from the clinical record – it’s the actual test result. So they’ll find a way to inject that value into the system.

Btw, as an aside: some of the things that go in this string value could go in Observation.dataAbsentReason, but they’re not coded, and it’s not possible to confidently decide which are missing reasons, and which are ‘text values’. So dataAbsentReason isn’t a solution to this case, though it’s always relevant.

Now the system contains data that doesn’t conform to the profile it claimed to use. What should happen?

  1. The system hides the data and doesn’t let the decision support system see it
  2. The system changes its profile to say that it might also send text instead of a number
  3. The system exposes the non-conformant data to the decision support system, but flags that it’s not valid according to its own declarations

None of these is palatable. I assume that #1 isn’t possible, at least, not as a blanket policy. There’s going to be some clinical safety reason why the value has to be passed on, just the same as the integration engineers passed it on in the first place, so that they’re not liable.

Option #2 is a good system/programmer choice – just tell me what you’re going to do, and don’t beat around the bush. And the system can do this – it can revise the statement ‘there’ll be a numerical value’ to something like ‘there’ll be a numerical value, or some text’. At least this is clear.

Only it creates a problem – now, the consumer of the data knows that they might get a number, or a string. But why might they get a string? What does it mean? Someone does know, somewhere, that the string option is used 1 in a million times, but there’s no way (currently, at least) to say this in the profile – it just says what’s possible, not what’s good, or ideal, or common. If you start considering the impact of data quality on every element – which you’re going to have to do – then you’re going to end up with a profile that’s technically correct but quite uncommunicative about what the data might be, and that provides no guidance as to what it should be, so implementers won’t know what they should do. (And observationally, if you say that it can be a string, then, hey, that’s what the integration engineers will do too, because it’s quicker….)

That’s what leads to the question about option #3: maybe the best thing to do is to leave the profile saying what’s ideal, what’s intended, and let systems flag non-conforming resources with a tag, or wrong elements with an extension? Then the consumer of the information can always check, and ignore it if they want to.
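A minimal sketch of what option #3 could look like, using `meta.tag` on a resource held as a plain dictionary. The tag system and code are hypothetical; no such tag is defined by the FHIR specification, which is the very gap this option runs into.

```python
import copy

# Hypothetical tag; nothing like it is defined by the FHIR specification.
NON_CONFORMANT_TAG = {
    "system": "http://example.org/fhir/tags",
    "code": "profile-non-conformant",
}


def flag_non_conformant(resource: dict) -> dict:
    """Return a copy of the resource carrying the tag in meta.tag."""
    flagged = copy.deepcopy(resource)
    tags = flagged.setdefault("meta", {}).setdefault("tag", [])
    if NON_CONFORMANT_TAG not in tags:
        tags.append(dict(NON_CONFORMANT_TAG))
    return flagged


def is_flagged(resource: dict) -> bool:
    """The check every consumer would have to know about and remember to make."""
    return NON_CONFORMANT_TAG in resource.get("meta", {}).get("tag", [])


observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org", "code": "39480-9"}]},
    "valueString": "Glucose value too high to determine",
}
flagged = flag_non_conformant(observation)
print(is_flagged(observation), is_flagged(flagged))  # False True
```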

That is, if they know about the flag, and remember. Which means we’d need to define it globally, and the standard itself would have to tell people to check for data that isn’t consistent with its claims… and then we’d have to add overrides to say that some rules actually mean what they say, as opposed to not actually meaning that…. it all sounds really messy to me.

Perhaps, the right way to handle this is to have ideal and actual profiles? That would mean an extension to the Conformance resource so you could specify both – but already the interplay between system and use case profiles is not well understood.

I think this area needs further research.

p.s. There’s more than some passing similarity between this case and the game of ‘hot potato’ I used to play as a kid: ‘who’s going to have to do something about this bad data’.

#FHIR Report from the Paris Working Meeting

I’m on the way home from HL7’s 2015 May Working Group Meeting. This meeting was held in Paris. Well, not quite Paris – at the Hyatt Regency at Charles De Gaulle Airport.


A sad and quite unexpected event occurred at this meeting – Helmut Koenig passed away. Helmut Koenig was a friend who had attended HL7 and DICOM meetings for many years. Recently, he had contributed to the DICOM related resources, including ImagingStudy and ImagingObjectSelection resources.

Helmut actually passed away at the meeting itself, and we worked on resolving his ballot comments the next day.

Ballot Summary

The FHIR community continues to grow in leaps and bounds. That was reflected in the FHIR ballot: we had strong participation and many detailed comments about the specification itself. Once all the ballot comments had been processed and duplicates removed, and line items traded amongst the various FHIR related specifications, the core specification had 1137 line items for committees to handle. You can see them for yourself on HL7’s gForge.

This is a huge task and will be the main focus of the FHIR community for the next couple of months as we grind towards publication of the second DSTU. At the meeting itself, we disposed of around 100 line items; I thought this was excellent work since we focused on the hardest and most controversial ones.


We had about 70 participants for the connectathon. Implementers focused on the main streams of the connectathon: basic Patient handling, HL7 v2 to FHIR conversion, Terminology Services, and claiming. For me, the key outcomes of the connectathon were:

  • We got further feedback about the quality of specification, with ideas for improvement
  • Many of the connectathon participants stayed on and contributed to ballot reconciliation through the week.

The connectathons are a key foundation of the FHIR Community – they keep us focused on making FHIR something that is practical and implementer focused.

We have many connectathons planned through the rest of this year (at least 6, and more are being considered). I’ll announce them here as the opportunity arises.


Another pillar of the FHIR Community is our collaborations with other health data exchange communities. In addition to our many existing collaborations, at this meeting the FHIR core team met with Continua, the oneM2M alliance, and the IHE test tools team. (We already have a strong collaboration with IHE generally, so this is just an extension of this in a specific area of focus).

With IHE, we plan to have a ‘conformance test tools’ stream at the Atlanta connectathon, which will test the proposed (though not yet approved) TestScript resource, which is a joint development effort between Mitre, Aegis, and the core team. We expect that the collaboration with Continua will lead to a joint connectathon testing draft FHIR based Continua specifications later this year. Working with oneM2M will involve architectural and infrastructural development, and this will take longer to come to fruition.

FHIR Infrastructure

At this meeting, the HL7 internal processes approved the creation of a “FHIR Infrastructure” Work group. This work group will be responsible for the core FHIR infrastructure – base documentation, the API, the data types, and a number of the infrastructure resources. The FHIR infrastructure group has a long list of collaborations with other HL7 work groups such as Implementation Technology, Conformance and Implementation, Structured Documents, Modelling and Methodology, and many more. This just regularises the existing processes in HL7; it doesn’t signal anything new in terms of development of FHIR.

FHIR Maturity model

One of the very evident features of the FHIR specification as it stands today is that the content in it has a range of levels of readiness for implementation. Implementers often ask about this – how ready is the content for use?

We have a range – Patient, for instance, has been widely tested, including several production implementations. While the content might still change further in response to implementer experience, we know that what’s there is suitable for production implementation. On the other hand, other resources are relatively newly defined, and haven’t been tested at all. This will continue to be true, as we introduce new functionality into the specification; some – a gradually increasing amount – will be ready for production implementation, while new things will take a number of cycles to mature.

In response to this, we are going to introduce a FHIR Maturity model grading based on the well known CMM index. All resources and profiles that are published as part of the FHIR specification will have an FMM grade to help implementers understand where content is.

FHIR & Semantic Exchange

I still get comments from some parts of the HL7 community about FHIR and the fact that it is not properly based on real semantic exchange. I think this is largely a misunderstanding; it’s made for 2 main reasons:

  • The RIM mappings are largely in the background
  • We do not impose requirements to handle data properly

It’s true that we don’t force applications to handle data properly. I’d certainly like them to, but we can’t force them to, and one of the big lessons from V3 development was that we can’t, one way or another, achieve that. Implementers do generally want to improve their data handling, but they’re heavily constrained by real world constraints, including cost of development, legacy data, and that the paying users (often) don’t care.

And it’s true that the RIM mappings have proven largely of theoretical value; we’ve only had one ballot comment about RIM mappings, and very few people have contributed to them.

What we do instead, is insist that the infrastructure is computable; of all HL7 specifications, only FHIR consistently has all the value sets defined and published. Anyone who’s done CCDA implementation will know how significant this is.

Still, we have a long way to go yet. A key part of our work in this area is the development of RDF representations for FHIR resources, and the underlying definitions, including the reference models, and we’ll be putting a lot of work into binding to terminologies such as LOINC, SNOMED CT and others.

There’s some confusion about this: we’re not defining RDF representations of resources because we think this is relevant to typical operational exchange of healthcare data; XML and JSON cover this area perfectly well. Where RDF representations will be useful is as the interface between operational healthcare data exchange and analysis and reasoning tools. Such tools will have applications in primary healthcare and secondary data usage.

Question: PRD segment in ORM and ORU messages?


Can the PRD segment be included in the HL7 ORM message and ORU messages? This would allow clear identification of Referring Provider and Consulting Provider.


The PRD segment is not part of the base HL7 definition for either the ORM or ORU messages.

I think that the intent is that you’d exchange the full details of Referring and Consulting Providers via some other means of transfer, such as Master File Messages (see chapter 8 of the HL7 v2 standard).

Of course, that kind of approach won’t work for some of the ways in which ORM and ORU messages are used – e.g. where the sender and receiver aren’t tightly bound in a single institution. So you can add the PRD segment if you want, but you’ll have to ensure that all the parties involved in the exchange know that it will be there and why it’s there. I’d add it after the ORC segment.
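As a rough sketch of that placement, the function below inserts PRD segments directly after the first ORC segment of a pipe-delimited v2 message. The function name and the sample message content (sender, ids, provider names) are invented for illustration, and the role codes RP and CP assume HL7 table 0286 for PRD-1. Because this is a local extension, the receiver has to agree to it.

```python
from typing import List


def add_prd_after_orc(message: str, prd_segments: List[str]) -> str:
    """Insert the PRD segments directly after the first ORC segment."""
    segments = [s for s in message.split("\r") if s]
    out = []
    inserted = False
    for segment in segments:
        out.append(segment)
        if not inserted and segment.split("|", 1)[0] == "ORC":
            out.extend(prd_segments)
            inserted = True
    if not inserted:
        raise ValueError("message has no ORC segment")
    return "\r".join(out) + "\r"


# Illustrative message only; every value in it is made up.
orm = "\r".join([
    r"MSH|^~\&|SENDER|FAC|RECEIVER|FAC|20150601120000||ORM^O01|MSG0001|P|2.5",
    "PID|1||12345^^^HOSP^MR||Smith^Joe",
    "ORC|NW|ORD001",
    "OBR|1|ORD001||41652-9^Glucose [Mass/volume] in Venous blood^LN",
]) + "\r"

# Assumed role codes: RP = referring provider, CP = consulting provider.
prd = ["PRD|RP|Jones^Mary", "PRD|CP|Lee^Sam"]
result = add_prd_after_orc(orm, prd)
print([s.split("|", 1)[0] for s in result.split("\r") if s])
# ['MSH', 'PID', 'ORC', 'PRD', 'PRD', 'OBR']
```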



FHIR doesn’t use JSON-LD. Some people are pretty critical of that:

It’s a pity hasn’t been made compatible. Enormous missed opportunity for interop & simplicity.

That was from David Metcalfe by Twitter. The outcome of our exchange after that was that David came down to Melbourne from Sydney to spend a few hours with me discussing FHIR, rdf, and json-ld (I was pretty amazed at that, thanks David).

So I’ve spent a few weeks investigating this, and the upshot is, I don’t think that FHIR should use json-ld.

Linked Data

It’s not that the FHIR team doesn’t believe in linked data – we do, passionately. From the beginning, we designed FHIR around the concept of linked data – the namespace we use is http://hl7.org/fhir and that resolves right to the spec. Wherever we can, we ensure that the names we use in that namespace are resolvable and meaningful on the hl7.org server (though I see that recent changes in the hosting arrangements have somehow broken some of these links). The FHIR spec, as a RESTful API, imposes a linked data framework on all implementations.

It’s just a framework though – using the framework to do fully linked data requires a set of additional behaviours that we don’t make implementers do. Not all FHIR implementers care about linked data – many don’t, and the more closely linked to institutional healthcare, the more important specific trading partner agreements become. One of the major attractions FHIR has in the healthcare space is that it can serve as a common format across the system, so supporting these kinds of implementers is critical to the FHIR project. Hence, we like linked data, we encourage its use, but it’s not mandatory.


This is where json-ld comes into the picture – the idea is that you mark up your json with some lightweight links, which link the information in the json representation to its formal definitions so that the data and its context can be easily understood outside the specific trading partner agreements.

We like that idea. It’s a core notion for what we’re doing in FHIR, so it sounds like that’s how we should do things. Unfortunately, for a variety of reasons, it appears that it doesn’t make sense for us to use json-ld.


Many of the reasons that json-ld is not a good fit for FHIR arise because of RDF, which sits in the background of json-ld. From the JSON-LD spec:

JSON-LD is designed to be usable directly as JSON, with no knowledge of RDF. It is also designed to be usable as RDF, if desired, for use with other Linked Data technologies like SPARQL.

FHIR has never had an RDF representation, and it’s a common feature request. There’s a group of experts looking at RDF for FHIR (technically, the ITS WGM RDF project) and so we’ve finally got around to defining RDF for FHIR. Note that this page is an editors’ draft for committee discussion – there are some substantial open issues. We’re keen, though, for people to test this, particularly the generated RDF definitions.

RDF for FHIR has 2 core parts:

  • An RDF based definition of the specification itself – the class definitions of the resources, the vocabulary definitions, and all the mappings and definitions associated with them
  • A method for representing instances of resources as RDF

Those two things are closely related – the instances are represented in terms of the class model defined in the base RDF, and the base RDF uses the instance representation in a variety of ways.

Working through the process of defining the RDF representation for FHIR has exposed a number of issues for an RDF representation of FHIR resources:

  • Dealing with missing data: a number of FHIR elements have a default value, or, instead, have an explicit meaning for a missing element (e.g. MedicationAdministration: if there is no “notGiven” flag, then the medication was given as stated). In the RDF world (well, the ontology world built on top of it) you can’t reason about missing data, since it’s missing. So an RDF representation for FHIR has to make the meaning explicit by requiring default values to be explicit, and providing positive assertions about some missing elements
  • Order does matter, and RDF doesn’t have a good solution for it. This is an open issue, but one that can’t be ducked
  • It’s much more efficient, in RDF, to change the way extensions are represented; in XML and JSON, being hierarchies (and, in XML, an ordered one), having a manifest where mandatory extension metadata (url, type) is represented is painful, and, for schema reasons, difficult. So this data is inlined into the extension representation. In RDF, however, being triple based with an inferred graph, it’s much more effective to separate these into a manifest
  • for a variety of operational reasons, ‘concepts’ – references to other resources or knowledge in ontologies such as LOINC or SNOMED CT – are done indirectly. For Coding, for instance, rather than simply having a URL that refers directly to the concept, we have system + code + version. If you want to reason about the concept that represents, it has to be mapped to the concept directly. That level of indirection exists for good operational reasons, and we couldn’t take it out. However the mapping process isn’t trivial
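To make the missing data issue concrete, here’s a minimal sketch of what “requiring default values to be explicit” could look like in a converter. The table of implied defaults, the function name and the dict-based resource are all assumptions made for illustration; the element name just mirrors the notGiven example above and isn’t taken from the specification.

```python
# Hypothetical table of (resource type, element) -> value implied when the
# element is absent. The single entry mirrors the notGiven example above;
# the element name and default are illustrative, not taken from the spec.
IMPLIED_DEFAULTS = {
    ("MedicationAdministration", "notGiven"): False,
}


def make_defaults_explicit(resource: dict) -> dict:
    """Return a copy of the resource with implied values written out."""
    explicit = dict(resource)
    for (rtype, element), default in IMPLIED_DEFAULTS.items():
        if resource.get("resourceType") == rtype and element not in resource:
            explicit[element] = default
    return explicit


# A resource that says nothing about notGiven...
admin = {"resourceType": "MedicationAdministration", "status": "completed"}
# ...gets a positive assertion that can be carried into RDF and reasoned over.
explicit_admin = make_defaults_explicit(admin)
print(explicit_admin)
```

The point is only that the step has to happen before the RDF is produced, because nothing downstream can recover a value that was never stated.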

In the FHIR framework, RDF is another representation like XML and JSON. Clients can ask servers to return resources or sets of resources using RDF instead of JSON or XML. Servers or clients that convert between XML/JSON and RDF will have to handle these issues – and the core reference implementations that many clients and servers choose to use will support RDF natively (at least, that’s what the respective RI maintainers intend to do).
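As an illustration of the Coding indirection a converter has to deal with, here’s a rough sketch that maps system + code to a single concept URI. The template table and the function name are assumptions made for this example, the example.org entry is invented, and the version is ignored entirely, which is one of the reasons the real mapping isn’t trivial.

```python
from typing import Optional

# Hypothetical lookup from a Coding "system" to a URI template for the
# concept itself. Both entries are assumptions for illustration; a real
# converter would need an agreed, maintained table and version handling.
CONCEPT_URI_TEMPLATES = {
    "http://snomed.info/sct": "http://snomed.info/id/{code}",
    "http://example.org/local-codes": "http://example.org/concept/{code}",
}


def concept_uri(coding: dict) -> Optional[str]:
    """Map a Coding (system + code) to one concept URI, or None if unknown."""
    template = CONCEPT_URI_TEMPLATES.get(coding.get("system"))
    if template is None or "code" not in coding:
        return None
    return template.format(code=coding["code"])


coding = {"system": "http://snomed.info/sct", "code": "22298006"}
print(concept_uri(coding))
```

Codings from systems that aren’t in the table simply come back as None here; deciding what to do with those is part of the open work.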

Why not to use JSON-LD

So, back to json-ld. The fundamental notion of json-ld is that you can add context references to your json, and then the context points to a conversion template that defines how to convert the json to RDF.

From a FHIR viewpoint, then, either the definition of the conversion process is sophisticated enough to handle the kinds of issues discussed above, or you have to compromise either the JSON or the RDF or both.

And the JSON –> RDF conversion defined by the JSON-LD specification is pretty simple. In fact, we don’t even get to the issues discussed above before we run into a problem. The most basic problem has to do with names – JSON-LD assumes that everywhere a JSON property name is used, it has the same meaning. So, take this snippet of JSON:

  {
    "person" : {
      "dob" : "1975-01-01",
      "name" : {
        "family" : "Smith",
        "given" : "Joe"
      }
    },
    "organization" : {
      "name" : "Acme"
    }
  }
Here, the json property ‘name’ is used in 1 or 2 different ways. It depends on what you mean by ‘meaning’. Both properties associate a human usable label to a concept, one that humans use in conversation to identify an entity, though it’s ambiguous. That’s the same meaning in both cases. However the semantic details of the label – meaning at a higher level – are quite different. Organizations don’t get given names, family names, don’t change their names when they get married or have a gender change. And humans don’t get merged into other humans, or have their names changed for marketing reasons (well, mostly 😉 ).

JSON-LD assumes that anywhere that a property ‘name’ appears, it has the same RDF definition. So that snippet above can’t be converted to json-ld by a simple addition of a json-ld @context. Instead, you would have to rename the name properties to ‘personName’ and ‘organizationName’ or similar. In FHIR, however, we’ve worked on the widely accepted practice that names are scoped by their type (that’s what types do). The specification defines around 2200 elements, with about 1500 names – so 700 of them or so use names that other elements also use. We’re not going to rename all these elements to pre-coordinate their type context into the property name. (Note that JSON-LD discussed supporting having names scoped by context – but this is an ‘outstanding’ request that seems unlikely to get adopted anytime soon).
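The naming problem can be shown with a toy expansion. This is not a JSON-LD processor; it’s a sketch that assumes a flat term-to-IRI table standing in for a simple @context, with made-up example.org IRIs, and it replaces property names with IRIs without looking at where they appear.

```python
# A flat term -> IRI table, standing in for a simple JSON-LD @context.
# The IRIs are invented for illustration.
CONTEXT = {
    "person": "http://example.org/def#person",
    "organization": "http://example.org/def#organization",
    "dob": "http://example.org/def#dob",
    "name": "http://example.org/def#name",
    "family": "http://example.org/def#family",
    "given": "http://example.org/def#given",
}


def expand_keys(node, context):
    """Replace each property name with its IRI, ignoring where it appears."""
    if isinstance(node, dict):
        return {context[k]: expand_keys(v, context) for k, v in node.items()}
    return node


doc = {
    "person": {"dob": "1975-01-01", "name": {"family": "Smith", "given": "Joe"}},
    "organization": {"name": "Acme"},
}

expanded = expand_keys(doc, CONTEXT)
person_keys = set(expanded[CONTEXT["person"]])
org_keys = set(expanded[CONTEXT["organization"]])
# The person's name and the organization's name end up as the same property.
shared = person_keys & org_keys
print(shared)
```

With a single table there’s no way to give the two uses of ‘name’ different definitions short of renaming one of them in the JSON.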

Beyond that, the other issues are not addressed by json-ld, and unlikely to be soon. Here’s what JSON-LD says about ordered arrays:

Since graphs do not describe ordering for links between nodes, arrays in JSON-LD do not provide an ordering of the contained elements by default. This is exactly the opposite from regular JSON arrays, which are ordered by default


List of lists in the form of list objects are not allowed in this version of JSON-LD. This decision was made due to the extreme amount of added complexity when processing lists of lists.

But the importance of ordering objects doesn’t go away just because the RDF graph definitions and/or syntax makes it difficult. We can’t ignore it, and no one getting healthcare would be happy with the outcomes if we managed to get healthcare processes to ignore it. The same applies to the issue with missing elements – there’s no facility to insert default values in json-ld, let alone to do so conditionally.
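Here’s a small sketch of why ordering gets lost. It models a graph as a plain Python set of (subject, predicate, object) tuples, and shows one possible workaround that attaches an explicit index to each entry. The predicate names, the node naming scheme and both functions are assumptions for the example, not what the FHIR RDF draft does.

```python
def unordered_triples(subject, predicate, values):
    """One triple per value; a set of triples has no notion of position."""
    return {(subject, predicate, v) for v in values}


def indexed_triples(subject, predicate, values):
    """One possible workaround: give each entry its own node with an index."""
    triples = set()
    for i, v in enumerate(values):
        node = f"{subject}/{predicate}/{i}"
        triples.add((subject, predicate, node))
        triples.add((node, "index", i))
        triples.add((node, "value", v))
    return triples


first = ["Joe", "Robert"]
swapped = ["Robert", "Joe"]

# Without an index the two orderings are indistinguishable.
same_graph = unordered_triples("patient1", "given", first) == \
    unordered_triples("patient1", "given", swapped)

# With an index the difference survives the conversion.
still_same = indexed_triples("patient1", "given", first) == \
    indexed_triples("patient1", "given", swapped)

print(same_graph, still_same)
```

The indexed form keeps the information, at the cost of extra nodes that every consumer then has to know how to read.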

So we could either

  • Complicate the json format greatly to make the json-ld RDF useful
  • Accept the simple RDF produced by json-ld and just say that all the reasoning you would want to do isn’t actually necessary
    • (or some combination of those two)
  • Or accept that there’s a transform between the regular forms of FHIR (JSON and XML which are very close) and the optimal RDF form, and concentrate on making implementations of that transform easy to use in practice

I think it’s inevitable that we’ll be going for the 3rd.

p.s. should json-ld address these issues? I think JSON-LD has to address the ‘names scoped by types’ issue, but for the rest, I don’t know. The missing element problem is ubiquitous across interfaces – elements with default values are omitted for efficiency everywhere – but there is a lot of complexity in these things. Perhaps there could be an @conversion which is a reference to a server that will convert the content to RDF instead of a @context. That’s not so nice from a client’s perspective, but it avoids specifying a huge amount of complexity in the conversion process.

p.p.s there’s further analysis about this on the FHIR wiki.

#FHIR DSTU ballots this year

Last week, the FHIR Management Group (FMG – the committee that has operational authority over the development of the FHIR standard) made a significant decision with regard to the future of the FHIR specification.

A little background, first. For about a year, we’ve been announcing our intent to publish an updated DSTU – DSTU 2 – for FHIR in the middle of this year. This new DSTU has many substantial improvements across the entire specification, both as a result of implementation experience from the first DSTU, and in response to market and community demand for additional new functionality. Preparing for this publication consists of a mix of activities – outreach and ongoing involvement in the communities and projects implementing FHIR, a set of standards development protocols to follow (internal HL7 processes), and ongoing consultation with an ever growing list of other standards development organizations. From a standards view point, the key steps are two-fold: a ‘Draft for comment’ ballot, and then a formal DSTU (Draft Standard for Trial Use).

  • Draft For comment: really, this is an opportunity to do formal review of the many issues that arose across the project, and a chance to focus on consistency across the specification (We held this step in Dec/Jan)
  • DSTU: This is the formal ballot – what emerges after comment reconciliation will be the final DSTU 2 posted mid-year

In our preparation for the DSTU ballot, which is due out in a couple of weeks’ time, what became clear is that some of the content was further along in maturity than other parts of it; some have had extensive real world testing, and others haven’t – or worse, the real world testing that has occurred has demonstrated that our designs are inadequate.

So for some parts of the ballot, it would be better to hold off the ballot and spend more time getting them absolutely right. This was especially true since we planned to only publish a single DSTU, then wait for another 18 months before starting the long haul towards a full normative standard. This meant that anything published in the DSTU would stand for at least 2 years, or if it missed out, it would be at least 2 years before making it into a stable version. For this content, there was a real reason to wait, to hold off publishing the standard.

On the other hand, most of the specification is solid and has been well tested – it’s much further along the maturity pathway. Further, there are a number of implementation communities impatient to see a new stable version around which they can focus their efforts, one that’s got the improvements from all the existing lessons learned, and further, one with a broader functionality to meet their use case. The two most prominent communities in this position are Argonaut and HSPC, both of which would be seriously impeded by a significant delay in publishing a new stable version – and neither of which use the portions of the specifications that are behind in maturity.

After discussion, what FMG decided to do is this:

  • Go ahead with the ballot as planned – this meets the interests of the community focused on PHR/Clinical record exchange
  • Hold a scope limited update to the DSTU (planned to be called 2.1) later this year for those portions of the DSTU that are identified as being less mature

The scope limited update to the DSTU will not change the API, the infrastructure resources, or the core resources such as Patient, Observation etc. During the ballot reconciliation we’ll be honing the exact scope of the DSTU update project. Right now, these are the likely candidates:

  • the workflow/process framework (Order, OrderRequest, and the *Request/*Order resources)
  • The financial management resources

For these, we’ll do further analysis and consultation – both during the DSTU process and after it, and then we’ll hold a connectathon (probably October in Atlanta) in order to test this.

Canadian FHIR Connectathon

FHIR® North

Canada’s FHIR Connectathon

Event Details

Date: April 29th

Time: 9:00AM – 6:00PM

Location: Mohawk College, 135 Fennell Ave W, Hamilton, ON L9C 1E9

Room: Collaboratory (2nd Floor – Library)

Registration Cost: $45.00 (Entry, lunch, coffee break, pizza dinner)

Registration Site: http://www.mohawkcollegeenterprise.ca/en/event_list.aspx?groupId=3

Event Description

A FHIR connectathon is an opportunity for developers to come together to test their applications to determine if they can successfully interoperate using the HL7 FHIR specification. Participants will have a chance to meet some of the world’s leading FHIR experts and ask them questions, a chance to see whether FHIR really lives up to the hype and a chance to shape the specification.

Don’t miss out on this opportunity. Register now at http://www.mohawkcollegeenterprise.ca/en/event_list.aspx?groupId=3. For further details regarding this event, please review the attached PDF.

FHIR is attracting significant interest world-wide. Although the standard is still evolving, it’s being used in production in multiple countries. Numerous connectathons have been held in the U.S., Europe, South America and Australasia. It seems time to give Canadian developers a chance to take it out on the road.

FHIR North v1.0 (PDF)

Note: “HL7” and “FHIR” are registered trademarks of Health Level Seven International