
Profiles and Exceptions to the Rules

One of the key constructs in FHIR is a “profile”. A profile is a statement of how FHIR resources are used for a particular solution – or how they should be used. The FHIR resources are a general purpose construct, and you can do general purpose things with them, such as storing the data in a PHR or providing a generally useful display of a clinical record.

But if you’re going to do something more specific, then you need to be specific about the contents. Perhaps, for instance, you’re going to write a decision support module that takes in ongoing glucose and HbA1c measurements, and keeps the patient informed about how well they are controlling their diabetes. In order for a patient or an institution to use that decision support module well, the author of the module is going to have to be clear about what are acceptable input measurements – and it’s very likely, unfortunately, that the answer is ‘not all of them’. Conversely, if the clinical record system is going to allow its users to hook up decision support modules like this, it’s going to have to be clear about what kind of glucose measurements it might feed to the decision support system.

If both the decision support system and the clinical records system produce profiles, a system administrator might even be able to get an automated comparison to see whether they’re compatible. At least, that’s where we’d like to end up.
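A rough sketch of what such an automated comparison could look like, in Python. This is not how FHIR profiles are actually represented or compared: each profile is reduced to a hypothetical `SimpleProfile` holding sets of allowed codes, units and value types, and the producer counts as compatible when everything it might send is something the consumer accepts. The codes and units are the ones from the glucose example in this post; the decision support module accepting only mmol/L is an invented assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleProfile:
    """Hypothetical, heavily simplified stand-in for a FHIR profile."""
    codes: frozenset        # LOINC codes that may appear
    units: frozenset        # units of measure that may appear
    value_types: frozenset  # e.g. {"Quantity"} or {"Quantity", "string"}


def incompatibilities(producer: SimpleProfile, consumer: SimpleProfile) -> dict:
    """Return what the producer may send that the consumer does not accept."""
    gaps = {
        "codes": producer.codes - consumer.codes,
        "units": producer.units - consumer.units,
        "value_types": producer.value_types - consumer.value_types,
    }
    # Keep only the fields where there is a real gap.
    return {field: sorted(extra) for field, extra in gaps.items() if extra}


# What the clinical record system says it may send.
record_system = SimpleProfile(
    codes=frozenset({"39480-9", "41652-9", "14743-9"}),
    units=frozenset({"mg/dL", "mmol/L"}),
    value_types=frozenset({"Quantity"}),
)

# What an imagined decision support module says it can handle (assumption).
decision_support = SimpleProfile(
    codes=frozenset({"39480-9", "41652-9", "14743-9"}),
    units=frozenset({"mmol/L"}),
    value_types=frozenset({"Quantity"}),
)

print(incompatibilities(record_system, decision_support))
# prints {'units': ['mg/dL']}
```

An empty result would mean the two declarations are compatible, at least as far as this simplified comparison can tell.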

For now, however, let’s just consider the rules themselves. A clinical record system might find itself in this situation:

  • We can provide a stream of glucose measurements to the decision support system
  • They’ll come from several sources – labs, point of care testing devices, inpatient monitoring systems, and wearables
  • There’s usually one or more intermediary systems between the actual glucose measurement, and the clinical record system (diagnostic systems, bedside care systems, home health systems – this is a rapidly changing space)
  • Each measurement will have one of a few LOINC codes (say, 39480-9: Glucose [Moles/volume] in Venous blood, 41652-9: Glucose [Mass/volume] in Venous blood,
    14743-9: Glucose [Moles/volume] in Capillary blood by Glucometer)
  • the units of measure will be mg/dL or mmol/L
  • there’ll be a numerical value, perhaps with a greater than or less than comparator (e.g. >45mmol/L)

So you can prepare a FHIR profile that says this one way or another. And then a decision support engine can have a feel for what kind of data it might get, and make sure it can handle it all appropriately.
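To make the rules above concrete, here is a minimal sketch in Python that checks an Observation, held as a plain dict in FHIR JSON shape, against them. It is an illustration only, not a FHIR profile and not a real validator; `check_glucose_observation` and the sample resources are hypothetical. The comparator set is limited to `<` and `>` to match the list above, although the FHIR Quantity type also defines `<=` and `>=`.

```python
ALLOWED_LOINC = {"39480-9", "41652-9", "14743-9"}
ALLOWED_UNITS = {"mg/dL", "mmol/L"}
ALLOWED_COMPARATORS = {"<", ">"}


def check_glucose_observation(obs: dict) -> list:
    """Return a list of rule violations; an empty list means the resource conforms."""
    problems = []

    # Rule: the measurement carries one of the expected LOINC codes.
    codings = obs.get("code", {}).get("coding", [])
    if not any(
        c.get("system") == "http://loinc.org" and c.get("code") in ALLOWED_LOINC
        for c in codings
    ):
        problems.append("code is not one of the expected LOINC glucose codes")

    # Rule: there is a numerical value, expressed as a Quantity.
    quantity = obs.get("valueQuantity")
    if quantity is None:
        problems.append("no numerical value (valueQuantity) present")
        return problems

    value = quantity.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append("value is not a number")

    # Rule: units are mg/dL or mmol/L.
    if quantity.get("unit") not in ALLOWED_UNITS:
        problems.append("unit is not mg/dL or mmol/L")

    # Rule: an optional greater-than or less-than comparator.
    comparator = quantity.get("comparator")
    if comparator is not None and comparator not in ALLOWED_COMPARATORS:
        problems.append("comparator is not < or >")

    return problems


conforming = {
    "resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org", "code": "14743-9"}]},
    "valueQuantity": {"value": 45, "comparator": ">", "unit": "mmol/L"},
}
wrong_unit = {
    "resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org", "code": "41652-9"}]},
    "valueQuantity": {"value": 1.1, "unit": "g/L"},
}

print(check_glucose_observation(conforming))  # prints []
print(check_glucose_observation(wrong_unit))  # prints ['unit is not mg/dL or mmol/L']
```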

So that’s all fine. But…

Eventually, the integration engineers that actually bring the data into the system discover – by looking at rejected messages (usually) – that 1 in a million inbound glucose measurements from the lab contain a text message instead of a numerical value. The message might be “Glucose value too high to determine”. Now what? From a clinical safety perspective, it’s almost certain that the integration engineers won’t replace “too high to determine” with a “>N” where N is some arbitrarily chosen number – there’s no number they can choose that isn’t wrong. And they won’t be able to get the source system to change their interface either – that would have other knock-on effects for other customers / partners of the source system. Nor can they drop the data from the clinical record – it’s the actual test result. So they’ll find a way to inject that value into the system.

By the way, as an aside: some of the things that go in this string value could go in Observation.dataAbsentReason, but they’re not coded, and it’s not possible to confidently decide which are missing reasons, and which are ‘text values’. So dataAbsentReason isn’t a solution to this case, though it’s always relevant.

Now the system contains data that doesn’t conform to the profile it claimed to use. What should happen?

  1. The system hides the data and doesn’t let the decision support system see it
  2. The system changes its profile to say that it might also send text instead of a number
  3. The system exposes the non-conformant data to the decision support system, but flags that it’s not valid according to its own declarations

None of these is palatable. I assume that #1 isn’t possible, at least, not as a blanket policy. There’s going to be some clinical safety reason why the value has to be passed on, just the same as the integration engineers passed it on in the first place, so that they’re not liable.

Option #2 is a good system/programmer choice – just tell me what you’re going to do, and don’t beat around the bush. And the system can do this – it can revise the statement ‘there’ll be a numerical value’ to something like ‘there’ll be a numerical value, or some text’. At least this is clear.

Only it creates a problem – now, the consumer of the data knows that they might get a number, or a string. But why might they get a string? What does it mean? Someone does know, somewhere, that the string option is used 1 in a million times, but there’s no way (currently, at least) to say this in the profile – it just says what’s possible, not what’s good, or ideal, or common. If you start considering the impact of data quality on every element – which you’re going to have to do – then you’re going to end up with a profile that’s technically correct but quite non-communicative about what the data might be, and one that gives implementers no guidance as to what the data should be or what they should do. (And observationally, if you say that it can be a string, then, hey, that’s what the integration engineers will do too, because it’s quicker….)

That’s what leads to the question about option #3: maybe the best thing to do is to leave the profile saying what’s ideal, what’s intended, and let systems flag non-conforming resources with a tag, or wrong elements with an extension? Then the consumer of the information can always check, and ignore it if they want to.
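A minimal sketch of the tagging idea in Python, using the `meta.tag` list that FHIR resources carry in their JSON form. The tag system and code are made up for this illustration; no such tag is defined by the specification, and the helper names are hypothetical too.

```python
import copy

# Hypothetical tag; nothing like this is defined by the FHIR specification.
NON_CONFORMANT_TAG = {
    "system": "http://example.org/fhir/conformance-flags",
    "code": "does-not-match-declared-profile",
}


def flag_non_conformant(resource: dict) -> dict:
    """Return a copy of the resource carrying the non-conformance tag in meta.tag."""
    flagged = copy.deepcopy(resource)
    tags = flagged.setdefault("meta", {}).setdefault("tag", [])
    if NON_CONFORMANT_TAG not in tags:
        tags.append(dict(NON_CONFORMANT_TAG))
    return flagged


def is_flagged(resource: dict) -> bool:
    """Consumer-side check: does the resource carry the non-conformance tag?"""
    return any(
        tag.get("system") == NON_CONFORMANT_TAG["system"]
        and tag.get("code") == NON_CONFORMANT_TAG["code"]
        for tag in resource.get("meta", {}).get("tag", [])
    )


def usable_glucose_values(observations: list) -> list:
    """Collect numeric values, skipping anything the source flagged as non-conformant."""
    return [
        obs["valueQuantity"]["value"]
        for obs in observations
        if not is_flagged(obs) and "valueQuantity" in obs
    ]


numeric = {
    "resourceType": "Observation",
    "valueQuantity": {"value": 6.1, "unit": "mmol/L"},
}
text_only = flag_non_conformant({
    "resourceType": "Observation",
    "valueString": "Glucose value too high to determine",
})

print(is_flagged(text_only))                        # prints True
print(usable_glucose_values([numeric, text_only]))  # prints [6.1]
```

Here the consumer simply drops flagged resources; whether ignoring them is clinically acceptable is a separate question that the code does not answer.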

That is, if they know about the flag, and remember. Which means we’d need to define it globally, and the standard itself would have to tell people to check for data that isn’t consistent with its claims… and then we’d have to add overrides to say that some rules actually mean what they say, as opposed to not actually meaning that…. it all sounds really messy to me.

Perhaps, the right way to handle this is to have ideal and actual profiles? That would mean an extension to the Conformance resource so you could specify both – but already the interplay between system and use case profiles is not well understood.

I think this area needs further research.

p.s. There’s more than some passing similarity between this case and the game of ‘hot potato’ I used to play as a kid: ‘who’s going to have to do something about this bad data’.

#FHIR for Clinical Users

One of the outstanding issues for FHIR has been to make the specification more penetrable for clinical users – or, more precisely, for non-technical users. FHIR is framed in a technology setting, and if you aren’t familiar with the technologies, then it’s hard to know where to start. I committed to doing something about that, so here’s a short “FHIR for Clinical Users” introduction:

This explains how the API works using a simple analogy. Feedback to continue to improve this is welcome. It will make its way into the next version of the specification once we start working on that.

I’d like to thank Josh Mandel, Heather Leslie, Tim Benson, David Hay and Lloyd Mackenzie for contributing to the document.


CIMI at the Crossroads

The Clinical Information Modelling Initiative (CIMI, see here, and here) is

“an international collaboration that is dedicated to providing a common format for detailed specifications for the representation of health information content so that semantically interoperable information may be created and shared in health records, messages and documents”

CIMI is one of a number of efforts that have been started to try and define a common format for such specifications; all the previous efforts (mostly going by the name of DCM, “detailed clinical models”) have gotten bogged down in methodology questions and political games of various sorts, and they’ve failed to produce something that people might actually use.

CIMI shows every sign of following the same trail to the same dead end.

From the beginning, the CIMI initiative sought to produce a different outcome from previous efforts by trying to be agnostic on the tribal and political issues that have bedeviled the previous efforts. In particular:

  • The membership of CIMI included all the significant players in the space, not only some of them
  • The charter always included CIMI providing the capability to express the clinical models in a series of different formalisms (i.e. XML, Java, HL7 v2, EN13606, CDA, openEHR etc) by the provision of some “compiler”

The membership point was really new – and for the first time there was real hope that something might come from this. The first task for CIMI was to choose an internal methodology that would be used as the primary expression of the models. The initiative held a meeting in London in Nov 2011 to choose between the following candidate approaches:

  • UML/OCL and associated OMG standards
  • 13606-2/ADL 1.4
  • ADL 1.5
  • Semantic Web technology (OWL, RDF, Protégé, and associated tools and standards)
  • HL7 v3 approach (MIF, HL7 RIM, static models and associated artifacts and tools)

In spite of the fact that these things are not at all alike, a comparison was performed, and the group decided… well, let’s quote from the press release:

  • ADL 1.5 will be the initial formalism for representing clinical models in the repository.
  • CIMI will use the openEHR constraint model (Archetype Object Model:AOM).
  • A set of UML stereotypes, XMI specifications and transformations will be concurrently developed using UML 2.0 and OCL as the constraint language.
  • A Work Plan for how the AOM and target reference models will be maintained and updated will be developed and approved by the end of January 2012.

In other words, the group chose AOM/ADL, but it seems to me it was unable to get full consensus, hence the mention of UML/OCL. Note that the exact relationship between ADL 1.5 and UML is not spelled out.

Well, January 2012 has passed, and there is no work plan – because there still doesn’t seem to be any consensus about the methodology, let alone the reference model. As far as I can tell, the participants who favour UML/OCL have continued on as if ADL/AOM wasn’t the initial formalism. The follow-up meeting in San Antonio in Jan 2012 was characterised by continued argument about UML vs ADL. CIMI still doesn’t have consensus about the stuff already decided, let alone the hard stuff to come.

I’ve been an interested observer to CIMI from the beginning – it’s a great goal that we really need to see solved, the best group of people that we’ve got together on this subject, and there was real hope. Due to resource constraints, I’ve never been a formal member of the initiative, but I have attended the CIMI meetings and teleconferences whenever possible. But it’s never seemed to me that the participants are being realistic.

The core problem is reaching compromise. This was obviously going to be a problem here – many of the participants at CIMI have many millions invested in their systems, and I never could see how CIMI would avoid the outcome I described:

…build a complicated framework that allows both solutions to be described within the single paradigm, as if there isn’t actually contention that needs to be resolved, or that this will somehow resolve it. This is expensive – but not valuable; it’s just substituting real progress with the appearance thereof.

As you can see, CIMI is well on the way to building a complicated framework, and providing only the appearance of progress.

For me, this was underscored by the decision to choose ADL/AOM as the methodology, while deferring the choice of reference model. While I understood the political reality of this decision, choosing an existing methodology (ADL/AOM) but not the openEHR reference model committed CIMI to building at least a new tooling chain, a new community, and possibly a new reference model.

Each of these is spectacularly hard and expensive. At a minimum, using semi-volunteer labour of loving experts who are building their own empire, you have to estimate the cost of tooling at greater than $2M (and doing it on a straight commercial basis, upwards of $6M). Reference models take years – as in, a decade – to build, and the blood, sweat and tears of many people. This also equates to millions of dollars one way or another. Building a community around a methodology and tool-chain is the same. So CIMI committed itself to these kinds of expenditures of $$, energy and ego, but I can’t think that any of the participants really thought that CIMI can actually call on those resources before it produces anything of value.

As for UML, the plan called for “a set of UML stereotypes, XMI specifications and transformations” – this is the same error. The point of UML is that the average implementer knows how to make it work, and has tools that can leverage the models. Each stereotype you define erodes that advantage, and as soon as you define a really important stereotype – and why bother if it’s not? – then off the shelf tools can no longer be used. As for developing XMI specifications… who’s going to support that? This is known as “snatching defeat from the jaws of victory”.

I can’t see that CIMI is on a path to producing anything, let alone a methodology that people will be happy to use, offered the choice.

So what should CIMI do? As I see it, there are two pragmatic choices. CIMI needs to pick one, or accept that it’s never going to reach consensus with the resources available:


That’s right. Just bite the bullet and pick the whole openEHR stack. They’ve got a reference model. They’ve got tooling – the archetype designer (open source!) and the CKM. They’ve got a community (using the CKM). They’ve got runs on the board with published models. It’s there, waiting to go.

I recognise that simply picking openEHR holus-bolus like this is extremely distasteful to many people. OpenEHR is still missing a few things across the stack, and the reference model is too EHR-specific rather than being a general clinical model – and it seems unlikely that CIMI has the resources to change these things, so we’d just have to live with the way it is and work with them. And of course, there’s a series of personal and political factors.

This is the first choice: pick the least worst established clinical modelling paradigm.


The second option is to abandon any hope of a clinical-friendly modelling tool, and bite the bullet by adopting UML. This is the IT centric solution. But if you’re going to do this, do it as simply as possible. No fancy stereotypes. In fact, no enforcing of a reference model (it’s the reference model that complicates everything). Given the fragility of the UML tools (i.e. total lack of interoperability between tools), CIMI should ban anything other than classes, attributes, and associations. No stereotypes, no properties, no profiles. That’d mean a lot of missing functionality – but we’d just have to live with that and work around it.

The real price of this isn’t that UML isn’t clinical-friendly, it’s the reference model. Given the cost of creating a reference model, and the fact the existing reference models aren’t created to be used in such a brutally simplistic way, this approach involves abandoning a serious reference model – and that’s exactly what some of the participants want, not understanding what it is that a reference model achieves.

Hybrid Models

From the beginning, CIMI has wanted to explore the auto-generation of multiple formats – CDA templates, v2, openEHR, java, xml, etc. Java, various forms of XML – that makes perfect sense. But the others? I spend quite a bit of time converting models and/or instances between the CDA, v2, and openEHR worlds, and they’re not just alternative syntaxes – they have completely different ways of understanding the world (or not, for v2). Real human input is required to effect these transforms. In the end, any auto-generation facility would become a transparent syntax conversion layer, and the CIMI models would have to contain the expression of the model in each of the target formalisms. It’s hard enough to model against one paradigm, let alone all 3 (or more). Whatever CIMI produced from this path would be a methodology that very few uber-experts could make use of. This isn’t an option.

Neither of the two choices is really palatable. But what other practical choices exist? CIMI is at a crossroads – it needs to pick something that will work.

p.s. Actually, several people have pointed out to me that FHIR might be a logical choice for CIMI – but FHIR’s got the slight problem that it doesn’t actually exist yet, so I’m not going to push that forward for CIMI. Yet.