
#FHIR – Adding Identifier to Reference

One of the more controversial sessions at the Baltimore meeting was where we discussed task 10659:  Reference should support logical references

This is a difficult subject because it relates to handling and exchanging information where a formal URL-based view of the world is incomplete. In effect, that means, from a purist point of view, the world is already broken. This makes effective resolution difficult; at best, the resolution is also going to be broken – but is it broken in the least unuseful way? – and that’s a bad place to try to get consensus from.

Further, in the FHIR framework, the Reference data type is clearly FMM level 5 – it’s been hammered everywhere from the beginning and is in production in all sorts of places. And we’ve said that we won’t make breaking changes to FMM level 5 artefacts without consultation with the users. What’s not clear, though, is whether any of the changes that were proposed in the extensive discussion leading up to the quarter devoted to the task (see, for example, here and here) were actually breaking changes.

The outcome of the meeting was that we would add an Identifier to the Reference data type. Given the background, there was fairly strong consensus for the change, but it wasn’t complete. As a consequence, we said that we would define the change, let everyone look at it for a month, and see if any argument comes up that we had not considered.

So I’ve made the changes: see the Reference data type in the current build. Implementers are encouraged to evaluate the impact of this addition in their implementations, and share their findings on the FHIR email list.

Note: since this is not strictly a breaking change, this is not strictly consulting the implementers before making a breaking change 😉

#FHIR: Comments in JSON

We use XML comments extensively in FHIR for the examples in the specification:

  <Patient xmlns="">
    <!-- this example patient shows how to use ... -->

Generally, comments are only useful in examples for the specification and implementation guides.

Unlike XML, JSON doesn’t have a comments syntax. Standing advice is to put the comment in some property, so that’s what we did:

    "resourceType" : "Patient",
    "fhir_comments" : "this example patient shows how to use ..."

But there are problems with that approach – your parser has to parse them, and many people have trouble with the idea that you can truly ignore them. Then, if you’re using JSON schema, you have to allow for them. Also, if you’re round-tripping with XML, you lose the exact position of the comment. Finally, most human readers miss the comment anyway, since it’s in a property – and the human readers are the only point of having comments. (So many people use custom variations to include comments, but that’s not an option we have. The JSON community is discussing that, but it seems unlikely it will change, or if it changes, be widely adopted.)
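To make the parsing burden concrete, here is a minimal Python sketch of what a consumer has to do if it wants to ignore these properties. Only the property name `fhir_comments` comes from the text; the `strip_comments` helper and the sample patient (including the nested `name` entry) are hypothetical, and the helper removes the property whether its value is a string or an array of strings.

```python
import json
from typing import Any

COMMENT_KEY = "fhir_comments"  # property name taken from the convention described above


def strip_comments(node: Any) -> Any:
    """Return a copy of parsed JSON with every fhir_comments property removed."""
    if isinstance(node, dict):
        return {
            key: strip_comments(value)
            for key, value in node.items()
            if key != COMMENT_KEY
        }
    if isinstance(node, list):
        return [strip_comments(item) for item in node]
    return node


# Hypothetical instance with a top-level and a nested comment.
raw = """
{
  "resourceType": "Patient",
  "fhir_comments": ["this example patient shows how to use ..."],
  "name": [{"fhir_comments": ["hypothetical nested comment"], "family": "Example"}]
}
"""

patient = strip_comments(json.loads(raw))
print(json.dumps(patient))
```

Note that once the comments are stripped there is no way to put them back in the same position, which is the round-tripping problem mentioned above.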

The cumulative effect of all this is that, as a consequence of a ballot comment (disclosure: from me) the HL7 ITS committee proposes to remove support for comments from the FHIR JSON format. Specifically, we propose to remove this paragraph:

There is no inherent support in JSON for a comment syntax. As a convention, content that would be comments in an XML representation is represented in a property with the name fhir_comments, which is an array of strings, which can appear on any JSON object. This is heavily used in example instances, e.g. in this specification, but not usually used in production systems (and production systems may choose to reject resources with comments in them)

We don’t know of anyone using this in practice, and for practical reasons, I removed comments from the actual JSON examples in the current version earlier this year, and we’ve had no comments about that. Nevertheless, the JSON format is labelled FMM 5 (see the specification definition), so we need to consult with the implementation community about this.

So we are calling for comment from implementers about this change. If you wish to comment on this change, please send an email to the FHIR email list before October 11th 2016. If you comment against this change, make sure you identify the production implementation that would be affected by the change, and how it would be affected.

Call for comments: Logical references

Two tasks have been created from the current FHIR Ballot: 10354, and 10659. Both propose the same basic idea: to allow a reference to refer to an item by its logical identifier rather than a direct URL. This is a pretty significant change, so we’re calling for comments from the FHIR Implementer community.

The basic idea is pretty simple. A reference from one resource to another is a URL that identifies where the content of the resource can be found. For example, Observations nominate the patient that the observation is made on:

  <Observation xmlns="">
      <reference value=""/>

This means that the patient resource can be accessed at the given URL. A reference can also be relative, like this:

  <Observation xmlns="">
      <reference value="Patient/example"/>

This means that the patient resource location is relative to the end-point at which the Observation was found. Both these approaches assume that the author of the resource knows exactly where the patient resource is found. In a purely RESTful environment, that’s a good safe assumption. But the FHIR community is encountering a number of situations where the address at which a resource is located is not known, even though the key identifying information about the target is. In the case in point, this means that the application has a patient identifier, but it doesn’t know where the server for the patient is (or it doesn’t exist), and even if it does, the patient resource id the server uses isn’t the same as the patient identifier, and there is no obvious conversion (in fact, quite often the patient identifier is not sufficiently reliable to use in the URL – they change…).
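As an illustration of the two URL forms, the following Python sketch turns a reference value into an absolute URL. It is only a sketch under stated assumptions: the `resolve_reference` helper and the `http://example.org/fhir` end-point are hypothetical, and the reader is assumed to know the base address of the end-point from which the Observation was fetched.

```python
from urllib.parse import urlsplit


def resolve_reference(reference: str, service_base: str) -> str:
    """Turn a reference value into an absolute URL.

    Absolute references (anything with a URL scheme) are returned unchanged;
    relative ones are appended to the end-point the resource was read from.
    """
    if urlsplit(reference).scheme:
        return reference
    return service_base.rstrip("/") + "/" + reference


BASE = "http://example.org/fhir"  # hypothetical end-point, not from the original post

relative = resolve_reference("Patient/example", BASE)
absolute = resolve_reference("http://other.example.org/fhir/Patient/23", BASE)
print(relative)
print(absolute)
```

Neither branch helps when the writer has only an identifier and no address at all, which is the situation the proposal is meant to cover.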

So the proposal is that FHIR should change so that this is valid:

  <Observation xmlns="">
        <system value=""/>
        <value value="5551234567"/>

This is a reference to a patient by their identifier. It’s the responsibility of the reader of the resource to decide how to resolve this reference to a patient resource (or some other kind of resource) – if they can. Or to decide what to do about it if there are multiple candidate matches – which is also possible.
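What “the reader decides” might look like can be sketched in Python as follows. Everything here is an assumption made for illustration: the in-memory `PATIENTS` list stands in for whatever store or search service the reader has, the `urn:example:mrn` system is invented (the system in the example above is not shown), and `find_candidates` is a hypothetical helper, not part of any FHIR library.

```python
from typing import Dict, List

# Hypothetical local store of patient resources, each with business identifiers.
PATIENTS: List[Dict] = [
    {"id": "a1", "identifier": [{"system": "urn:example:mrn", "value": "5551234567"}]},
    {"id": "b2", "identifier": [{"system": "urn:example:mrn", "value": "5559999999"}]},
    {"id": "c3", "identifier": [{"system": "urn:example:mrn", "value": "5551234567"}]},
]


def find_candidates(system: str, value: str, resources: List[Dict]) -> List[Dict]:
    """Return every resource carrying the given system/value identifier."""
    return [
        resource
        for resource in resources
        if any(
            ident.get("system") == system and ident.get("value") == value
            for ident in resource.get("identifier", [])
        )
    ]


matches = find_candidates("urn:example:mrn", "5551234567", PATIENTS)

if not matches:
    print("unresolved: no patient carries this identifier")
elif len(matches) == 1:
    print("resolved to Patient/" + matches[0]["id"])
else:
    # The reader, not the writer, has to decide what to do with several candidates.
    print("ambiguous: " + ", ".join(m["id"] for m in matches))
```

With this sample data the lookup is ambiguous, because two stored patients share the identifier; what to do next is entirely a policy choice on the reading side.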

This is a fairly significant change. There are already resources where this pattern is baked in by using a data type choice – see, for example, ExplanationOfBenefit:


Generalising this pattern so it applies anywhere there’s a reference – without having to be explicit in the design as above – is both good and bad, depending on who you are. Generally, it allows a resource writer to move work to the resource reader, but it also allows a resource writer to write resources it otherwise couldn’t.

We’re calling for comments on this issue from implementers, to help us decide. Please comment on the second task (10659)

FHIR Issue: Invariants based on dataAbsentReason

This is a ballot comment made against the FHIR specification:

The presence of a DAR is used in several cases in the datatypes as a means of loosening the rules for what datatype properties need to be present. However, this is mixing two things. DAR is relevant when there’s a use-case for knowing why an element is missing. This is a distinct use-case from choosing to allow partial or incomplete data. For example, I might want to allow a humanId that doesn’t allow unique resolution without wanting to capture “why the id isn’t fully specified”. We need to separate “Partial data allowed” from “reason for absent/incomplete data allowed”.


In the v3 data types (+ISO 21090) you can label a data type as “mandatory”. If you do so, it must be present, and it must have a proper value. Specifically, this means that it must not be null – there must be no nullFlavor, either applied explicitly, or implied by simply leaving the attribute right out of the XML representation. Each type definition can link into this rule and make extra rules about what other data type attributes must have values if there’s no nullFlavor. For instance, with the type II, which has a root and an extension, if there’s no nullFlavor, there must be at least a root:

invariant(II x) where x.nonNull { root.nonNull; };

By implication, the root or root/extension must also be globally unique: this must be a proper identifier. This system makes it easy to say that an instance has to have a proper identifier for something: simply label the id : II attribute as mandatory.
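The II invariant can be read as a plain implication: if the value is non-null, then its root must be non-null. The Python sketch below models that reading. The `II` dataclass and its field names are simplified stand-ins for illustration, not the ISO 21090 definition, and it does not attempt to check global uniqueness.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class II:
    """Simplified stand-in for the v3 II type (not the full ISO 21090 definition)."""
    root: Optional[str] = None
    extension: Optional[str] = None
    null_flavor: Optional[str] = None

    @property
    def non_null(self) -> bool:
        # A value is non-null when no nullFlavor has been applied to it.
        return self.null_flavor is None


def satisfies_invariant(x: II) -> bool:
    """x.nonNull implies root.nonNull."""
    return (not x.non_null) or x.root is not None


print(satisfies_invariant(II(root="1.2.3", extension="42")))  # non-null, has a root
print(satisfies_invariant(II(extension="42")))                # non-null, no root
print(satisfies_invariant(II(null_flavor="UNK")))             # null, so nothing else is required
```

Labelling an attribute as mandatory then amounts to additionally requiring `non_null` to be true, which forces a root to be present.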

FHIR follows this same pattern, though the presentation is different. When you include an element in a resource, you can indicate a minimum cardinality, and say whether a dataAbsentReason (which equates to a nullFlavor) is allowed.

 <identifier d?><!-- 1..1 Identifier A system id for this resource --></identifier>

This says that the resource must have an identifier, but it can have a dataAbsentReason. So you could do something like this:


Ok, an identifier. But you could also do this:

 <identifier dataAbsentReason="notasked">  

This indicates that the identifier (a national healthcare one in this case) simply wasn’t asked. So, how does a resource definition say that there must be an identifier – that you can’t get away with providing an incomplete identifier? like this:

 <identifier><!-- 1..1 Identifier A system id for this resource --></identifier>

Because the identifier doesn’t allow a dataAbsentReason (no “d?”), the second form is not allowed. Only, what stops this following form from being allowed?:


The answer is this constraint made on the Identifier type:

Unless an Identifier element has a dataAbsentReason flag, it must contain an id (xpath: exists(@dataAbsentReason) or exists(f:id))
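For readers who prefer code to XPath, the same constraint can be sketched in Python with the standard library. The function name `identifier_is_valid` and the sample instances are hypothetical, and binding the `f:` prefix to the `http://hl7.org/fhir` namespace is an assumption made here, since the namespace is not shown in the examples above.

```python
import xml.etree.ElementTree as ET

# Assumption: the f: prefix in the XPath is bound to this namespace.
FHIR_NS = "http://hl7.org/fhir"


def identifier_is_valid(identifier: ET.Element) -> bool:
    """Mirror of: exists(@dataAbsentReason) or exists(f:id)."""
    has_reason = "dataAbsentReason" in identifier.attrib
    has_id = identifier.find(f"{{{FHIR_NS}}}id") is not None
    return has_reason or has_id


# Hypothetical instances covering the three cases discussed above.
with_id = ET.fromstring(f'<identifier xmlns="{FHIR_NS}"><id value="12345"/></identifier>')
with_reason = ET.fromstring(f'<identifier xmlns="{FHIR_NS}" dataAbsentReason="notasked"/>')
empty = ET.fromstring(f'<identifier xmlns="{FHIR_NS}"/>')

print(identifier_is_valid(with_id))
print(identifier_is_valid(with_reason))
print(identifier_is_valid(empty))
```

When the resource definition omits the “d?” flag, the `dataAbsentReason` branch is unavailable, so only the form with an id passes.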

Response to Comment

The issue that the commenter has is that two separate ideas are conflated: whether you can allow incomplete data, and whether you need to say why incomplete data is provided. These are two different things, but we always conflated them in v3. And we did that because it’s easy: if you unhook these things, then it’s much more difficult to say that a proper value (i.e. identifier) must be provided. Instead of saying that you just can’t provide a dataAbsentReason, in addition (or alternatively), you have to define what a ‘proper’ value is – and potentially, therefore, how this relates to the expected use of dataAbsentReason. This will be much more complicated than the current system.

So there are two separate things to discuss with relation to this comment:

  • Does the case – providing incomplete data without having to/being able to provide a reason for incomplete data – justify making the implementation experience much more complicated?
  • If it does, how would you provide these rules most effectively?

1. Is the case justified?

I don’t think it is – it’s never come up in all my v3 experience, either in my own implementations, in my experience as the go-to guy for the v3 data types, or in committee. I’m pretty sure I would remember it if it had. Why not just use unknown? I guess there’s some fractional use case where it might not be unknown, but you can’t say what it is, and you can’t use some other dataAbsentReason. Maybe we should add a dataAbsentReason “?” for use in this case?

2. How else to specify the rules

Well, in the end, the rules are specified by XPath (exists(@dataAbsentReason) or exists(f:id)) – this is what enforcement is based on. So the most obvious thing is to take this out of the Identifier datatype and push it to the resource definition. We’re going to get XPath all over the place… I think that this is a real cost for the implementers.

An alternative approach is to define a profile for each data type that says what a “proper” value is. This offers re-use and flexibility, but would mean that a key aspect of many resources – basic pre-conditions for validity – is moved to somewhere more obscure, which will make for more complexity. (note that you can’t profile data types at the moment – the commenter made a request to be able to profile data types as well, which we had not allowed at this point because the potential complexity seemed unjustified)


This is a FHIR ballot issue. Comments on this blog post are discouraged – unless to point out outright errors in this analysis. Discussion will be taken up on the FHIR email list, with substantial contributions added to this wiki page. A doodle poll will be held for the final vote – this will be announced on the FHIR email list.