Category Archives: ISO 21090

Language Localization in #FHIR

Several people have asked about the degree of language localization in FHIR, and what language customizations are available.


The specification itself is published and balloted in English (US). All of our focus is on getting that one language correct.

There are a couple of projects to translate this to other languages – Russian and Japanese – but neither has gone very far, and doing it properly would be a massive task. We would consider tooling in the core build to make this easier (well, possible), but it’s not a focus for us.

One consequence of the way the specification works is that the element names (in either JSON or XML) are in English. We think that’s ok because they are only references into the specification; they’re never meant to be meaningful to anyone other than a user of the specification itself.

Codes & Elements

What is interesting to us is publishing alternate language phrases to describe the codes defined in FHIR code systems, and the elements defined in FHIR resources. These are text phrases that do appear to the end-user, so it’s meaningful to provide translations of them.

A Code System defined in a value set has a set of designations for each code, and these designations have a language on them:

[image: code system designation structure]

Not only have we defined this structure, we’ve managed to provide Dutch and German translations for the HL7 v2 tables. However, at present, these are all we have. We have the tooling to add more, it’s just a question of the HL7 Affiliates deciding to contribute the content.
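For concreteness, here’s a sketch (in Python, since the wire form could be XML or JSON) of picking a designation by language; the code, translations, and helper function are illustrative, not taken from the published tables:

```python
# A code with designations in two languages; the structure mirrors the
# designation-per-language model described above (illustrative values).
concept = {
    "code": "M",
    "display": "Married",
    "designation": [
        {"language": "nl", "value": "Gehuwd"},       # Dutch
        {"language": "de", "value": "Verheiratet"},  # German
    ],
}

def display_for(concept, language):
    """Return the designation for the requested language, falling back
    to the default display when no translation exists."""
    for d in concept.get("designation", []):
        if d.get("language") == language:
            return d["value"]
    return concept.get("display")

print(display_for(concept, "de"))  # Verheiratet
print(display_for(concept, "fr"))  # Married (no French designation)
```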

For the elements (fields) defined as part of the resources, we’d happily add translations of these too, though there’s no native support for this in the StructureDefinition, so it would need an extension. However no one has contributed content like this yet.

It’s not necessary to do this as part of the specification (though doing it there provides the most visibility and distribution), though we haven’t defined any formal structure for a language pack. If there’s interest, this is something we could do in the future.

Useful Phrases

There’s also a source file that has a set of common phrases – particularly error phrases (which do get displayed to the user) – in multiple languages. This is available as part of the specification. Some of the public test servers use these messages when responding to requests.

Note that we’ve been asked to allow this to be authored through a web site that does translations – we’d be happy to support this, if we found one that had an acceptable license, could be used for free, and had an API so we could download the latest translations as part of the build (I haven’t seen any service that has all those features).

Multi-language support in resources

There’s no explicit support for multi-language representation in resources other than for value sets. Our experience is that while mixed language content occurs occasionally, it’s usually done informally, in the narrative, and rarely formally tracked. So far, we’ve been happy to leave that as an extension, though there is ongoing discussion about it.

The one place that we know of language being commonly tracked as part of a resource is in document references, with a set of form letters (e.g. procedure preparation notes) in multiple languages. For this reason, the DocumentReference resource has a language for the document it refers to (content.language).

Observation of Titers in HL7 Content

Several important diagnostic measures take the form of a Titer. Quoting from Wikipedia:

A titer (or titre) is a way of expressing concentration. Titer testing employs serial dilution to obtain approximate quantitative information from an analytical procedure that inherently only evaluates as positive or negative. The titer corresponds to the highest dilution factor that still yields a positive reading. For example, positive readings in the first 8 serial twofold dilutions translate into a titer of 1:256 (i.e., 2^-8). Titers are sometimes expressed by the denominator only, for example 1:256 is written 256.

A specific example is a viral titer, which is the lowest concentration of virus that still infects cells. To determine the titer, several dilutions are prepared, such as 10^-1, 10^-2, 10^-3, … 10^-8.

So the higher the titer, the higher the concentration: 1:2 means a lower concentration than 1:128 (note that this means the clinical intent is the opposite of the literal numeric intent – as the ratio gets numerically smaller, the concentration it represents gets higher).
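The arithmetic in the quote above is easy to sketch: n positive serial dilutions at a given dilution factor yield a titer denominator of factor^n (the function name here is mine):

```python
def titer_denominator(positive_dilutions, factor=2):
    """Denominator of the reported titer: the highest dilution factor
    that still gave a positive reading."""
    return factor ** positive_dilutions

# 8 positive twofold dilutions -> a titer of 1:256, as in the quote
print(titer_denominator(8))  # 256
```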

Titers are pretty common in clinical diagnostics – I found about 2600 codes for titer type tests in LOINC v2.48 (e.g. Leptospira sp Ab.IgG).

Representing Titers in HL7 Content

In diagnostic reports, titers are usually presented in the text narrative (or the printed form) using the form 1:64, since this makes clear the somewhat arbitrary nature of the numbers in the value. However it’s not unusual for labs to report just the denominator (e.g. “64”) and the person interpreting the reports is required to understand that this is a titer test (this is usually stated in the name).

When it comes to reporting a titer in structured computable content, there are several general options:

  • represent it as a string, and leave it up to the recipient to parse that if they really want
  • represent it as an integer, the denominator
  • use a structured form for representing the content

Each of the main HL7 versions (v2, CDA, and FHIR) offer options for each of these approaches:

V2:
  String:     OBX||ST|{test}||1:64
  Integer:    OBX||NM|{test}||64
  Structured: OBX||SN|{test}||^1^:^64
CDA:
  String:     <value xsi:type="ST">1:64</value>
  Integer:    <value xsi:type="INT">64</value>
  Structured: <value xsi:type="RTO_INT_INT"><numerator value="1"/><denominator value="64"/></value>
FHIR:
  String:     "valueString": "1:64"
  Integer:    "valueInteger": 64
  Structured: "valueRatio": { "numerator": { "value": 1 }, "denominator": { "value": 64 } }

(using the JSON form for FHIR here)
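Since labs report in any of these forms, a receiver that wants the structured form has to normalize. A minimal sketch (the function name is mine; comparator forms like “<1:64” are deliberately ignored here):

```python
def titer_to_ratio(raw):
    """Normalize a reported titer - "1:64" or the denominator-only "64" -
    into a FHIR-style valueRatio dict (sketch only)."""
    if ":" in raw:
        num, denom = raw.split(":", 1)
    else:
        num, denom = "1", raw  # denominator-only short form
    return {
        "numerator": {"value": int(num)},
        "denominator": {"value": int(denom)},
    }

print(titer_to_ratio("1:64"))  # {'numerator': {'value': 1}, 'denominator': {'value': 64}}
print(titer_to_ratio("64"))    # same ratio, from the short form
```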

One of the joys of titres is that there’s no consistency between the labs – some use one form, some another. A few even switch between representations for the same test (e.g. one LOINC code, different forms, from the same lab).

This is one area where standardization would definitely bring some benefit – saying that all labs should use the same form. That’s easy to say, but it would be really hard to get the labs to agree, and I don’t know what the path to pushing for conformance would be (in the US, it might be CLIA; in Australia, it would be PITUS; for other countries, I don’t know).

One of the problems here is that v2 (in particular) is ambiguous about whether OBX-5 is for presentation or not – it depends on the use case. And labs are much more conservative about changing human presentation than about changing computable form, for good safety reasons. (Here in Australia, OBX-5 should not be used for presentation if both sender and receiver are fully conformant to AS 4700.2, but I don’t think anyone would have much confidence in that.) In FHIR and CDA, the primary presentation is the narrative form; the structured data would become the source of any derived presentation, but it is not the primary attested presentation, which generally allays the labs’ safety concerns around changing the content.

If that’s not enough, there’s a further issue…

Incomplete Titers

Recently I came across a set of lab data that included the titer “<1:64”. Note that because the intent of the titre is reversed, it’s not perfectly clear what this means. Does it mean that the titre was less than 64? Or that the dilution was greater than 64? Well, fortunately, it’s the first. Quoting from the source:

There are several tests (titers for Rickettsia rickettsii, Bartonella, certain strains of Chlamydia in previously infected individuals, and other tests) for which a result that is less than 1:64 is considered Negative.  For these tests the testing begins at the 1:64 level and go up, 1:128, 1:256, etc.   If the 1:64 is negative then the titer is reported as less than this.

The test comes with this sample interpretation note:

Rickettsia rickettsii (Rocky Mtn. Spotted Fever) Ab, IgG:

  • Less than 1:64: Negative – No significant level of Rickettsia rickettsii IgG Antibody detected.
  • 1:64 – 1:128: Low Positive – Presence of Rickettsia rickettsii IgG Antibody detected
  • 1:256 or greater: Positive – Presence of Rickettsia rickettsii IgG Antibody, suggestive of recent or current infection.

So, how would you represent this one in the various HL7 specifications?

V2:
  String:     OBX||ST|{test}||<1:64
  Integer:    {can’t be done}
  Structured: OBX||SN|{test}||<^1^:^64
CDA:
  String:     <value xsi:type="ST">&lt;1:64</value>
  Integer:    {can’t be done}
  Structured: <value xsi:type="IVL_RTO_INT_INT"><high><numerator value="1"/><denominator value="64"/></high></value>
FHIR:
  String:     "valueString": "<1:64"
  Integer:    {can’t be done}
  Structured: "valueRatio": { "numerator": { "comparator": "<", "value": 1 }, "denominator": { "value": 64 } }

This table shows how the structured/ratio form is better than the simple numeric – but there’s a problem: the CDA example, though legal in general v3, is illegal in CDA, because CDA documents are required to be valid against the CDA schema, and IVL_RTO_INT_INT was not generated into the CDA schema. I guess that means that the CDA form will have to be the string form?
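A sketch of parsing the incomplete form into the structured representation (the function name is mine; the comparator goes on the numerator, following the FHIR cell in the table above):

```python
def incomplete_titer_to_ratio(raw):
    """Parse titers like "<1:64" into a FHIR-style valueRatio dict,
    carrying a leading comparator on the numerator (sketch only)."""
    comparator = None
    if raw and raw[0] in "<>":
        comparator, raw = raw[0], raw[1:]
    num, denom = raw.split(":", 1) if ":" in raw else ("1", raw)
    numerator = {"value": int(num)}
    if comparator:
        numerator["comparator"] = comparator
    return {"numerator": numerator, "denominator": {"value": int(denom)}}

r = incomplete_titer_to_ratio("<1:64")
print(r["numerator"]["comparator"], r["denominator"]["value"])  # < 64
```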



Explaining the v3 II Type again

Several conversations in varying places make me think that it’s worth explaining the v3 II type again, since there are some important subtleties about how it works that are often missed (though by now the essence of it is well understood by many implementers). The II data type is “an identifier that uniquely identifies a thing or object”, and it has two important properties:

root – A unique identifier that guarantees the global uniqueness of the instance identifier. The root alone may be the entire instance identifier.
extension – A character string as a unique identifier within the scope of the identifier root.

The combination of root and extension must be globally unique – in other words, if any system ever sees the same identifier elsewhere, it can know beyond any doubt that it is the same identifier. In practice, this fairly simple claim causes a few different issues:

  • How can you generate a globally unique identifier?
  • The difference between root only vs. root+extension trips people up
  • How do you know what the identifier is?
  • What if different roots are used for the same thing?
  • The difference between the identifier and the identified thing is an end-less source of confusion
  • How to handle incompletely known identifiers

How can you generate a globally unique identifier? 

At first glance, the rule that you have to generate a globally unique identifier that will never be used elsewhere is a bit daunting. How do you do that? In practice, though, it’s actually not that hard. The HL7 standard says that a root must be either a UUID or an OID.


Wikipedia has this to say about UUIDs:

Anyone can create a UUID and use it to identify something with reasonable confidence that the same identifier will never be unintentionally created by anyone to identify something else. Information labeled with UUIDs can therefore be later combined into a single database without needing to resolve identifier (ID) conflicts.

UUIDs were originally used in the Apollo Network Computing System and later in the Open Software Foundation‘s (OSF) Distributed Computing Environment (DCE), and then in Microsoft Windows platforms as globally unique identifiers (GUIDs).

Most operating systems include a system facility to generate a UUID, or, alternatively, there are robust open source libraries available. Note that you can generate this with (more than) reasonable confidence that the same UUID will never be generated again – see Wikipedia or Google.

Note that UUIDs should be represented as uppercase (53C75E4C-B168-4E00-A238-8721D3579EA2), but lots of examples use lowercase (“53c75e4c-b168-4e00-a238-8721d3579ea2”). This has caused confusion in v3, because we didn’t say anything about this, so you should always use case-insensitive comparison. (Note: FHIR uses lowercase only, per the IETF OID URI registration).
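In practice that looks like this (a Python sketch; `uuid4` is the random variant described above):

```python
import uuid

# Mint a fresh random (version 4) UUID for use as an II root
root = str(uuid.uuid4())
print(root)  # e.g. 53c75e4c-... (Python emits lowercase)

# Since v3 didn't pin the case down, always compare roots case-insensitively
def same_root(a, b):
    return a.lower() == b.lower()

print(same_root("53C75E4C-B168-4E00-A238-8721D3579EA2",
                "53c75e4c-b168-4e00-a238-8721d3579ea2"))  # True
```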

For people for whom a 1 in 10^50 chance of duplication is still too much, there’s an alternative method: use an OID.


Again, from Wikipedia:

In computing, an object identifier or OID is an identifier used to name an object (compare URN). Structurally, an OID consists of a node in a hierarchically-assigned namespace, formally defined using the ITU-T‘s ASN.1 standard, X.690. Successive numbers of the nodes, starting at the root of the tree, identify each node in the tree. Designers set up new nodes by registering them under the node’s registration authority.

The easiest way to illustrate this is to show how it works for my own OID:

1 – the ISO OID root
1.2 – ISO issues each member (country) a number in this space
1.2.36 – issued to Australia; you can use your Australian Company Number (ACN) in this OID space
1.2.36.146595217 – the Health Intersections ACN

All Australian companies automatically have their own OID, then. But if you aren’t an Australian company, you can ask HL7 or HL7 Australia to issue you an OID – HL7 charges $100 for this, but if you’re Australian (or you have reasons to use an OID in Australia), HL7 Australia will do it for free. Or you can get an OID from anyone else who’ll issue OIDs. In Canada, for instance, Canada Health Infoway issues them.

Within that space, I (as the company owner) can use it how I want. Let’s say that I decide that I’ll use .1 for my clients, I’ll give each customer a unique number, and then I’ll use .0 in that space for their patient MRNs; then an MRN from my client will be in the OID space “1.2.36.146595217.1.<client number>.0”.

So as long as each node within the tree of nodes is administered properly (each number only assigned/used once), then there’ll never be any problems with uniqueness. (Actually, I think that the chances of the OID space being properly administered are a lot lower than 1-1/10^50, so UUIDs are probably better, in fact)
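The client/MRN scheme described above is mechanical enough to sketch (the helper function and client number are made up; the root is the Health Intersections OID from the breakdown above):

```python
COMPANY_ROOT = "1.2.36.146595217"  # the Health Intersections ACN-based OID

def client_mrn_oid(client_number):
    """OID space for one client's patient MRNs: .1 = clients,
    then the client's number, then .0 = MRNs (illustrative scheme)."""
    return f"{COMPANY_ROOT}.1.{client_number}.0"

print(client_mrn_oid(4))  # 1.2.36.146595217.1.4.0
```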

Root vs Root + extension

In the II data type, you must have a root attribute (except in one special case – see the last point below). The extension is optional. The combination of the two must be unique. So, for example, if the root is a UUID, and each similar object simply gets a new UUID, then there’s no need for an extension:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2"/>

If, on the other hand, you’re going to use “53C75E4C-B168-4E00-A238-8721D3579EA2” to identify the set of objects, then you need to assign each object an extension:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2" extension="42"/>

You’ll need some other system for uniquely assigning the extension to an object. Customarily, a database primary key is used.

With OIDs, the same applies. If you assign a new OID to each object (as is usually done in DICOM systems for images), then all you need is a root:

<id root="1.2.840.113663.1100.16233472.1.911832595.119981123.1153052"/> <!-- picked at random from a sample DICOM image -->

If, on the other hand, you use the same OID for the set of objects, then you use an extension:

<id root="1.2.840.113663.1100.16233472.1.911832595.119981123" extension="1153052"/>

And you assign a new value for the extension for each object, again, probably using a database primary key.

Alert readers will now be wondering: what’s the difference between those two? And the answer is, well, those two identifiers are *not* the same, even though they’re probably intended to be. When you use an OID, you have to decide: are sub-objects of the id represented as new nodes on the OID, or as values in the extension? That decision can only be made once for the OID – and, unhelpfully, only about 0.1% of OIDs actually say in their registration which form they use (see below for discussion of registration). So why even allow both forms? Why not simply say that you can’t use an extension with an OID? The answer is that not every extension is a simple number that can be in an OID. It’s common to see composite identifiers like “E34234” or “14P232344”, or even simply MRNs with a 0 prefix, such as 004214 – none of these are valid OID node values.
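To make the point concrete: treating an II as a (root, extension) pair, the two DICOM-style identifiers above simply aren’t equal, even if they were meant to name the same image:

```python
# II values compared as (root, extension) pairs
ii_node_form = ("1.2.840.113663.1100.16233472.1.911832595.119981123.1153052", None)
ii_ext_form = ("1.2.840.113663.1100.16233472.1.911832595.119981123", "1153052")

print(ii_node_form == ii_ext_form)  # False - not interchangeable
```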

Note for Australian readers: the Health Identifier OID requires that IHIs be represented as an OID node, not an extension. Not everyone likes this (including me), but that’s how it is.

The notion that root + extension should be globally unique sounds simple, but I’ve found that it’s really tricky in practice for many implementers.

Identifier Type

When you look at an identifier, you have no idea what type of identifier it is:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2" extension="42"/>

What is this? You don’t know either what kind of object it identifies, or what kind of identifier it is. You’ll have to determine that from something other than the identifier – the II is *only* the identifier, nothing else. However, it might be possible to look the root up in a registry; for instance, the IHI OID is registered in the HL7 OID registry. That can tell you what it is, but note two important caveats:

  • There’s no consistent definition of the type – it’s just narrative
  • There’s no machine API to the OID registry

For these reasons, that’s a human-mediated process, not something done on the fly in the interface. I’d like to change that, but the infrastructure for this isn’t in place.

So for a machine, there’s no way to determine what type of identifier this is just by reading the identifier – it has to be inferred from context: either from the content around the identifier, or from reading an implementation guide.

Different Roots for the same thing?

So what stops two different people assigning different roots to the same object? In this case, you’ll get two different identifiers:

<id root="53C75E4C-B168-4E00-A238-8721D3579EA2" extension="42"/>
<id root="1.2.36.146595217" extension="42"/>

Even though these identify the same thing. What stops this happening?

Well, the answer is: not much. The idea of registering identifiers is actually to try to discourage this – if the first person registers the identifier, and the second person looks for it before registering it themselves, then they should find it and use the same value. But if they’re lazy, or the description and/or the search is bad, then there’ll be duplicates. And, in fact, there are many duplicates in the OID registry, and fixing them (or even just preventing them) is beyond HL7.

Difference between identifier and identified thing

In theory, any time you see the same object, it should have the same identity, and any time you see the same identifier, it should be the same object. Unfortunately, however, life is far from that simple:

  • The original system might not store the identifier, and it’s just assigned on the fly by middleware. Each time you see the same object, it’ll get a new identity. Personally, I think this is terrible, but it’s widely done.
  • The same identifier (e.g. lab report id) might be used for multiple different representations of the same report object
  • The same identifier (e.g. Patient MRN) might be used for multiple different objects that all represent the same real world thing
  • The same identifier (e.g. lab order number) might be used for multiple different real world entities that all connect to the same business process

R2 of the datatypes introduced a “scope” property so you could identify which of these the identifier is, but this can’t be used with CDA.

Incomplete identifiers

What if you know the value of the identifier, but you don’t know what identifier it actually is? This is not uncommon – here are two scenarios:

  • A patient registration system that knows the patient’s driver license number, but not which country/state issued it
  • A point of care testing device that knows what the patient bar-code is, but doesn’t know what institution it’s in

In both these cases, the root is not known by the information provider. So in this case, you mark the identifier as unknown, but provide the extension:

<id nullFlavor="UNK" extension="234234234"/>

Note: there’s been some controversy about this over the years, but this is in fact true, even if you read something else somewhere else.

Error in R2 Datatypes

Today, Rick Geimer and Austin Kreisler discovered a rather egregious error in the Data Types R2 specifications. The ISO data types say:

This specification defines the following extensions to the URN scheme:

hl7ii – a reference to an II value defined in this specification. The full syntax of the URN is urn:hl7ii:{root}[:{extension}] where {root} and {extension} (if present) are the values from the II that is being referenced. Full details of this protocol are defined in the HL7 Abstract Data Types Specification

Reference: Section of the ISO Data types
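The quoted syntax is mechanical enough to sketch (the function name is mine; no escaping or validation is attempted):

```python
def hl7ii_urn(root, extension=None):
    """Build urn:hl7ii:{root}[:{extension}] per the text quoted above."""
    return f"urn:hl7ii:{root}" + (f":{extension}" if extension else "")

print(hl7ii_urn("1.2.840.113663.1100", "42"))  # urn:hl7ii:1.2.840.113663.1100:42
print(hl7ii_urn("1.2.840.113663.1100"))       # urn:hl7ii:1.2.840.113663.1100
```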

However, if you look through the Abstract Data Types R2 for information about this, you won’t find anything about hl7ii. But there is this, in section

The scheme hl7-att is used to make references to HL7 Attachments. HL7 attachments may be located in the instance itself as an attachment on the Message class, or in some wrapping entity such as a MIME package, or stored elsewhere.

The following rules are required to make the hl7-att scheme work:

  1. Attachments SHALL be globally uniquely identified. Attachment id is mandatory, and an ID SHALL never be re-used. Once assigned, an attachment id SHALL be associated with exactly one byte-stream as defined for
  2. When receiving an attachment, a receiver SHOULD store that attachment for later reference. A sender is not required to resend the same attachment if the attachment has already been sent.
  3. Attachment references SHALL be resolved against all stored attachments using the globally unique attachment identifier in the address.

(p.s. references are behind HL7 registration wall, but are free)

These are meant to be the same thing, but they are named differently: urn:hl7ii:… vs hl7-att:…

I guess this will have to be handled as a technical correction to one of the two specifications. I prefer urn:hl7ii:.

Comments welcome.

Question: What does an empty name mean?

I was recently asked whether this fragment is legal CDA:

    <name/>

The answer is that it’s legal, but I have no idea what it should be understood to mean.

It’s legal

Firstly, it’s legal. There’s a rule in the abstract data types specification:

invariant(ST x) where x.nonNull { 

In other words: if a string is not null, it must have at least one character of content in it. In a CDA context, then, this would be illegal:

    <given></given>
Aside: the schema doesn’t disallow this, and few schematrons check this. But it’s still illegal. It’s a common error to encounter where a CDA document is generated using some object model or an xslt – it’s not so common where the CDA document is generated by some xml scripting language a la PHP.

However the definition of PN (the type of <name/>) is not a simple string: it’s a list of part names, each of which is a string (PN = LIST<ENXP>). So while this is illegal:

    <name><given></given></name>

– because the given name can’t be empty – this isn’t:

    <name></name>

That’s because a list is allowed to be empty.

Aside: I don’t know whether this is legal:

    <given> </given>

 That’s because we run into XML whitespace processing issues. Nasty, murky XML. Next time, we’ll use JSON.
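A quick probe of what one parser actually does with that fragment (Python’s ElementTree here; other parsers and schema validators may apply different whitespace rules, which is exactly the murkiness in question):

```python
import xml.etree.ElementTree as ET

name = ET.fromstring("<name><given> </given></name>")
given = name.find("given")
print(repr(given.text))  # ' ' - the whitespace survives parsing,
                         # so the element isn't empty at the XML level
```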

What does it mean?

So what does this mean?

    <name/>

Well, it’s a name with no parts that is not null. Literally: this patient has a name, and the name has no parts.

Note the proper use of nullFlavors:

  • The patient’s name is unknown: <name nullFlavor="UNK"/>
  • We didn’t bother collecting the patient name: <name nullFlavor="NI"/>
  • We’re telling you that we’re not telling you the patient name: <name nullFlavor="MSK"/> (unusual – usually you’d just go for NI when suppressing the name)
  • The patient’s name can’t be meaningful in this context: <name nullFlavor="NA"/> – though I find operational uses of NA very difficult to understand
  • The patient doesn’t have a name: <name nullFlavor="ASKU"/> – because if you know that there’s no name, you’ve asked, and it’s unknown. Note that it would be a very unusual circumstance where a patient doesn’t have a working name (newborns or unidentified unconscious patients get assigned names), but it might make much more sense for other things like drugs

So this is none of these. It’s a positive statement that the patient has a name, and the name is… well, empty? I can’t think of how you could have an empty name (as opposed to having no name). I think we should probably have a rule that names can’t have no parts either.

It’s not clear that this is a statement that the name is empty, though. Consider this fragment:

    <name><given>Grahame</given><family>Grieve</family></name>

If you saw this in a CDA document, would you think that you should understand that I have no middle name? (I do, though I have friends that don’t.) We could be explicit about that:

    <given nullFlavor="ASKU"/>

Though I think that ASKU is getting to be wrong here – you could say that the middle name is unknown, but it would be better to say that the middle name count is 0. It’s just that we didn’t design the name structure that way, because of how names work in cultures other than English-derived ones (which, btw, means that some of this post may be inapplicable outside the English context).

The first case would be normal, so this means we don’t say anything about middle names. So why would not including any given name or family name mean anything more than “we don’t say anything about the name”? And, in fact, that’s the most likely cause of this form – nothing is known, but the developer put the <name> element in regardless of the fact that the parts weren’t known.

So it’s quite unclear what it should mean; it most likely arises as a logic error, and I recommend that implementations ensure that an empty name never appears on the wire.
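One way to follow that recommendation is a last-minute scrub before serializing (a sketch with ElementTree; namespaces are ignored and the check is deliberately crude):

```python
import xml.etree.ElementTree as ET

def prune_empty_names(root):
    """Remove <name> elements that have no parts, no text, and no
    nullFlavor, so an empty name never goes on the wire (sketch only)."""
    for parent in root.iter():
        for name in list(parent.findall("name")):
            if (len(name) == 0
                    and not (name.text or "").strip()
                    and "nullFlavor" not in name.attrib):
                parent.remove(name)
    return root

patient = ET.fromstring('<patient><name/><name nullFlavor="UNK"/></patient>')
prune_empty_names(patient)
print(len(patient.findall("name")))  # 1 - only the nullFlavor'd name remains
```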

Guest Post: HL7 Language Codes

This guest post is written by Lloyd McKenzie.  I’ve been meaning to get to it since the January WGM, but I’ve been wrapped up in other things (most recently HIMSS).  However, I agree with what Lloyd says.

Question: I have a need to communicate a wide variety of language codes in HL7 v3 instances, but the ISO Data Type 21090 specification declares that ED.language (and thus ST.language) are constrained to IETF 3066.  This is missing many of the languages found in ISO 639-3 – which I need.  Also, IETF 3066 is deprecated.  It’s been replaced twice.  Can I just use ISO 639-3 instead?


The language in the 21090 specification was poorly chosen. It explicitly says “Valid codes are taken from the IETF RFC 3066”. What it should have said is “Valid codes are taken from IETF language tags, the most recent version of which at the time of this publication is IETF RFC 3066”. (Actually, by the time ISO 21090 actually got approved, the most recent RFC was 4646, but we’ll ignore that for now.) This should be handled as a technical correction, though that’s not terribly easy to do. However, implementers are certainly welcome to point to this blog as an authoritative source of guidance on ISO 21090 implementation and make use of any language codes supported in subsequent versions of the IETF language tags code system – including RFC 4646 and RFC 5646, as well as any subsequent version thereof.

The RFC 5646 version incorporates all of the languages found in ISO 639-3 and 639-5.  However, be aware that while all languages are covered, there are constraints on the codes that can be used for a given language.  Specifically, if a language is represented in ISO 639-1 (2-character codes), that form must be used.  The 3-character variants found in ISO 639-2 cannot be used.  For example, you must send “en” for English, not “eng”.
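The translation described above is a simple lookup. A sketch (only a handful of the mappings are shown; a real table would cover every language that has an ISO 639-1 code):

```python
# Partial ISO 639-2 (3-character) -> ISO 639-1 (2-character) mapping
ISO_639_2_TO_1 = {
    "eng": "en",
    "fra": "fr",
    "deu": "de",
    "nld": "nl",
    "spa": "es",
}

def to_ietf(code):
    """Prefer the 2-character form required by IETF language tags;
    languages with no 2-character code keep their 3-character code."""
    return ISO_639_2_TO_1.get(code, code)

print(to_ietf("eng"))  # en
print(to_ietf("haw"))  # haw (Hawaiian has no ISO 639-1 code)
```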

Question: But I want to send the 3-character codes.  That’s what my system stores.  Can’t I use ISO 639-2 directly?


No.  In the ISO 21090 specification, the “language” property is defined as a CS.  That means the data type is fixed to a single code system.  The code system used is IETF Language Tags, which is consistent with what the w3c uses for language in all of their specifications and encompasses all languages in all of the ISO 639 specifications plus many others (for example, country-specific dialects as well as additional language tags maintained by IANA.)

Question: Well, ok, but what about in the RIM for LanguageCommunication.code.  Can I send ISO 639-2 codes there?


Yes, though with a caveat.  LanguageCommunication.code is defined as a CD, meaning you can send multiple codes – one primary code and as many translations as you see fit.  You are free to send ISO 639-2 codes (the 3-character ones) or any other codes as a translation.  However, LanguageCommunication.code has a vocabulary assertion of the HumanLanguage concept domain, which is universally bound to a value set defined as “all codes from the ietf3066 code system”.  That means the primary code within the CD must be an IETF code.  So that gives you two options:

  1. Fill the root code with the appropriate IETF code – copying the ISO code most of the time and translating the 3-character code to the correct 2-character code for those 84 language codes found in ISO 639-1; or
  2. Omit the root code property and set the null flavor to “UNC” (unencoded), essentially declaring that you haven’t bothered to try translating the code you captured into the required code system.

And before you mention it, yes, the reference to IETF 3066 is a problem. The actual code system name in the HL7 specification is “Tags for the Identification of Languages”, which is the correct name. However, the short name assigned was “ietf3066”, and the description in the OID registry refers explicitly to the 3066 version. This is an error, as IETF 3066 is a version of the IETF “Tags for the Identification of Languages” code system, and the OID is for the code system, not a particular version of it. (There have actually been 4 versions so far – 1766, 3066, 4646 and 5646.) We’ll try to get the short name and description corrected via the HL7 harmonization process.

Question: But I don’t want to translate to 2-character codes and I don’t want to use a null flavor.  Can’t we just relax the universal binding?


We can’t relax the binding because the HumanLanguage concept domain is shared by both the ED.language property in the abstract data types specification (which ISO 21090 is based on) and the LanguageCommunication.code attribute.  The ED.language is a CS and so must remain universally bound.

In theory, we could split into two separate domains – one for data types and one for LanguageCommunication.code.  The second one could have a looser binding.  However, it’s hard to make a case for doing that.  There are several issues:

First, having two different bindings for essentially the same sort of information is just going to cause grief for implementers.  You could be faced with declaring what language the patient reads in one code system, but identifying the language of the documentation the patient’s supposed to read in a second code system.

Second, the IETF code system fully encompasses all languages covered by all the ISO 639-x code systems, plus thousands of others expressible using various sub-tags such as identifying country-specific dialects.  In the unlikely situation that you need a language that can’t be expressed using any of those, there’s even a syntax for sending local codes (and a mechanism for registering supplemental codes with IANA if you want to be more official).  So there should never be a situation where you can’t express your desired language using the IETF Language Tags code system.

Question: I don’t really care that I can express my languages in IETF.  I’ve already pre-adopted using ISO 639-2 and -3 in my v3 implementation and I don’t want to change.  Why are you putting constraints in place that prevent implementers from doing what they want to do?


Well, technically your implementation right now is non-conformant.  And implementers always have the right to be non-conformant.  HL7 doesn’t require anyone to follow any of its specifications.  So long as your communication partners are willing to do what you want to do, anything goes by site-specific agreement.

However, the standards process is about giving up a degree of implementation flexibility in exchange for greater interoperability.  By standardizing on a single set of codes for human language, we’re able to ensure interoperability across all systems.  Natively, those systems may use other code systems, but for communication purposes, they translate to the common code system so everyone can safely exchange information.

If the premise for loosening a standard was “we won’t require any system to translate data from their native syntax”, there’d be no standards at all.  Yes, translation and mapping requires extra effort (though a look-up table for 84 codes with direct 1..1 correspondence is pretty easy compared to a lot of the mapping effort needed in other areas.)  But that’s the price of interoperability.
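The kind of look-up table involved is genuinely trivial. A minimal sketch, mapping ISO 639-2 bibliographic codes to the corresponding IETF tags (which use the ISO 639-1 two-letter code where one exists) – only a handful of the entries are shown here:

```python
# Sketch of a native-to-common-code-system translation table.
# ISO 639-2 bibliographic code -> IETF (BCP 47) language tag.
# Only a few representative entries are included.

ISO_639_2_TO_IETF = {
    "eng": "en",   # English
    "fre": "fr",   # French (bibliographic code)
    "ger": "de",   # German (bibliographic code)
    "dut": "nl",   # Dutch (bibliographic code)
    "jpn": "ja",   # Japanese
}

def to_ietf(iso639_2_code):
    """Translate a native ISO 639-2 code to the common IETF tag."""
    return ISO_639_2_TO_IETF[iso639_2_code]

print(to_ietf("ger"))  # -> de
```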

Question: ISO 21090 namespace


I had a first look at the XML Schema and Schematron of the ISO 21090 datatypes. I did not find any namespace information. In which namespace do these elements “live”? Or are they in “no namespace”? If so, is there a specific reason for this?


This is called “chameleon namespacing”: the ISO datatypes are not in any particular namespace, but whatever is assigned to them where they are used. Quoting from the standard:

All elements shall be in some namespace, and the namespace shall be defined in the conformance statements of information processing entities that claim conformance with this International Standard. This International Standard reserves the namespace “” for direct applications of these datatypes such as testing environments

HL7 uses them in the urn:hl7-org:v3 namespace. Other organisations can use them in whatever namespace they want.
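From the consumer’s side, the effect is that the same datatype element carries whatever namespace the using standard assigns. A minimal sketch, building an element in HL7’s urn:hl7-org:v3 namespace (the element and attribute names here are illustrative, not taken from a specific HL7 schema):

```python
# Chameleon namespacing as seen by a consumer: the identical structure
# could be emitted in any other organisation's namespace instead.
import xml.etree.ElementTree as ET

HL7_NS = "urn:hl7-org:v3"

# Build an element in Clark notation: {namespace}localname
elem = ET.Element("{%s}text" % HL7_NS)
elem.set("value", "hello")  # illustrative attribute, not a normative one

# Serialise with urn:hl7-org:v3 as the default namespace
ET.register_namespace("", HL7_NS)
out = ET.tostring(elem, encoding="unicode")
print(out)
```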

Question: ISO 21090 Schema


The ISO 21090 PDF has a link to the HL7 website where the schema can be obtained, but there are schema errors in the schema available at this link (the invariant rules reference HXIT members that have been renamed).

This was fixed on the editors’ resources website but HL7 now has the wrong version…

Also, in your HL7 video you said you moved the schematron rules to a separate file. This isn’t so.  Is there another schema where the rules have been taken out?


I have asked the HL7 webmaster to update the link above to the latest version. In the meantime, it’s available here. I did create a separate schematron file, but it hasn’t made its way onto the implementers’ resources page… probably my fault. You can get it here. There is no equivalent schema with the schematron rules removed, though preparing one would not be difficult (just a little time-consuming).


Technical Correction in Data Types R2 / ISO 21090

According to the data types, the PostalAddressUse Enumeration is based on the code system identified by the OID 2.16.840.1.113883.5.1012. But if you look up that OID in the OID registry (or the MIF, for insiders), you see that:

Retired as of the November 2008 Harmonization meeting, and replaced by 2.16.840.1.113883.5.1119 AddressUse.

This appears to be my error – I should’ve used the OID 2.16.840.1.113883.5.1119. We’ll have to see about issuing a technical correction.

Question: should you use Concept Ids or Description Ids in HL7 instances?


In my brief look at the standards, I can’t seem to find any requirement or convention in SNOMED or HL7 regarding the use of Description IDs vs Concept IDs in messages or datasets.

Is there any preference or can Description IDs and Concept IDs be used interchangeably?


It is correct to use Concept Ids as the code in both HL7 v2 messages and CDA (and other v3 instances). Description Ids should not be used.


This is, unfortunately, not documented anywhere. It is implicit in the language used by the data types in both v2 and v3, though it is more obvious in v3, with its explicit focus on concepts. While it’s true that Description Ids also uniquely identify concepts, they additionally identify display strings, which are duplicated by other fields in the appropriate data types.
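Incidentally, the two kinds of id can be told apart mechanically: the two digits immediately before the final check digit of an SCTID are the partition identifier, which is 00/10 for concepts and 01/11 for descriptions. A minimal sketch (the example ids are real SNOMED CT ids for “Myocardial infarction”, shown here purely for illustration):

```python
# Distinguish SNOMED CT concept ids from description ids by their
# partition identifier: the two digits before the trailing check digit.

def sctid_partition(sctid):
    """Return the two-digit partition identifier of an SCTID."""
    s = str(sctid)
    return s[-3:-1]

def is_concept_id(sctid):
    return sctid_partition(sctid) in ("00", "10")

def is_description_id(sctid):
    return sctid_partition(sctid) in ("01", "11")

print(is_concept_id(22298006))      # concept id -> True
print(is_description_id(37436014))  # description id -> True
```

A check like this is useful as a sanity test when validating inbound instances: any code whose partition identifier marks it as a description id should be rejected or mapped back to its concept.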