Monthly Archives: March 2014

Office for the iPad, and Dropbox

I woke up to a lovely piece of news this morning: Office has finally been released for the iPad. Yay!

This is good because most of my customers/partners send me word documents, with comments and change tracking, and I’m expected to do a round of editing, and return it. Up to now, I couldn’t do that on the iPad – so I had to have my laptop on hand for that task. Now (hopefully) I don’t have to.

But the registration process took a couple of hours to figure my way through – I didn’t want to purchase the subscription on my Apple account via iTunes, as that’s personal, not work (tax difference). I thought I could buy an Office business subscription, but after I created the account using the iPad Office app, I just couldn’t figure out how to connect that to a business account (Yay Microsoft). Then I had to figure out how to connect the iPad to the Office 365 Home Premium I bought online – a couple of mis-starts and it all worked out. But these things are hard, because you often have to commit to the purchase to find out if it will buy the thing that you want.

But now I have a problem – I use Dropbox for all my work (unless it’s code, when it’s going to be in svn or github). And I have a lot of setup around Dropbox, and apps connected up to Dropbox. I don’t want to move 5GB of files off Dropbox and lose all my Dropbox-related services. But Office for iPad doesn’t support Dropbox – though it looks like the hooks are there. So I’m hoping Dropbox support will come real soon.

All this does make me wonder why people complain that interoperability is so far behind in the healthcare space. So far behind what, I ask?

#FHIR for Clinical Users

One of the outstanding issues for FHIR has been to make the specification more penetrable for clinical users – or, more precisely, for non-technical users. FHIR is framed in a technology setting, and if you aren’t familiar with the technologies, then it’s hard to know where to start. I committed to doing something about that, so here’s a short “FHIR for Clinical Users” introduction:

http://wiki.hl7.org/index.php?title=FHIR_for_Clinical_Users

This explains how the API works using a simple analogy. Feedback to continue to improve this is welcome. It will make its way into the next version of the specification once we start working on that.

I’d like to thank Josh Mandel, Heather Leslie, Tim Benson, David Hay and Lloyd McKenzie for contributing to the document.

 

CDA: What information from the Entries has to go in the narrative?

Most CDA implementation guides, and many tutorials – including some of both I wrote myself – say something like this:

it is an absolute requirement of all CDA documents that the section narrative text completely capture all the information represented in the structured elements within the section

(emphasis added) This example is from Brian’s article I linked to in my previous post, but the same language exists all over the place.

Only, it’s not true. What it should say is, “the section narrative must capture all the clinically relevant information represented in the structured elements within the section”. Which raises the obvious question, ‘well, what’s clinically relevant’?

For comparison, this is the definition of the underlying field that includes the words:

Act.text SHOULD NOT be used for the sharing of computable information

That shouldn’t be understood to mean that you can’t put information that is computable in Act.text, but that it’s meant for human consumption, and you put in there what’s relevant for a human.

What is “Clinically Relevant”?

The rest of this post gives a set of suggestions about what is clinically relevant in the entries. But it’s really important to understand that these are all just suggestions. The decision has to be made by some clinically aware person who knows what the data means – don’t let programmers do it (I see a lot of documents where the programmers have done it, and what they did doesn’t make sense to a clinician).

First, I’ll look at the entries, and then I’ll make comments about data type rendering below that.

Entry – nothing to display in the narrative

Entry Acts (Observation, etc.) – each act should be shown in the narrative, except for RegionOfInterest, which is purely technical, and ObservationMedia, which is shown via a <renderMultimedia>. Where possible, I prefer the narrative to be a table, with one row in the table for each entry (you can even make that rule a schematron assertion, and it picks up lots of errors, believe me).
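For what it’s worth, that “one row per entry” rule can also be checked in ordinary code, not just schematron. Here’s a minimal Python sketch, assuming a section whose narrative table body rows correspond one-to-one with the entries (the sample section is invented, but the element names and the urn:hl7-org:v3 namespace are standard CDA):

```python
import xml.etree.ElementTree as ET

NS = {"v3": "urn:hl7-org:v3"}

def narrative_rows_match_entries(section_xml: str) -> bool:
    """Check that the section narrative table has one <tr> per entry,
    mirroring the schematron assertion described above."""
    section = ET.fromstring(section_xml)
    # rows in the narrative table body (the <thead> header row is skipped)
    rows = section.findall("./v3:text/v3:table/v3:tbody/v3:tr", NS)
    entries = section.findall("./v3:entry", NS)
    return len(rows) == len(entries)

section = """<section xmlns="urn:hl7-org:v3">
  <text><table><thead><tr><td>Procedure</td></tr></thead>
  <tbody><tr><td>Appendicectomy</td></tr></tbody></table></text>
  <entry><procedure/></entry>
</section>"""
print(narrative_rows_match_entries(section))  # True
```

A real check would also need to decide how nested entryRelationships are counted (see below), which is exactly where the clinical judgement comes in.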

classCode, moodCode, statusCode – it’s critical for these to be in the narrative – but very often, they are implicit. For example, if the section title is “Past Procedures”, and all the entries are procedures with moodCode = EVN, then there’s nothing to say in the narrative (and don’t say it if you don’t have to, otherwise it’s just noise for already overwhelmed users). But if the section title is “Procedures, past and planned”, then you’ll need to show the mood code for each procedure – and don’t just show the code for the moodCode – give it a meaningful human-readable word (e.g. “done” or “planned” in this case)

effectiveTime – This is generally necessary to display. It’s rare that the timing of some event simply doesn’t matter – except for future activities that are known not to have happened (the effective time then is the time of booking, and whether to show this is a clinical decision based on the context of use)

id – generally I would recommend not showing these. But it’s a matter of clinical judgement when identifiers are likely to matter to someone reading the document (when they have to get on the phone and say, look, about diagnostic report XXXX, can you…)

code – generally, this needs to be shown

negationInd – this always needs to be shown if it’s negative. The only exception I could think of is if every entry in the section is negated, and the section title says “failed Medication Administrations” or something like that

text – any time an entry has text, I’d very much expect that the text would be in the narrative – but it might not be quite the same. For instance, the text might be generated using a variation of the method by which the whole narrative was generated, so the information would be similar, but laid out differently. Alternatively, the text could be additional information, independent of everything else, and would then be found verbatim in the narrative.

Other entry attributes – well, there are just too many to list here, and beyond the common elements above, there’s not much simple common guidance I can write. Some things, like Observation.value and SubstanceAdministration.dosage, are obvious things to put in the narrative, and other things, like Act.priorityCode, are unlikely to be in the narrative unless the act is yet future. Most other things lie somewhere in between. Clinical judgement is always required.

Participations

  • Author: probably not worth showing
  • Informant: probably not worth showing? (Wouldn’t it be nice to know that there would always be a way to “unpeel the onion” when it is appropriate)
  • Subject: Always must be shown if it’s different to the section subject, but this is extremely rare. (which reminds me – if the section subject differs to the document subject, that absolutely must be shown in the section narrative)
  • Performer: usually worth showing (as a single piece of text, the name of the person, or the org if there’s no person)
  • Consumable/Product: Show the name and/or code of the DrugOrOtherMaterial
  • Specimen: really, I’m not sure. If it’s a diagnostic report, it’s not usually shown in the narrative (or if it is, it’s in the title or implied in the test code). I haven’t seen specimen in other contexts
  • Participant: it depends on what type of participant and why. Sometimes, this is audit trail type stuff, and you shouldn’t show it. Other times, it’s critical information

ActRelationships

  • Reference – should be shown as a linkHtml if possible.
  • Precondition – don’t know.
  • ReferenceRange – usually shown, but the labs know what they want or have to do
  • Component (on organizer) – should always be shown

EntryRelationships

What to do with Entry Relationships all depends on what they say. These could be:

  • Entries in the narrative table in their own right (sometimes in place of the thing that contains them)
  • nested tables in the main table
  • a single line of text in the main table
  • ignored and not shown in the narrative

It’s a clinical judgement call. If you do show an entry relationship, pay attention to these elements:

  • typeCode / inversionInd – The meaning of the typeCode and inversionInd needs to be conveyed somehow. Very often, this is obvious in the context, and showing it is just noise. But sometimes it matters
  • sequenceNumber – any time this is present, it probably needs to be in the narrative
  • negationInd – see above

Data Type Guidance

  • nullFlavor: if the data type has a nullFlavor, then show some abbreviation for the nullFlavor. I use “–” for NI, “n/a” for NA, “unk” for UNK, or otherwise the code itself (the first three are the common ones). If the data type has additional properties as well as a nullFlavor, then generally these would be type/scope information, and something smart is required – e.g. <tel nullFlavor="UNK" use="H"/> should be shown as “Home phone number: unk”.
  • BL / INT / REAL / ST: just show the value directly
  • ED: These shouldn’t be rendered directly. You should use a linkHtml or renderMultimedia to refer to this indirectly
  • CD/CE/CV: if you have original text, then show that. If not, it’ll have to be a display. In a few circumstances, where humans use the codes directly (billing codes!) it’s worth showing the codes directly
  • CS: if you have to show a CS, you should know why, and you should have a representation table that translates the code to something good for a human
  • II: if you have to show one of these (try and avoid it), and it has an extension, show the extension. If it’s just a UUID, don’t show it. Humans hate UUIDs, just like OIDs
  • TEL: I use a template like [use] [type] details, where use is a lookup from the use code (if there is one), type is phone or fax depending on the scheme in the value, and details is the rest of the value. If it’s a web URL, I’ll drop the [type] and show the value as a linkHtml. But that doesn’t always suit (host systems might not know what to do with a mailto: url in a linkHtml)
  • EN: If you have to render this, use [use] [literal] where use is from the use code(s) and literal is the literal form defined in the abstract data types (the definition is obtuse, but the format is usually pretty good)
  • AD: same as EN
  • PQ: render as [value] [unit]. If you have a human display unit that differs from the UCUM unit, you can use that (or translate the UCUM unit to a human display – for instance, for the UCUM unit uL, some people prefer µL and some mandate mcg)
  • RTO: Use the literal form (though use human units if they are available)
  • TS: Depends on the culture. In Australia, we mandated dd-mmm-yyyy hh:nn:[ss][T] (no milliseconds). Other cultures will have their own rules
  • IVL<>: I use the literal form
  • GTS: good luck with this one… hopefully you have some system source text description as well as a GTS. I guess there’s always the literal form.
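To make the nullFlavor and PQ guidance above concrete, here’s a small Python sketch. The display mappings are my own choices, exactly as described in the list – they’re suggestions, not anything mandated:

```python
# Abbreviations for the common nullFlavors, per the guidance above
NULL_FLAVOR_DISPLAY = {"NI": "–", "NA": "n/a", "UNK": "unk"}

def render_pq(value=None, unit=None, null_flavor=None) -> str:
    """Render a PQ as [value] [unit], or an abbreviation for its nullFlavor.
    Falls back to the raw nullFlavor code for the less common flavors."""
    if null_flavor:
        return NULL_FLAVOR_DISPLAY.get(null_flavor, null_flavor)
    # prefer a human display unit where one is defined (e.g. uL -> µL)
    human_units = {"uL": "µL"}
    return f"{value} {human_units.get(unit, unit)}"

print(render_pq(5.0, "uL"))          # 5.0 µL
print(render_pq(null_flavor="UNK"))  # unk
```

The same shape works for the other types: a lookup table for coded values, the literal form for EN/AD/IVL, and so on – always with a clinically aware person signing off on the mappings.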

 

Concern on Allergies in CCDA

In my last post, I referred to an ongoing discussion on an HL7 forum:

On an HL7 mailing list, there’s a rather active discussion that is happening about a particular feature in CCDA (allergy lists). It turns out that one of the bits of the CCDA specification is somewhat obtuse (surprise!), and there’s more than one opinion on what it means and how it’s supposed to be used. I’ll probably end up posting on the specific subject when (if) there’s closure to the subject.

Well, the discussion seems to be coming to some closure (after some 100 emails). But rather than me say anything, I think Brian Weiss has a great summary:

Given that the “concern act” templates in C-CDA R1.1 can’t be used for “health concern tracking” in the broad sense, but at the same time given that they are required to be used with every problem or allergy observation in C-CDA, it is a bit challenging to understand how exactly they should be used and interpreted. That is the focus of this article.

I recommend anyone interested in CCDA read the rest of Brian’s article.

More generally, I wonder how to connect Brian’s great advice – he’s got lots of it – with the implementers. Here’s what CCDA has to say on the subject:

This clinical statement act represents a concern relating to a patient’s allergies or adverse events. A concern is a term used when referring to patient’s problems that are related to one another. Observations of problems or other clinical statements captured at a point in time are wrapped in a Allergy Problem Act, or “Concern” act, which represents the ongoing process tracked over time. This outer Allergy Problem Act (representing the “Concern”) can contain nested problem observations or other nested clinical statements relevant to the allergy concern

But what does that mean? How would you interpret this when producing or consuming CCDA documents? Why would an allergy have multiple observations – are they complementary, or does the most recent replace the earlier ones? Implementers will make different decisions. There are lots of cases like this in CCDA, but my experience is that implementers won’t stop to think that there might be advice to help them out. I posted quite a bit of stuff like this about the Australian PCEHR, and I know that most implementers never saw it until too late.

I guess all we can do is encourage implementers to look for advice, and in the case of CCDA, to read Brian’s work. Which is really the point of this blog post.

 

The importance of examples

On an HL7 mailing list, there’s a rather active discussion that is happening about a particular feature in CCDA (allergy lists). It turns out that one of the bits of the CCDA specification is somewhat obtuse (surprise!), and there’s more than one opinion on what it means and how it’s supposed to be used. I’ll probably end up posting on the specific subject when (if) there’s closure to the subject.

But it’s not the first time, in fact, it’s happened in the Australian PCEHR too – something that seems self evident to the authors of the specification, but actually turns out to have more than one interpretation when the CDA document is being populated (actually, it’s not about CDA, this can happen with any specification). So when you discover that, the natural question is, well, what have the existing implemented systems been doing about this? And very often the answer is… we don’t know, and we have no way to find out either.

Any big program that is integrating multiple systems should include the following in their integration testing / approval process:

  1. Integrating systems should have to produce several different documents, each corresponding to a pre-defined clinical case
  2. The documents should be manually compared against the defined case
  3. The instance examples should be posted in a repository available to all the implementers, along with the assessment notes (because approval is never a binary thing)
    • if the project is a national project, that means a public repository

It’s hard to overstate how important this is once you are in the close-out stages of the project. Want to know if some proposed operation is safe, and it depends on what the feeder systems are doing? You can just go look. Want to know if someone has already done something wrong? You can go look (supposing, of course, that the test cases provide coverage, but often they do).

Examples: they’re really important for a specification, and they’re just as important for a project. If I was allowed to change only one thing about the PCEHR implementation project, I’d pick #3 (we already do 1 and 2).

Question: What’s a conformant #FHIR Server?

Question

Where is a concise set of requirements for a FHIR server? In other words, implementing the RESTful API implies an architectural pattern for interactions, but where are the requirements for a FHIR server? How can we say we are a conformant FHIR server?

Answer

A conformant FHIR server is one that implements the API documented at http://hl7.org/implement/standards/fhir/http.html.

The API is a set of operations performed on a system, a resource type, or a resource. The minimum that a server must do is respond with a conformance statement when asked. The conformance statement can say that the server doesn’t support any operations for any resource. This would be a conformant server, though not a very useful one (this might make sense if it’s serving per-user conformance statements, and the user has no access).

From this point – an empty server – the server can list any resource types it supports, and specify operations from the API as it chooses. A useful server will implement at least read and search for the resource types it supports; implementing any other operations is very much use-case dependent. A conformant server will implement all the operations it lists in the conformance statement, and it will list all the operations it performs in the conformance statement.
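As a sketch of what that means in practice, here’s a fragment that reads a conformance statement and lists the operations a server claims per resource type. The JSON here is a hand-made minimal example (based on my reading of the DSTU-era Conformance structure), not a real server’s statement:

```python
import json

# A minimal, hand-made Conformance statement (not from a real server)
conformance = json.loads("""{
  "resourceType": "Conformance",
  "rest": [{"mode": "server", "resource": [
    {"type": "Patient", "operation": [{"code": "read"}, {"code": "search-type"}]}
  ]}]
}""")

def supported_operations(conf):
    """Map each resource type to the operations the server claims to support."""
    ops = {}
    for rest in conf.get("rest", []):
        if rest.get("mode") != "server":
            continue
        for res in rest.get("resource", []):
            ops[res["type"]] = [op["code"] for op in res.get("operation", [])]
    return ops

print(supported_operations(conformance))  # {'Patient': ['read', 'search-type']}
```

A conformance test would then exercise exactly the listed operations and nothing else – which is the definition of conformance given above.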

Question: #FHIR, complexity, and modeling

Question:

HL7 V3 is known for increasing complexity up to the point where people give up. The RIM seems not adequate enough for modeling the world of clinical information (see. Barry Smith: http://de.slideshare.net/BarrySmith3/hl7-january-2013).

Is FHIR meant to be a cure? I understand that FHIR is about using a RESTful architectural style in the communication of clinical resources, but the resources themselves need to be modeled appropriately. Complexity is not going to go away. Thus, FHIR appears to be another way to slice the elephant, the elephant being the world of clinical information and the need for semantic interoperability. Is there a promise of better modeling of resources in FHIR?

Answer:

Well, this is not an easy question. The complexity we struggle with comes from several different sources:

  1. The inherent complexity in medicine, and the underlying biological processes
  2. The necessity to provide healthcare in the absence of clear information, and at the limits of human knowledge
  3. The wide variability in the processes around providing healthcare – including education and billing, both within and between countries
  4. Attempts to systematise information about all this to support computable information

Clearly, in FHIR, we could only attempt to reduce the complexity of the last point, and we have attempted to. We’ve also tried to limit the impact of the 3rd point in the way we handle extensions – we have to handle complexity, but we want to keep it in the right place. Still, this only deals with point #3.

These are the things we do to try to manage complexity properly:

  • We have a ruthless focus on what systems already do, partly because this is the yardstick of the complexity that people know how to make work
  • We have a strong focus on testing, testing, and more testing – we don’t iterate our designs in the absence of real world implementation experience
  • We use the language of the domain in our resource designs, rather than a standardised language (e.g. RIM) so that people who actually use it can understand what they see quickly
  • We map to other representations, including underlying ontologies and grammars, to ensure that the design is as well grounded in formal definitions and logic as possible (perhaps I should say that we aspire to do this; I think we have a long way to go yet)
  • We maintain our own ontology around how resources should be structured and broken up

Having said that, I don’t think any of this is a magic bullet, and FHIR is not trying to define a coherent information framework that allows for semantic computing – we’re simply defining pragmatic exchange formats based on today’s reality. I think that some of the problems that the RIM and ontologies are trying to solve, we’ve just punted them into the future in order to pursue a more practical course in the shorter term. For that reason, I try not to use the word “model” in association with FHIR, because we’re not “modeling healthcare” in the same sense that the RIM is trying to do, or that Barry is criticising it for.

BTW, having finally linked to Barry’s work from my blog, I think that Barry is largely mistaken in his criticisms of the RIM. It’s not that I think it’s beyond criticism (I have a number of my own criticisms, including this, though I’m not such a fan that I’d pose with it), but I think he misunderstands its intent, what it does do, what it’s not good for, and the boundaries within which it works.

I actually think that modeling healthcare properly is beyond our abilities at this time – anything simple enough for normal people to deal with is too simple to express the things specialists know that computers need to say.

I expect a vigorous debate in the comments on this post… but anyone who comments, please keep my 3 laws of interoperability (It’s all about the people, you can’t get rid of complexity, and you can’t have it all) and my note about semantic interoperability (and #2) in mind.

HL7 Standards and rules for handling errors

It’s a pretty common question:

What’s the required behavior if a (message | document | resource) is not valid? Do you have to validate it? what are you supposed to do?

That question can – and has been – asked about HL7 v2 messaging, v3 messaging, CDA documents, and now the FHIR API.

The answer is that HL7 itself doesn’t say. There’s several reasons for that:

  • HL7’s main focus is to define what is and isn’t valid, and how implementation guides and trading partners can define what is and isn’t valid in their contexts
  • HL7 generally doesn’t even say what your obligations are when the content is valid either – that’s nearly always delegated to implementation guides and trading partner agreements (such as, say, XDS, but I can’t even remember XDS making many rules about this)
  • We often discuss this – the problem is that there’s no right rule around what action to take. A system is allowed to choose to accept invalid content, and most choose to do that (some won’t report an error no matter what you send them). Others, on the other hand, reject the content outright. All that HL7 says is that you can choose to reject the content
  • In fact, you’re allowed to reject the content even if it’s valid with regard to the specification, because of other reasons (e.g. ward not known trying to admit a patient)
  • We believe in Postel’s law:

Be conservative in what you do, be liberal in what you accept from others

  • HL7 doesn’t know what you can accept – so it doesn’t try to make rules about that.

So what’s good advice?

Validation

I don’t think that there’s any single piece of advice on this. Whether you should validate instances in practice depends on whether you can deal with the consequences of failure, and whether you can deal with the consequences of not validating. Here’s an incident to illustrate this point:

We set up our new (HL7 v2) ADT feed to validate all incoming messages, and tested the interface thoroughly during pre-production. There were no problems, and it all looked good. However, as soon as we put the system into production mode, we started getting messages rejected because the clerical staff were inputting data that was not valid. On investigation we found that the testing had used a set of data based on the formally documented practices of the institution, but the clerical staff had informal policies around date entry that we didn’t test. Rejected messages left the applications out of sync, and caused much worse issues. Rather than try to change the clerical staff, we ended up turning validation off

That was a real case. The question here is: what happens to the illegal dates now that you no longer reject them? If you hadn’t been validating, what would have happened? In fact, you have to validate the data – the only question is, do you validate everything up-front, or only what you need as you handle it?

Now consider the Australian PCEHR. It’s an XDS based system, and every submitted document is subjected to the full array of schema and schematron validation that we are able to devise. We do this because downstream processing of the documents – which happens to a limited degree (at the moment) – cannot be proven safe if the documents might be invalid. And we continuously add further validation around identified safety issues (at least, where we can, though many of the safety issues are not things that automated checks can do anything about).

But it has its problems too – because of Australian privacy laws, it’s really very difficult for vendors, let alone 3rd parties, to investigate incidents on site in production systems. The PCEHR has additional rules built around privacy and security which make it tougher (e.g. accidentally sharing the patient identifier with someone who is not providing healthcare services for the patient is a criminal offence). So in practice, when a document is rejected by the PCEHR, it’s the user’s problem. And the end-user has no idea what the problem is, or what to do about it (schematron errors are hard enough for programmers…).

So validation is a vexed question with no right answer. You have to do it to a degree, but you (or your users) will suffer for it too.

Handling Errors

You have to be able to reject content. You might choose to handle failed content in line (let the sender know) or out of line (put it in a queue for a system administrator). Both actions are thoroughly wrong and unsafe. And the unsafest thing about either is that they’ll both be ignored in practice – just another process failure in the degenerate process called “healthcare”.

When you reject content, provide as specific and verbose a message as you can, loaded with context, details, paths, reasons, etc – that’s for the person who debugs it. And also provide a human-readable version for the users, something they could use to describe the problem to a patient (or even a manager).
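As a hypothetical sketch of that two-audience structure (the field names here are mine, not from any specification), an error report might carry both messages side by side:

```python
def make_error_report(path, technical, human):
    """Build an error report with both a detailed technical message
    and a human-readable summary, per the advice above."""
    return {
        "path": path,           # where in the content the problem is
        "technical": technical, # for the person who debugs it
        "display": human,       # for the end user
    }

err = make_error_report(
    "/ClinicalDocument/component/structuredBody/component[2]",
    "schematron assertion failed: expected one <tr> per entry (3 entries, 2 rows)",
    "The procedures section of this document is incomplete; contact the sender.",
)
print(err["display"])
```

In FHIR, the natural home for both parts is an OperationOutcome; in v2, the ERR segment plus free text. Either way, the point is that the two messages serve different readers.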

If you administer systems: it’s really good to be right on top of this and follow up every error, because they’re all serious – but my experience is that administrative teams are swamped under a stream of messages where the signal-to-noise ratio is low, and the real problems are beyond addressing anyway.

 

Question: Should value sets include nullFlavor Values?

Note – mostly, the questions I post on my blog are implementer-focused questions. This one is different – it’s from a discussion between myself and an HL7 vocab work group co-chair. It’s about a particularly unclear and relatively distant corner case in the CDA specification space, but I thought I’d post it here to get the answer into Google. If you don’t even understand the question, don’t worry…

Question:

For a CDA value set, my understanding is that one should only include the domain specific values, and that the null flavors would not appear in the value set. Is this true?

Answer:

This is a little understood corner of our specifications. As a data types guy, I’d say (and have said previously):

We expect people to use the value set machinery to constrain the nullFlavors that can be used for any particular element

This is because nullFlavors are part of the value set of the element – so if you were specifying a value set for an element, you’d therefore specify a value set that included the nullFlavors that were allowed. But I see that, except for CD etc., that would have to be an adjunct to the value domain expressed as a value set. You might say, for example, that this element can have the values 0.0-5.0 mg/dL, or NI or UNK, but nothing else. Clearly, you’ll have to do the two parts of that differently, and the intention was that you’d use a value set for the nullFlavor part.

Only, to my knowledge, no one has ever done this – and my own v3/CDA tools have not done this; you can specify nullFlavors in a value set, but my tools – along with everyone else’s, as far as I know – would then expect that they’d appear like this:

  <code code="NI" codeSystem="2.16.840.1.113883.5.1008"/>

instead of

  <code nullFlavor="NI"/>

(2.16.840.1.113883.5.1008 is the OID for the NullFlavor code system). Concerning these 2 representations, release 2 of the v3 data types says:

The general implication of this is that in a CD or descendant (usually CS), when the code for a nullFlavor is carried in the code/codeSystem (code = “NI” and codeSystem = “2.16.840.1.113883.5.1008”), the CD itself is not null. The CD is only null when its nullFlavor carries this code

That text was written to underline the special meaning of the nullFlavor attribute, but it does serve to say that the first variant above is different to the second, and therefore, it has no defined meaning, and shouldn’t be used.

So my tools would handle the appearance of nullFlavors in the value set wrongly, and I think everyone’s would too, and I’m sure that we never say anything about this in our methodology documentation (e.g. core principles). In fact, looking through the v3 standards, I found:

In many cases, a CD is created from a value set – either a code/code system pair is chosen from a valueSet, or one is not chosen and the CD has the exceptional value of NullFlavor.OTH

Well, I have no idea what that would mean if you put OTH in the value set!

Well, where does this leave us?

  • In practice, including nullFlavors in a value set won’t achieve anything useful
  • We haven’t really said quite what should happen if you do
  • So there’s no real solution for controlling nullFlavors properly

Note that in v2 and FHIR, nullFlavors are not included in the base data type – instead, they are included in the value set explicitly, and represented like any other code.

Update: there’s another factor to consider: the existing value sets are designed on the basis that nullFlavors are still allowed, and that they don’t go in the value set. So deciding now that value sets define which nullFlavors are allowed invalidates all the existing value sets. So really, to use value sets for this, we’d need to define a new binding path. Otherwise, you use constraints, as Sarah demonstrates in her comments below.

 

Testing FHIR Data types for equality

One of the more difficult parts of building an object library is deciding what to do about equality checking. There’s several common operations where you want to do equality checking:

  • Are Objects A and B the same instance?
  • Does object A have exactly the same values as object B?
  • Do Objects A and B refer to the same thing?
  • I am merging lists in an update – is A in a List L of objects? 
  • I am keeping a map (dictionary) – is A in the map?

Even within most of these choices, there’s more subtle issues, such as

  • Do you just want to check the simple properties of this object, or do you want to check all the properties of all the objects (shallow vs deep)
  • Do you have to worry about circularity if you’re doing deep?
  • Does order matter in lists?
  • Do empty properties mean match or no match?

There’s no simple one-size-fits-all answer, so we don’t override equals() methods etc. in the reference implementations.

However, there’s no doubt that for anything but the primitive data types, some properties are identifying, and others are describing, and you’d treat them differently when testing whether two objects refer to the same thing. Here’s an analysis of the different data types:

Primitive Types

  • boolean, integer, decimal, base64Binary, string, date: value = identity, and the comparison is simple
  • uri: in principle, the comparison is simple, but some URIs can include access properties (e.g. FTP username/password) or descriptive labels (tel:), and there may be a need to canonicalize the URL (e.g. http parameter order only matters within a set of parameters with the same name)
  • dateTime: this is complicated by the optional presence of a timezone. Obviously, if two dateTimes are the same, and have the same timezone, then they are the same. If they have different timezones, then they might still be considered the same if they refer to the same instant when converted to UTC (they probably should be considered the same, but it does depend on context). If there’s no timezone, then the correct action depends on what can be assumed about timezones from the local context of use.
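
That three-valued dateTime logic (same / different / can’t tell) can be sketched like this in Python – an illustration, not reference implementation code, and it ignores the separate question of differing precision:

```python
from datetime import datetime, timezone, timedelta

def datetime_same(a: datetime, b: datetime):
    """Three-valued comparison: True, False, or None for "can't tell"."""
    if (a.tzinfo is None) != (b.tzinfo is None):
        return None  # one has a timezone and the other doesn't
    if a.tzinfo is None:
        return a == b  # assumes both come from the same local context
    # compare the instants by converting both to UTC
    return a.astimezone(timezone.utc) == b.astimezone(timezone.utc)

aest = timezone(timedelta(hours=10))
print(datetime_same(datetime(2014, 3, 30, 10, 0, tzinfo=aest),
                    datetime(2014, 3, 30, 0, 0, tzinfo=timezone.utc)))  # True
```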

Attachment

The url and data properties are identifying: if the bits and/or the address are the same, then this must refer to the same record. The contentType, language, size, and hash properties are derived from the content, and may or may not be provided – you’d ignore them when comparing the objects. The title property is descriptive, and should also be ignored, since it might vary between different instances.
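
As a sketch of that rule, treating an Attachment as a plain dict (an assumption for illustration, not the reference implementation’s API):

```python
def attachment_same(a: dict, b: dict):
    """Compare Attachments (as plain dicts) on their identifying properties.
    contentType, language, size, hash and title are derived/descriptive: ignored."""
    if a.get("url") and b.get("url") and a["url"] == b["url"]:
        return True   # same address
    if a.get("data") and b.get("data") and a["data"] == b["data"]:
        return True   # same bits
    if (a.get("url") and b.get("url")) or (a.get("data") and b.get("data")):
        return False  # comparable, but different
    return None       # nothing identifying present on both sides
```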

Coding

The system and code properties are identifying. Both must be present, or else you can’t tell whether they are the same (which shows that comparison for sameness is not a binary outcome – you might be sure that they are, sure that they’re not, or unsure). Whether version counts towards the comparison depends on the semantics of the underlying code system, and could be very complicated indeed.

The display property might vary (if the system defines multiple displays, like SNOMED CT), and shouldn’t count towards the comparison. The primary property is about how a code was used, and doesn’t count towards whether two codings refer to the same content, but may count towards comparison of items that contain a Coding (see next). The same logic applies to the valueSet.
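
Putting the Coding rules together as a sketch (plain dicts, an assumption for illustration; version handling is deliberately omitted, since it’s code-system specific):

```python
def coding_same(a: dict, b: dict):
    """Compare two Codings (as plain dicts) on system + code.
    display, primary and valueSet don't count towards sameness."""
    if not (a.get("system") and a.get("code") and b.get("system") and b.get("code")):
        return None  # can't tell: sameness isn't a binary outcome
    return a["system"] == b["system"] and a["code"] == b["code"]
```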

CodeableConcept

This is the most difficult test – the rules are highly contextual, and also depend on the underlying code system. Generally, you could assume that if two CodeableConcepts have a matching coding – e.g. both of them have multiple codings, and at least one of them is the same – then they are the same concept. However, if you were merging a list of CodeableConcepts, then you’d probably want to check that all the codings match (order doesn’t matter), and, depending on the use case, that primary was the same too.

If two CodeableConcepts only have text, then they would be the same if the text matches (ignoring whitespace, case, and possibly some grammatical characters).
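
A sketch of both rules – match on any shared coding, falling back to text – again on plain dicts (my illustration, not reference implementation code):

```python
def codeable_concept_same(a: dict, b: dict):
    """Same concept if at least one coding matches (system + code); fall back to text."""
    def coding_match(x, y):
        if not (x.get("system") and x.get("code") and y.get("system") and y.get("code")):
            return None
        return x["system"] == y["system"] and x["code"] == y["code"]

    results = [coding_match(x, y) for x in a.get("coding", []) for y in b.get("coding", [])]
    if True in results:
        return True
    if a.get("coding") and b.get("coding"):
        return None if None in results else False
    # text-only comparison, ignoring whitespace and case
    if a.get("text") and b.get("text"):
        strip = lambda s: "".join(s.lower().split())
        return strip(a["text"]) == strip(b["text"])
    return None
```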

Quantity

Logically, two quantity values are the same if they have the same value and units. Except that there are several issues around this:

  • What to do if the precisions of the values vary (e.g. 1.3 vs 1.299) depends on the context, and there’s no general answer
  • If a code for the units is provided, then you compare based on that. Subsumption or equivalence testing is required (rather than straight string matching)
  • If there’s no code for the units, then you can compare the human readable unit, but this is fragile (ug vs µg)
  • For extra points, you can do comparison based on canonical units if a ucum code is provided
  • If a comparator is present, it can’t be ignored: is <4 the same as <3? There’s a set of cases here where you might be unsure whether they are the same
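
A sketch that covers some of these points (plain dicts again; precision handling and UCUM canonicalisation are deliberately left out, so treat it as a starting point rather than a complete answer):

```python
def quantity_same(a: dict, b: dict):
    """Compare Quantities (as plain dicts); three-valued (True/False/None)."""
    if a.get("comparator") != b.get("comparator"):
        return None  # <4 vs <3 (or vs a plain 4): can't be sure either way
    if a.get("system") and a.get("code") and b.get("system") and b.get("code"):
        if (a["system"], a["code"]) != (b["system"], b["code"]):
            return False  # strictly, subsumption/equivalence testing belongs here
    elif a.get("units") and b.get("units"):
        if a["units"] != b["units"]:
            return False  # fragile: "ug" vs "µg" won't match
    else:
        return None  # no units to compare on
    return a.get("value") == b.get("value")
```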

Range

At last – something simple! Two ranges are the same if the low and high properties are the same (including if they are absent).
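
Even so, the Range comparison composes with whatever Quantity comparison you settled on. A sketch, with the Quantity comparison passed in as a parameter (my illustration; the three-valued convention is an assumption):

```python
def range_same(a: dict, b: dict, quantity_same):
    """Two Ranges match if low and high agree; absent on both sides counts as agreeing.
    quantity_same is whatever Quantity comparison is in use."""
    for prop in ("low", "high"):
        x, y = a.get(prop), b.get(prop)
        if (x is None) != (y is None):
            return False       # present on one side only
        if x is not None:
            result = quantity_same(x, y)
            if result is not True:
                return result  # False (or None, for "unsure") propagates
    return True
```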

Ratio

This is also simple – two ratios are the same if the numerator and denominator properties are the same (including if they are absent). Note that 1:2 is not the same as 2:4 (if that were logically true, then a straight quantity should have been used instead).

Period

In principle, this is also simple: two periods are the same if the start and end properties are the same (including if they are absent). However, the issues around precision of the dates are tricky, as is comparing timezones (see the notes above).

SampledData

Two SampledData values would be the same if their origin, period, dimensions and data match, after adjusting for the factor. lowerLimit and upperLimit probably matter too.

There’s not a lot of call for matching SampledData values – whether they match would be driven by the surrounding data wherever they are used.

Identifier

The system and value properties are identifying. Both must be present, or else you can’t tell whether they are the same.

The use, label, period and assigner properties are all descriptive, and don’t count toward comparison of whether this is the same identifier.
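
A sketch of the Identifier rule (plain dicts; the identifier system URL in the test of use below is hypothetical):

```python
def identifier_same(a: dict, b: dict):
    """Compare Identifiers (as plain dicts) on system + value.
    use, label, period and assigner are descriptive: ignored."""
    if not (a.get("system") and a.get("value") and b.get("system") and b.get("value")):
        return None  # can't tell
    return a["system"] == b["system"] and a["value"] == b["value"]

# identifier_same({"system": "http://example.org/mrn", "value": "12345", "use": "official"},
#                 {"system": "http://example.org/mrn", "value": "12345", "use": "usual"})
```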

Human Name

Given the huge variation in how names are used, and how systems track how names are used, there’s no one-size-fits-all method. However, the following is a good base to start from:

  • The family name list must match (order dependent)
  • The given name list must match (order dependent), though you might terminate the match checking when the shorter list ends (or if the shorter list only has one name)
  • Prefix and Suffix usually don’t count
  • Period doesn’t matter, nor does use
  • text should be ignored, unless there are no name parts, in which case you match on that. For advanced points, you can compare text and parts, but that’s hard and potentially risky
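
That base can be sketched as (plain dicts again, with the text handling left out):

```python
def name_same(a: dict, b: dict) -> bool:
    """Base comparison: family is order-dependent; given is compared up to the
    shorter list; prefix, suffix, period, use and text are all ignored here."""
    if a.get("family", []) != b.get("family", []):
        return False
    ga, gb = a.get("given", []), b.get("given", [])
    n = min(len(ga), len(gb))  # terminate when the shorter list ends
    return ga[:n] == gb[:n]
```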

Address

Address is similar to name, but the order of the parts doesn’t matter:

  • The line list must match. Order is not important
  • The city, state, and zip must match. In the absence of a city and zip, there’s no match. State doesn’t matter if the country doesn’t have states
  • There may be a default country, based on either system location, or patient information
  • Period doesn’t matter, nor does use
  • text should be ignored, unless there are no address parts, in which case you match on that. For advanced points, you can compare text and parts, but that’s hard and potentially risky
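
A sketch of the base Address comparison (plain dicts; country defaulting, the no-states case, and text handling are left out):

```python
def address_same(a: dict, b: dict) -> bool:
    """Base comparison: line order-independent; city and zip must be present
    and match; use, period and text are ignored here."""
    if sorted(a.get("line", [])) != sorted(b.get("line", [])):
        return False
    if not (a.get("city") and a.get("zip") and b.get("city") and b.get("zip")):
        return False  # in the absence of a city and zip, there's no match
    for prop in ("city", "state", "zip"):
        if a.get(prop) != b.get(prop):
            return False
    return True
```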

Contact

The system and value properties are identifying. Both must be present, or else you can’t tell whether they are the same (well, in the absence of a system, you might be able to).

The use and period properties are descriptive, and don’t count toward comparison of whether this is the same contact.

Schedule

Technically, two schedules are the same if they describe the same set of times, but since they could describe those times differently, that’s a pretty complicated test. In practice, there’s not a lot of call for sophistication here, and it suffices to simply compare the sub-properties. Notionally, the order of events would not matter, but these would usually be in order anyway.
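
A sketch of that pragmatic sub-property comparison (plain dicts, with events compared as a set so order doesn’t matter; the event and repeat property names follow the text above):

```python
def schedule_same(a: dict, b: dict) -> bool:
    """Pragmatic comparison of sub-properties, rather than the set of times
    the schedule actually describes."""
    as_set = lambda events: {tuple(sorted(e.items())) for e in events}
    return (as_set(a.get("event", [])) == as_set(b.get("event", []))
            and a.get("repeat") == b.get("repeat"))
```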

Resources

The rules differ for each resource. If there’s enough interest, I might do this analysis for some of the resources (comments please).

Question

Is this kind of knowledge worth embedding in the resource definitions as meta-data in future versions?