Question: How to store a CDA document in a relational database

A question (by the ask me a question link above): I am trying to design a RDBMS to store the CDA documents, I have couple of questions

  1. If I create tables using the HMD, is that a good approach to do that?
  2. How to store the data types eg. CD in database, for example create a table CS and link it to Act or any table where it is used.

Well, taking these questions in turn:

Should I create the tables using the HMD?

Umm, well, you can. Sort of. But the HMD is abstract. That means that it’s an in principle statement of the content of a CDA document, not an in practice statement. You have to account, in addition to what’s in the HMD, for nullFlavor, realmCode and templateId, though only the latter is commonly encountered. Also, there’s the mapping from the abstract data type defined in the HMD to the actual way the XML works (one day I’ll have time to do a post about the abstract / ITS data type mapping). And you have to bridge between these models in your parser. But you can do this – it’s just work. But is that the best way?

For me, no, I’d use the schema. It’s just more tangible and final, and there’s too many pitfalls for a non-hl7 guru to fall into creating the tables off the HMD. (well, to tell the truth, I used the MIF, which is kind of like the HMD, but very definitely an insider only way).

How to store the data types

Well, if you do it from the schema, that answers your question – whatever way the schema -> RDMS schemata tool does the conversion, and don’t fiddle with it. But if you insist on doing it any other way, then you’ll become an expert in the abstract/its mapping. As I said, it can be done. I personally kill the recursion in the CD data type, and normalise it – a table for code systems, a table for codesystem references, a table for concepts (original text + code system references). But whether this is good for you depends on what you’re tying to do.

More generally, the process of splitting a CDA document into a fully structured relational database – why would you do that? It’s hard to query the individual parts without looking at the context in which they appear (you kind of query for clinical statement parts as the smallest level of granularity, and think about things like moodCode). And are you going to break the narrative into database parts too?

Personally, I’d keep the documents whole, and build specific secondary indexes from them based on the particular pieces I want to do queries against.





  1. Rene Spronk says:

    In the whitepaper we purposefully stayed away from discussing this type of persistence.

    If you only ever have to support CDA you could define a database scheme for it – the devil (as Grahame points out) is in persisting the data types.

    CDA also lends itself rather well to the use of a hybrid database model: partly structured ER for those things you need in searches, partly XML-blobs. Sections and/or individual entries are obvious candidates to be stored as XML-blobs.

  2. Reuben Daniels says:

    I agree with Grahame – store the document whole and index on certain fields (in particular header fields).

    I imagine the recent RDBMS enhancements for storing and querying XML documents in an RDMS (see may be helpful in this regard.

  3. Dhiren says:

    Thank you Grahame!,it is really helpful.

Leave a Reply

Your email address will not be published. Required fields are marked *

question razz sad evil exclaim smile redface biggrin surprised eek confused cool lol mad twisted rolleyes wink idea arrow neutral cry mrgreen


%d bloggers like this: