New CDA Stylesheet

From Keith’s blog:

A few weeks ago, Josh Mandel identified problems with the sample CDA Stylesheet that is released with C-CDA Release 1.1 and several other editions of the CDA Standard and various implementation guides. There are some 30 variants of the stylesheet in various HL7 standards and guides. A patched version of the file has been created and is now available from HL7’s GForge site here.

Note for Australian users: if you use the NEHTA authored stylesheet, this release doesn’t affect you.

#FHIR: Subscribing to a server

This question has come up several times lately, so I thought I’d show how to subscribe to a server.

Below is pseudo code for subscribing to a server. It assumes that you have [base] (the address of the server), a local store for settings (could be an ini file or a database), a client that can do http, and a bundle class that matches the semantics of a Bundle (e.g. an atom feed or its JSON equivalent) and can load from the http outcome. The logic is pretty simple:

  • get the latest _history from the server
  • follow the next links till they run out, aggregating the content
  • store the feed.updated from the first response to use for next time you do it

procedure Subscribe(base)
  client = connectToServer(base) // and maybe get the conformance statement
  loop
    sleep(however long you want)
    feed = downloadUpdates(client, base)
    foreach entry in feed.entries *working backwards* // see note below
      process entry however you want
  until it's time to stop

procedure downloadUpdates(client, base)
  lasttime = store.getProp('lastTime') // same key as the setProp call below
  master = new Bundle()
  next = null
  i = 1
  do
    log('Downloading updates (page '+i+')')
    if (next != null)
      feed = new Bundle(client.fetch(next))
    else if (lasttime != null)
      feed = new Bundle(client.fetch(base+'_history?_since='+lasttime))
    else
      feed = new Bundle(client.fetch(base+'_history'))
    master.entries.addAll(feed.entries)
    if (next == null) // only true for the first page
      store.setProp('lastTime', feed.updated)
    next = feed.link['next']
    i++
  until next == null
  return master

Notes:

  • for bonus marks, you could add a _count parameter and ask for as many resources per page from the server as you can get – that reduces the load on the network a little, and the lag time some
  • The outcome of the fetching loop is a single bundle that lists all the updates on the server since the last time in reverse chronological order. Whether you have to process in the correct order depends on what you are doing
  • If you want to see a real implementation of this, look at the class org.hl7.fhir.sentinel.SentinelWorker in the FHIR SVN (see under build/implementations/java)
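
For readers who prefer something they can run, here is a minimal Python sketch of the download loop in the pseudo code above. It is not the Sentinel implementation. The names `download_updates`, `fetch` and `store` are made up for illustration, and the bundle is assumed to arrive already parsed into a dict with `updated`, `entries` and `links` keys, which is a simplification of a real Bundle.

```python
from typing import Callable, Optional
from urllib.parse import quote


def download_updates(fetch: Callable[[str], dict], store: dict, base: str) -> list:
    """Collect every entry the server reports since the last stored time."""
    last_time: Optional[str] = store.get("lastTime")
    url: Optional[str] = base + "_history"
    if last_time is not None:
        # Percent-encode the timestamp so characters like '+' and ':' survive.
        url += "?_since=" + quote(last_time, safe="")

    entries = []
    first_page = True
    while url is not None:
        feed = fetch(url)  # hypothetical: returns the parsed bundle as a dict
        entries.extend(feed.get("entries", []))
        if first_page:
            # Only the first page carries the time to use for the next run.
            store["lastTime"] = feed["updated"]
            first_page = False
        url = feed.get("links", {}).get("next")  # None when pages run out
    return entries
```

To process the updates oldest first, iterate over `reversed(download_updates(...))`. Passing `fetch` in as a function keeps the paging logic separate from whatever http client you use.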

#FHIR FAQs and Knowledge Base articles

HL7 has published a set of FAQs and knowledge base articles about FHIR, that cover questions about the following aspects of the FHIR specification:

  • Scope and Relationships
  • Specification
  • Architecture
  • Tooling and Support
  • Resource Design
  • Implementation Approach
  • Codes and Terminology
  • Implementation Details
  • Domain Questions
  • Security

The questions were mined from the FHIR email list, the implementer’s skype channel, the various blogs about FHIR, and the questions that get asked of the project leads by email and in person.

This is a very good source of information about FHIR, and highly recommended for FHIR users of all kinds to read. Note that access to this information resource is a member benefit – you can only get to it if you have a member login on the HL7 website.

 

 

Question: HL7 Open source libraries

Question:

I work on an EHR, and I want to use HL7 to communicate with different systems in a medical environment. For that I found several APIs (HAPI, nHapi, Javasig, Everest), but I don’t know which is best or what the selection criteria are.

Answer:

Well, what are your selection criteria? The first question is whether you are doing HL7 v2 or v3. Almost certainly, it’s v2. What language are you using (or want to use)?

Here’s some information about the libraries:

  • HAPI: an open-source, object-oriented HL7 2.x parser for Java – it also includes a number of other utilities for using HL7 v2
  • NHAPI: NHapi is a port of the original project HAPI. NHapi allows Microsoft .NET developers to easily use an HL7 2.x object model
  • Javasig: A library of code written to process V3 messages based on the MIF definitions
  • Everest: Designed to ease the creation, formatting, and transmission of HL7v3 structures with remote systems. Supports CDA R2 and Canadian messages

My evaluations:

  • HAPI – a solid well established community, with runs on the board and reliable code. I’d be happy to use this
  • nHAPI – as far as I can tell, not well established, and given it’s for an HIE, I question the long term viability of the community. Unlike HAPI, I never hear of people actually using this
  • Javasig: this is dead – note that the only link I found was to a web archive. You’d have to be desperate to try to use it, though the maintainers might get interested if you did
  • Everest: this has a community of users in Canada, but I’ve not heard of any use outside there. I’m not sure to what degree Mohawk are supporting it (I’ll ask)

You should consider one of the paid libraries – given the number of hours you’re going to invest in the work (1000s, I bet), a few thousand for a software library and related tools is peanuts. There are a lot of good choices there (btw, I have one of my own, which doesn’t cost that much).

 

Questions about Questionnaire

One of the themes of the connectathon that will be held in a few weeks in Phoenix is using the questionnaire resource. That’s generated lots of attention to it, and a number of questions about how to use a questionnaire to drive a form to collect answers from a person. For the purposes of the connectathon, the answers are here:

At the upcoming May HL7 Workgroup Meeting in Phoenix, we will organize the 6th FHIR Connectathon. Track 2 will focus on Questionnaires, so both client and server developers can test their skills at creating and supporting the Questionnaire resource

SQL Injection attacks against HL7 interfaces

One of the questions that several people have asked me in the discussions triggered by the report of CDA vulnerabilities is whether there is any concern about SQL injection attacks in CDA usage, or other HL7 exchange protocols.

The answer is both yes and no.

Healthcare Applications are subject to SQL injection attacks.

An SQL injection attack is always possible anywhere that a programmer takes input from a user, and constructs an SQL statement by appending it into a string, like this:

connection.sql = "select * from a-table where key = '"+value_user_entered+"'";

There are any number of equivalents in programming languages, all variations on this theme. If the value the user entered includes the character ', then this will cause an error. It may also cause additional SQL to run, as memorably captured in this XKCD cartoon:

Exploits of a Mum

There are several ways to prevent SQL injection attacks:

  • check (sanitise) the inputs so that SQL can’t be embedded in the string
  • use parameterised SQL statements
  • escape the SQL parameters (connection.sql = "select * from a-table where key = '"+sql_escape(value_user_entered)+"'";)
  • ensure that the underlying db layer will only execute a single sql command

Doing all of these is best, and the last is the least reliable (cause it’s the easiest). Because none of these actions is as convenient as the vulnerable approach, SQL injection attacks continue to be depressingly common. And that’s on web sites.
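
To make the parameterised option concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The `patient` table, its columns and the `find_patient` function are invented for the example; the point is only that the value travels separately from the SQL text.

```python
import sqlite3


def find_patient(conn: sqlite3.Connection, patient_id: str) -> list:
    # The ? placeholder keeps the value out of the SQL text, so any quotes
    # in patient_id are treated as data rather than as SQL.
    cur = conn.execute("SELECT name FROM patient WHERE id = ?", (patient_id,))
    return cur.fetchall()


# Illustrative in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO patient VALUES (?, ?)", ("p1", "O'Brien"))

# A hostile value that would break the string-concatenation version.
hostile = "x'; DROP TABLE patient; --"
rows = find_patient(conn, hostile)  # matches nothing, and the table survives
```

Note that the name with an apostrophe goes in without any special handling, which is the same property that protects against the hostile value.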

My experience is that healthcare interfaces tend to be written very differently to websites that are going to run in public. They are not written expecting hostile users – and by and large, they don’t get them, either. And they are always written to a budget, based on estimates that don’t include any allowance for security concerns. That’s right – the only time I have seen security feature in interface costings is for the PCEHR.

So I assume that the default state is that healthcare interfaces are potentially subject to SQL injection attacks due to programmer efficiency (anyone who calls it ‘laziness’ is someone who’s never had to pay for programming to be done).

Healthcare Applications are not so subject to SQL injection attacks.

However, in practice, it turns out that it’s not quite such a concern as I just made it sound. That’s for several reasons.

I was specifically asked about CDA documents and SQL injection. Well, CDA contents are rarely processed into databases. However the XDS headers that are used when CDA documents are exchanged are often processed into databases, and these are usually extracted from the CDA contents. So any XDS implementation is a concern. The PCEHR is an XDS implementation, though programmers never needed to wonder whether it sanitized its inputs. A side lesson from the PCEHR story is that name fields are not generally subject to SQL injection attacks, since it’s not too long before a name with an apostrophe trips them over if they are (and that’s true on more than just healthcare interfaces).

Really, SQL injection attacks are most likely in the context of an HL7 v2 interface.

Some healthcare applications sanitize their inputs. But validation has its challenges.

However the real reason that healthcare applications are not so subject to SQL injection attacks is operational. An SQL injection attack is iterative. I think this is one of the best demonstrations of how to do one:

This is an ASP.NET error and other frameworks have similar paradigms but the important thing is that the error message is disclosing information about the internal implementation, namely that there is no column called “x”. Why is this important? It’s fundamentally important because once you establish that an app is leaking SQL exceptions

The key thing here is that a general pre-requisite for an SQL injection attack is that the application leaks interesting information in errors, which the attacker can then exploit iteratively. Well, we’re safe there, because interfaces rarely return coherent errors ;-)

Actually, that’s not really true. It’s easy, if you can craft inputs to the interface, and capture outputs, to iterate, even if you don’t get good information. Mostly, typical users don’t get such access; they can enter exploit strings into an application, but the error messages they might generate downstream will go into some log somewhere.

That means that the interface error logs need protecting, as does access to the interfaces (in fact, this is the most common security approach for internal institutional interfaces – to restrict the IP addresses from which they can be connected to).

Generally, if you have access to the system logs and the interfaces, you’re very likely to have access to the databases anyway. Hence, SQL injection attacks aren’t such a problem.

But really, that’s pretty cold comfort, especially given that many attacks are made by insiders with time and existing access on their side. I’m sure that it’s only a matter of time till there’s a real world exploit from SQL injection.

Security cases for EHR systems

Well, security is the flavor of the week. And one thing we can say for sure is that many application authors and many healthcare users do not care about security on the grounds that malicious behaviour is not expected behaviour. One example that sticks in my mind is one of the major teaching hospitals in Australia that constantly had a few keys on the keyboard wear out early: the doctors had discovered that the password “963.” was valid, and could be entered by running your finger down the numeric keypad, so they all used this password.

So here’s some examples of real malicious behaviour:

Retrospectively Altering Patient Notes

From Doctors in the Dock:

A Queensland-based GP who retrospectively changed his notes to falsely claim that a patient refused to go to hospital, then submitted those notes as evidence in an inquest, has been found to have committed professional misconduct. He has been suspended from practising medicine for 12 months (six months, plus another six if he breaches conditions within three years) and banned from conducting most surgical procedures except under supervision for at least a year.

This is very similar to a case here in Melbourne more than a decade ago, where a doctor missed a pathology report that indicated cancer was recurring in a patient in remission. By the time the patient returned for another check-up, it was too late, and she died a few weeks later. The doctor paid an IT student to go into the database and alter the report to support his attempt to place blame on the pathology service. Unfortunately for him, the student missed the audit trail, and the lab was able to show that it wasn’t their fault. (I couldn’t find any web references to it, but it was big news at the time.)

Both these involve highly motivated insider attacks. As does the next one, which is my own personal story.

Altering Diagnostic Reports

Late one evening, in maybe about 2001, I had a phone call from the owner of the company I used to work for, and he told me that the clinical records application for which I was lead programmer had distributed some laboratory reports with the wrong patient’s data on them. This, of course, would be a significant disaster, and we had considerable defense in depth against this occurring. But since it had, the customer – a major teaching hospital laboratory service – had a crisis of confidence in the system, and I was to drop everything and make my way immediately to their site, where I would be shown the erroneous reports.

Indeed, I was shown a set of reports – photocopies of originals, which had been collected out of a research trial record set, where what was printed on the paper did not match what was currently in the system. For nearly 48 hours, I investigated – prowled the history records, the system logs, the databases, the various different code bases, did joint code reviews, anything. I hardly slept… and I came up with nothing: I had no idea how this could be. We had other people in the hospital doing spot reviews – nothing. Eventually I made my way to the professor running the trial from which the records came, and summarised my findings: nothing. Fail. As I was leaving his office, he asked me casually, “had I considered fraud”? Well, no…

Once I was awake again, I sat and looked afresh at the reports, and I noticed that the wrong data on the set of reports I had was all internally shuffled: there was no data from other reports. So then I asked for information about the trial: it was a multi-center prospective interventional study of an anti-hyperlipidaemic drug. And it wasn’t double-blind, because the side-effects were too obvious to ignore. Once I had the patient list, and I compared real and reported results, I had a very clear pattern: the fake results showed that patients on the drug had a decrease in cholesterol readings, and patients that were on placebo didn’t. The real results in the lab system showed no significant change in either direction for the cohort (but significant changes up and down for individual patients). For other centers in the trial, there was no significant effect of the drug at the cohort level.

So the evidence was clear: someone who knew which patient was on drug or placebo had been altering the results to meet their own expectations of a successful treatment (btw, this medication is a successful one, in routine clinical use now, so maybe it was the cohort, the protocol or dosage, but it wasn’t effective in this study). How were the reports being altered? Well, the process was that they’d print the reports off the system, and then fax them to the central site coordinating the trial for collation and storage. Obviously, then, the altering happened between printing and faxing, and once I got to this point, I was able to show faint crease marks on the faxes, and even pixel offsets, where the various reports had literally been cut and pasted together.

I was enormously relieved for myself and my company, since that meant that we were off the hook, but I felt sorry for the other researchers involved in the trial, since the professor in charge cancelled the whole trial (inevitable, given that the protocol had been violated by an over eager but dishonest trial nurse).

What lessons are there to be learnt from all this? I know this is a tiny case size (3), but:

  • Many/most attacks on a medical record system will come from insiders, who may be very highly motivated indeed, and will typically have legitimate access to the system
  • In all cases, the audit trail was critical. Products need a good, solid, reliable, and well armoured audit log. Digital signatures where the vendor holds the private key are a good idea
  • Digital signatures on content and other IT controls can easily be subverted by going with manual work arounds. Until paper goes away, this will continue to be a challenge

If any readers have other cases of actual real world attacks on medical record systems, please contribute them in the comments, or to me directly. I’m happy to publish them anonymously.

Further Analysis of CDA vulnerabilities

This is a follow up to my previous post about the CDA associated vulnerabilities, based on what’s been learnt and what questions have been asked.

Vulnerability #1: Unsanitized nonXMLBody/text/reference/@value can execute JavaScript

PCEHR status: current approved software doesn’t do this. Current schematrons don’t allow this. CDA documents that systems receive from the PCEHR will not include nonXMLBody.

General status: CDA documents that systems get from anywhere else might. In fact, you might get HTML to display from multiple sources, and how sure are you that the source and the channel can’t be hacked? This doesn’t have to involve CDA either – AS 4700.1 includes a way to put XHTML in a v2 segment, and there’s any number of other trading arrangements I’ve seen that exchange HTML.

So what can you do?

  • scan incoming HTML for active content (using schematrons, or a modified schema)
  • don’t view incoming HTML in the clinical system – use the user’s standard sandbox (e.g. the browser)
  • change the protocol to not exchange raw HTML directly

Yeah, I know that this advice is wildly impractical. The privilege of the system architect is to balance between what has to be done, and what can’t be done ;-)

FHIR note: FHIR exchanges raw HTML like this. We said – for this reason exactly – that no active content is allowed. We’re going to release tightened up schema and schematron, and the public test servers are tightening up on this.

Vulnerability #2: Unsanitized table/@onmouseover can execute JavaScript

PCEHR status: these documents cannot be uploaded to the PCEHR, are not contained in the PCEHR, and the usual PCEHR stylesheet is unaffected anyway.

General status: CDA documents that you get from anywhere else might include these attributes. If the system isn’t using the PCEHR stylesheet, then it might be susceptible.  Note: this also need not involve CDA. Anytime you get an XML format that will be transformed to HTML for display, there might be ways to craft the input XML to produce active content – though I don’t know of any other Australian standard that works this way in the healthcare space

So what can you do?

Vulnerability #3: External references, including images

PCEHR status: There’s no approved usage of external references in linkHtml or referenceMultimedia, but use in hand written narrative can’t be ruled out. Displaying systems must ensure that this is safe. There will be further policy decisions with regard to the traceability that external references provide.

General Status: any content that you get from other systems may include images or references to content held on external servers, whether CDA, HTML, HL7 v2, or even PDF. If you are careless with the way you present the view, significant information might leak to the external server, up to and including the user’s password, or a system level authorization token, or the user’s identity. And no matter how careful you are, you cannot prevent some information leaking to the server – the user’s network address, and the URL of the image/reference. A malicious system could use this to track the use of the document it authored – but on the other hand, this may also be appropriate behaviour.

So what can you do?

  • Never put authentication information in the URL that is used to initiate display of clinical information (e.g. internally as an application library)
  • Never let the display library make the network call automatically – make the request in the application directly, and control the information that goes onto the wire explicitly
  • If the user clicks an external link, make it come up in their standard browser (using ShellExecute on Windows, or the equivalent), so whatever happens doesn’t happen where the code has access to the clinical systems
  • The user should be warned about the difference between known safe and unknown content – but be careful, don’t nag them (as the legal people will want, every time; but soon the users click the warning by reflex, and won’t even know what it says)
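
As a rough illustration of "control the information that goes onto the wire explicitly", here is a sketch that builds the outgoing request by hand with Python’s standard urllib. The function name `build_external_request` and the `example-viewer` user agent are made up, and stripping embedded credentials and fragments is my assumption about what a cautious viewer would want.

```python
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def build_external_request(url: str) -> urllib.request.Request:
    """Build a GET request that carries nothing the application did not choose."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("unsupported scheme: " + parts.scheme)
    host = parts.hostname
    if not host:
        raise ValueError("no host in URL")
    if ":" in host:
        host = "[" + host + "]"  # put brackets back on IPv6 literals
    # Rebuild the address without any user:password@ prefix or #fragment.
    netloc = host if parts.port is None else host + ":" + str(parts.port)
    clean = urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))
    # Every header is listed here. No Referer, Cookie or Authorization is
    # set, and urllib's default opener does not add those by itself.
    return urllib.request.Request(
        clean, headers={"User-Agent": "example-viewer"}, method="GET"
    )
```

The application would pass the result to `urllib.request.urlopen` (with a timeout) and hand the bytes to the display layer, rather than letting an embedded browser control fetch the image itself.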

Final note: this is being marketed as a CDA exploit. But it’s an exploit related to the ubiquity of HTML and controls, and it’s going to be more and more common…

Update – General mitigating / risk minimization approaches

John Moehrke points out that there’s a series of general foundations that all applications should be using, which mitigate the likelihood of problems, and/or the damage that can be caused.

Authentication

Always know who the other party (or parties) your code is communicating with is, establish their identity well, and ensure communications with them are secure. If the party is a user using the application directly, then securing communications to them isn’t hard – then the focus is on login. But my experience is that systems overlook authenticating other systems that they communicate with, even if they encrypt the communications – which makes the encryption largely wasted (see “man in the middle“). Authenticating your trading partners properly makes it much harder to introduce malicious content (and is the foundation on which the PCEHR responses above rest). Note, though, the steady drum of media complaints about the administrative consequences of properly authenticating the systems the PCEHR interacts with – properly authenticating systems is an administrative burden, which is why it’s often not done.

Authorization

A fundamental part of application design is to manage authorization properly, and to do so throughout the application. For instance, don’t assume that you can enforce all proper access control by managing what widgets are visible or enabled in the UI; eventually additional paths to execute functionality will need to be provided, in order to support some kind of workflow integration/management. Making the privileges explicit in operational code is much safer. And it means that rogue code running in the UI context doesn’t have unrestricted access to the system (though a hard break like between client/server is required to really make that work).

Audit logging

Log everything. With good metadata. Then, when there is belief that the system is penetrated, you can actually know whether it is or not. Make sure that the security on the system logs is particularly strong (no point keeping them, but making it easy for the malicious code to delete them). If nothing else, this will help trace an attacker, and prevent them from making the same attack again because no one can figure out what they did.
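
One common way to make a log harder to alter quietly is a hash chain, where each entry records the hash of the one before it. That technique is my suggestion here, not something John prescribed, and the field names and functions below are invented for the sketch.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Serialise with sorted keys so the same content always hashes the same.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, who: str, action: str, when: str) -> None:
    """Add an entry that is bound to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"who": who, "action": action, "when": when, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify(log: list) -> bool:
    """Return False if any entry was altered or removed from the middle."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("who", "action", "when", "prev")}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

This only detects tampering; it doesn’t prevent it. Someone who can rewrite the whole log can recompute the chain, and chopping entries off the end goes unnoticed, so the latest hash needs to be kept somewhere the attacker can’t reach.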

Note: none of this is healthcare specific. It’s all just standard application design, but my experience is that a lot of systems in healthcare aren’t particularly armored against assault because it just doesn’t happen much. But it’s still a real concern.

CDA Security Issues and implications for FHIR

Overnight, Josh Mandel posted several security issues with regard to CDA:

This blog post describes security issues that have affected well-known 2014 Certified EHRs. Please note that I’ve already shared this information privately with the Web-based EHR vendors I could identify, and I’ve waited until they were able to investigate the issues and (if needed) repair their systems.

Josh identified 3 issues:

  1. Unsanitized nonXMLBody/text/reference/@value can execute JavaScript
  2. Unsanitized table/@onmouseover can execute JavaScript
  3. Unsanitized observationMedia/value/reference/@value can leak state via HTTP Referer headers

So how do these relate to FHIR?

  1. The nonXMLBody attack as described by Josh is an accident in the stylesheet. The real problem is that a nonXMLBody can contain unsanitized HTML of any type, and there’s no way to show it faithfully and securely.
     Impact for FHIR: none. FHIR already includes native HTML, and active content is not allowed.
  2. This and lots of other attacks are possible – though illegal – in FHIR.
     Impact for FHIR: we need to tighten up the base schema and the reference implementations to reject this content (and also document that this is a wonderful attack vector, and systems must validate the resources).
  3. This is by far the hardest of the issues to deal with. As an outright attack, any system processing resources must ensure that http headers – including the http referer header – don’t leak when the image is retrieved. But there’s a more pernicious issue – if you author a CDA document, or a resource, you can insert a uniquely identified image reference in the source that means you get notified every time someone reads the document – kind of a doubleclick.net for health records. This is a different kind of attack again. Protecting against the second kind of abuse is only really possible if you whitelist the set of servers that are allowed to host images you will retrieve. My suggested whitelist: empty. Don’t allow external references at all. In practice, that’s probably too draconian.
     Impact for FHIR: provide advice around this issue. We can’t make it illegal to provide external references, but all implementations should start at that place.
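
A whitelist check of this kind can be very small. Here is a minimal Python sketch, with the whitelist empty by default as suggested above; `may_fetch_image` and `ALLOWED_IMAGE_HOSTS` are names invented for the example, and matching on the exact host name is an assumption about how strict you want to be.

```python
from urllib.parse import urlsplit

# Empty by default: no external server is trusted until someone adds it.
ALLOWED_IMAGE_HOSTS: frozenset = frozenset()


def may_fetch_image(url: str, allowed_hosts=ALLOWED_IMAGE_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is on the whitelist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. broken IPv6 brackets
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname  # already lower-cased by urlsplit
    return host is not None and host in allowed_hosts
```

The check compares the parsed host rather than searching the URL text, so a trusted name hidden in the path or query string of a hostile URL doesn’t get through.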

Whitelist of allowed HTML elements and attributes in FHIR

I am updating the FHIR xhtml schema, the Java and Pascal reference implementations, and my server, so that they only accept html elements with these names:

  • p
  • br
  • div
  • h1
  • h2
  • h3
  • h4
  • h5
  • h6
  • a
  • span
  • b
  • em
  • i
  • strong
  • small
  • big
  • tt
  • dfn
  • q
  • var
  • abbr
  • acronym
  • cite
  • blockquote
  • hr
  • ul
  • ol
  • li
  • dl
  • dt
  • dd
  • pre
  • table
  • caption
  • colgroup
  • col
  • thead
  • tr
  • tfoot
  • th
  • td
  • code
  • samp

The following attributes will be allowed:

  • a.href
  • a.name
  • *.title
  • *.style
  • *.class
  • *.id
  • *.colspan (td.colspan, th.colspan)
  • img.src

Have I missed anything?
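
For anyone wanting to try the lists out, here is a minimal Python sketch of a checker. It is not the code in the reference implementations. It copies the two lists exactly as given above, so `img` is rejected as an element even though `img.src` appears in the attribute list, and it reads `*.colspan` as applying to `td` and `th` only. The function name `find_violations` is invented.

```python
import xml.etree.ElementTree as ET

ALLOWED_ELEMENTS = {
    "p", "br", "div", "h1", "h2", "h3", "h4", "h5", "h6", "a", "span", "b",
    "em", "i", "strong", "small", "big", "tt", "dfn", "q", "var", "abbr",
    "acronym", "cite", "blockquote", "hr", "ul", "ol", "li", "dl", "dt", "dd",
    "pre", "table", "caption", "colgroup", "col", "thead", "tr", "tfoot",
    "th", "td", "code", "samp",
}
ALLOWED_ATTRIBUTES = {
    "*": {"title", "style", "class", "id"},
    "a": {"href", "name"},
    "td": {"colspan"},
    "th": {"colspan"},
    "img": {"src"},
}


def find_violations(xhtml: str) -> list:
    """List every element or attribute name that is not on the whitelist."""
    problems = []
    for el in ET.fromstring(xhtml).iter():
        if not isinstance(el.tag, str):
            continue  # skip comments and processing instructions
        name = el.tag.rsplit("}", 1)[-1]  # drop the {namespace} prefix
        if name not in ALLOWED_ELEMENTS:
            problems.append("element <" + name + ">")
            continue
        allowed = ALLOWED_ATTRIBUTES["*"] | ALLOWED_ATTRIBUTES.get(name, set())
        for attr in el.attrib:
            # Namespaced attributes keep their {ns} prefix and so never match.
            if attr not in allowed:
                problems.append("attribute " + name + "." + attr)
    return problems
```

This checks names only. The values of `style` and `href` can still carry trouble (a `javascript:` URL, for instance), and for untrusted input a hardened XML parser would be a safer choice than the standard library one.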

CDA Use in the PCEHR: Lessons learned

I wrote an article for the latest edition of Pulse IT (page 53) called “CDA Use in the PCEHR: Lessons learned”:

One of the key foundations of the PCEHR is that the CDA (Clinical Document Architecture) is used for all the clinical documents that are part of the PCEHR. This article describes the lessons learned from using CDA for the PCEHR.

Here’s a summary of my lessons learned:

  • When using CDA (or anything else) make the documentation easy to read and navigate, do not assume prior knowledge, and make it as short and flat as possible
  • CDA is both too simple, and too complex. Adoption requires expertise, and policies and tools to leverage that expertise as much as possible
  • The presence of both Narrative and Data means that you can do much better checking of the implementations. However it also means that you need to
  • CDA specifications need to be specific about how the clinical narrative should be presented, as this is the most important part of the document
  • the CDA Narrative/Data pattern allows for interoperability in the presence of poor agreement about the underlying data;  whether this is a good thing depends on your perspective
  • The existence of the narrative/data pattern means that a thorough conformance testing framework is required to ensure quality
  • The implementation community in Australia still has a long way to go before we have mastered the exchange of codes from terminologies
  • Syntax is less important than content. Interoperability can only fully meet business goals when there is substantial business alignment

The conclusion is pretty simple: we’ve got a long way to go yet, at lots of levels. I suspect that some of the issues are going to burn other programs too.

I’m posting this here to serve as the place for discussion about what I wrote. Btw, if you’re making comments, please take note of this disclaimer in the article:

This article is not evaluating the PCEHR program itself, nor even how the PCEHR program used CDA, but just about what was learned about CDA.

btw, I am always happy to contribute to Pulse IT. It’s a wonderful resource for our community here in Australia – Simon’s done a great job.