Design by Constraint – not as useful as people think (#1)

Design by Constraint is a framework pattern used in many specifications across healthcare, including several high-profile standards. It’s a great way to deal with the problem of ensuring semantic consistency in a large information design, but the engineering outcomes are rather less than optimal.

Note: This is the first post in a series (probably 3-4 long) that is really a single document explaining design by constraint.

Welcome to “Design by Constraint”

Here are some examples of “Design by Constraint” in healthcare:

Because the specification, implementation and engineering differ so widely in these different cases, it’s hard to recognise that the common pattern exists – but it does, and the same issues surface one way or another in each.

Note: For all I know, similar patterns exist outside healthcare, but I’ve not particularly investigated. DITA and DocBook are in this space, though. Perhaps I’ll explore that in a later post.

Reference Model

The fundamental notion of design by constraint is easy to summarise:

Define a common information model called the “Reference Model”, that has general applicability, and then describe constraints on its use for particular use-cases.

The notion here is that though the use-cases are quite different, the fact that each is connected to an underlying reference model is a great help to implementation because it imposes a common framework along with semantic consistency.

For this reason, the pattern is sometimes called the “reference model pattern”. But sometimes the reference model isn’t known by those terms (“RIM” in HL7 v3, and BRIDG/CDA/VHIM are not called the “reference model” to my knowledge, but that’s exactly how they are treated), so I use the name “Design by Constraint”.

Advantages of Design by Constraint

Before I go on to describe the method in detail from a UML perspective, and detail the attendant problems it creates, I need to underscore the fact that this pattern is a very elegant and powerful way to solve a particularly difficult problem in information model design: semantic consistency.

If a model is designed by a single person, this might not be a problem (if the person is agile, and anal-retentive – an unusual combination). But as the problem size scales, and time passes, the number of participants rises steeply, and the problem of consistency becomes very difficult to manage.

Design by Constraint solves this problem well (though not perfectly).

How Design by Constraint works

  1. Design a common class model that can be used for everything
  2. Define meta-patterns associated with it, and rigorously enforce their use
  3. Define a language for expressing constraints on the model
  4. Use the language to define constraints for many different use-cases
  5. (optional) Define a transform to convert the constrained use-case model into a new class model

The class model is almost always a UML class diagram. The meta-patterns are usually defined in human language. The language is usually a hybrid language assembled from several different underlying common approaches. Although 5 is labelled as optional, the more time passes, the more likely it is that it will happen. I’ll discuss why this is below.

There’s actually a step #6 that follows logically from the steps above:

  • Implementation chaos ensues

The point of these posts is to describe why this is. For the rest of these posts, I’m going to use a rather simple example: a model that describes the commercial arrangements around a restaurant with the following characteristics:

  • Restaurants are associated with corporations (many to many)
  • Restaurants are associated with people (many to many)
  • People are associated with corporations (many to many)

Here’s a simple UML model for this:

This reference model is intended as a PIM: you create object models in your code, or XML documents and schemas, or database schemas, using your standard approach – whether this is by hand or by some sort of tool assisted approach (of which there are many).

Some quick doco to get us through the rest of the discussion:

  • Restaurants have a name, an address, a number of seats, and a flag for whether they are a franchise operation or not. In addition, they must be associated with one or more people and maybe some corporations. (This model does not track the nature of the association)
  • Corporations have a name, and maybe a taxId. Each corporation must be associated with at least one person, and maybe some restaurants
  • Persons have an id, a name, a code for sex (gender), and a taxId. In addition, a person has a role – a code that defines what kind of person they are
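The description above can be sketched as plain Python dataclasses. This is only an illustration under some assumptions of mine: the address is held as an ordered list of lines, the fields that later constraints fix to null are treated as nullable, and associations are held in one direction only to keep the sketch short. The field names and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    name: str
    role: str                       # code for what kind of person this is
    id: Optional[str] = None        # assumed nullable
    sex: Optional[str] = None       # code for sex (gender), assumed nullable
    tax_id: Optional[str] = None    # "taxId" in the text, assumed nullable


@dataclass
class Corporation:
    name: str
    people: List[Person]            # at least one Person is required
    tax_id: Optional[str] = None    # "maybe a taxId"

    def __post_init__(self) -> None:
        if not self.people:
            raise ValueError("a Corporation must have at least one Person")


@dataclass
class Restaurant:
    name: str
    address: List[str]              # assumed: an ordered list of address lines
    seats: int
    people: List[Person]            # one or more Persons are required
    franchise: Optional[bool] = None
    corporations: List[Corporation] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.people:
            raise ValueError("a Restaurant must have at least one Person")


# Hypothetical sample instance.
owner = Person(name="Ana", role="EMP")
bistro = Restaurant(
    name="Corner Bistro",
    address=["1 High St", "Springfield"],
    seats=40,
    people=[owner],
)
```

Note that the nature of each association is not modelled, just as in the description.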

Note: this simple model is far from adequate for any particular application except demonstrating my point.

Constraining the Reference Model

This general reference model is too general for many use cases. For almost all use-cases, in fact, a particular subset of this model is used. But the subset needs to be described, because as soon as the reference model becomes large enough to be functionally useful, there are many different possible ways to represent a use case, and anyone using the reference model (to make or consume information statements) needs to agree on how it’s done.

Describing the subset is the same as applying a set of constraints to the model. These constraints can be represented using human language constraints, but more structured representations are more amenable to being leveraged by tooling support. Let’s take, as an example, the context of a family business restaurant. Here’s a semi-structured English language statement of constraints:

Business: “FamilyBusinessRestaurant”

  • franchise property fixed to null
  • Address: 3-4 lines, order matters
  • No Corporation
  • 1 Person called “accountant”, 2-6 called “family”

Person: “Accountant”

  • role is “ACC”
  • taxId is renamed “businessId”
  • No Corporation, and sex & id are fixed to null

Person: “Employee”

  • role is “EMP”
  • id & taxId fixed to null
  • No Corporation

It’s worth noting here that the question of what’s a valid constraint differs slightly across the various frameworks. Is it ok to rename fields? Is it ok to fix order? Which data types can be substituted for other data types?
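To make the constraints above concrete, here is a minimal sketch of them as a validation function over an instance. It rests on several assumptions of mine: instances are plain dictionaries keyed by the reference model field names, the “family” slots are filled by Persons matching the “Employee” constraint, and no other people are allowed. The function names are hypothetical. The renaming of taxId to businessId is not checked, because it changes how the field is presented, not which instances are valid.

```python
from typing import Iterable, List


def check_null_fields(person: dict, label: str, names: Iterable[str]) -> List[str]:
    """Check the fields a Person constraint fixes to null, plus 'No Corporation'."""
    errors = []
    for name in names:
        if person.get(name) is not None:
            errors.append(f"{label}: {name} must be fixed to null")
    if person.get("corporations"):
        errors.append(f"{label}: no Corporation allowed")
    return errors


def check_family_business_restaurant(restaurant: dict) -> List[str]:
    """Return a list of constraint violations (empty if the instance conforms)."""
    errors = []
    if restaurant.get("franchise") is not None:
        errors.append("franchise must be fixed to null")
    if not 3 <= len(restaurant.get("address", [])) <= 4:
        errors.append("address must have 3-4 lines")
    if restaurant.get("corporations"):
        errors.append("no Corporation allowed")

    people = restaurant.get("people", [])
    accountants = [p for p in people if p.get("role") == "ACC"]
    employees = [p for p in people if p.get("role") == "EMP"]
    if len(accountants) != 1:
        errors.append("exactly 1 accountant (role ACC) required")
    if not 2 <= len(employees) <= 6:
        errors.append("2-6 family members (role EMP) required")
    # Assumption: the constraint is closed, so no other roles may appear.
    if len(accountants) + len(employees) != len(people):
        errors.append("only ACC and EMP persons allowed")

    for person in accountants:
        errors += check_null_fields(person, "Accountant", ("sex", "id"))
    for person in employees:
        errors += check_null_fields(person, "Employee", ("id", "taxId"))
    return errors


# Hypothetical instance that satisfies the constraints.
valid = {
    "name": "Corner Bistro",
    "address": ["1 High St", "Springfield", "12345"],
    "seats": 40,
    "franchise": None,
    "corporations": [],
    "people": [
        {"name": "Ana", "role": "ACC", "taxId": "B-1"},
        {"name": "Ben", "role": "EMP", "sex": "M"},
        {"name": "Cy", "role": "EMP", "sex": "F"},
    ],
}
print(check_family_business_restaurant(valid))
```

A function like this only tests whether an instance falls inside the constrained subset; it does not produce a new class model.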

Semantics in the Constraints

One important question is whether the constraining models can contain semantics that are only represented in the constraining model, or whether all the semantics need to come from the underlying reference model. A different way to state this issue is to ask, is it necessary to know and understand the constraining model in order to understand the instance of data, or is it just enough to know the reference model?

In principle, the fundamental concept of “design by constraint” is that you don’t need to understand the constraining model. This allows you to leverage your knowledge of the reference model across multiple contexts without having to understand a particular fine-grained use case. An example of this is with CDA – there is only one document, and you can write a single document viewer that is appropriate to work with every single (proper) CDA document that has ever been written, but there is a profusion of CDA implementation guides describing exactly how information should be structured for particular use cases.

So in principle, the constraining models shouldn’t define semantics that aren’t explicit in the instance of reference model data. Specifically, to use that example above, the constraining model can’t simply say “the first person in a restaurant is the accountant”. Instead, it must say, “the first person has a role code of “ACC”, which means accountant, and this is how you know that the person is an accountant”.

The problem with this is that it’s really difficult to ensure that the constraining model doesn’t introduce new semantics that are not in the underlying data. For example, in the example above, the following semantics are not explicit in the data:

  • That a restaurant is a “Family Business Restaurant”. (It’s not clear whether this matters in some or all contexts, but it is not explicit)
  • There’s a subtle interaction in the simple stated constraints between family and employee. There’s an assumption in the constraints that employee = family member, but this might not always be true. Whatever the case, the “family” part is not explicit in the data

(This is more obvious in the data examples that follow)

A general rule is that it’s extremely difficult to ensure that a constraining model doesn’t contain semantics that are not explicit in the data instance. In particular, only thorough human review can determine that the constraining model doesn’t contain implicit semantics.
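A small hypothetical instance illustrates the difference between explicit and implicit semantics. In this sketch (plain dictionaries, made-up names and values), the accountant can be found from the role code alone, whatever position the person occupies in the list, while nothing in the data says that the restaurant is a family business or that the employees are family members.

```python
import json
from typing import List

# Hypothetical instance written under the family business constraints.
instance = {
    "name": "Corner Bistro",
    "address": ["1 High St", "Springfield", "12345"],
    "seats": 40,
    "franchise": None,
    "people": [
        {"name": "Ben", "role": "EMP"},
        {"name": "Ana", "role": "ACC", "taxId": "B-1"},
        {"name": "Cy", "role": "EMP"},
    ],
}


def people_with_role(restaurant: dict, role: str) -> List[dict]:
    """Select people by their explicit role code, not by list position."""
    return [p for p in restaurant["people"] if p["role"] == role]


# Explicit in the data: the role code identifies the accountant.
print([p["name"] for p in people_with_role(instance, "ACC")])  # ['Ana']

# Implicit only: the word "family" appears nowhere in the serialised data.
serialised = json.dumps(instance)
print("family" in serialised.lower())  # False
```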

Note that openEHR varies slightly from the simple pattern here: archetypes are constrained models on the openEHR reference model, but in the very abstract parts of the openEHR reference model (elements), they are allowed and required to define semantics that are only in the archetype. The corollary of this is that you cannot understand the data without knowing the archetype. On the other hand, templates, which are also constraining models, are not allowed to introduce new semantics, so that they can be ignored. They also have the same subtle problems mentioned above.

Next post: Structured representations of constraint models.


  1. Thomas Beale says:

    The 1-5 list is a reasonable explanation, although I am not sure what step 2 is about.

    We should be careful to understand what a ‘subset’ by constraint is in step 4: in openEHR ADL, it is a subset of the possible instance space of the model, not a new constrained version of the model. This is a fundamental point. In HL7v3, step 4 is about producing constrained models (indeed the RIM is often called a model generator), not specifying instance sub-spaces. That’s why RMIMs and HMDs cause the creation of new XSDs.

    Step 6 doesn’t logically follow, at least not in openEHR – but extra effort has to be made to get to implementation. Maybe it is a given in HL7v3…

    With respect to ‘adding semantics’ (or not), instances of the openEHR reference model are comprehensible to computers on their own. They can be displayed, compared and queried. However, designing the queries requires access to the archetypes. The role of archetypes is not to constrain the reference model, it is to define the semantics for particular instance configurations of the reference model. Thus, one ELEMENT may be a systolic pressure, another a diastolic pressure and so on. Similarly, making inferences based on terminology requires access to the terminology, not just the data.

    Having the same reference model for all data greatly simplifies back-end systems, and avoids (largely) the huge costs associated with data migration of millions of EHRs.

  2. Grahame says:

    In both HL7 v3 and openEHR, step #4 is about the same thing: defining subsets of the instance space. HL7 initially focused on step #5 as the logical reason for #4, whereas openEHR did not. But now openEHR has step #5, and the HL7 community is becoming more aware of the gap between #4 and #5. So v3 and openEHR gradually start to look more and more alike in this respect.

    Step #6 is discussed in post #4, where I explain how openEHR walks around the potential chaos (and kudos to you for that, btw). Further comments there please.

    With regard to adding semantics, I went back and modified my comments several times, trying to explain. openEHR differs from v3 in that in v3, all the semantics are explicit in the reference model, and *should* only be constrained, whereas in openEHR archetypes, reference model semantics should only be constrained, but new semantics at different levels can be added. A harder question for openEHR: how do you *not* add semantics in templates? (It’s my impression that it’s wrong to do that.)

  3. Thomas Beale says:

    I always thought that the driving intention of HL7v3 was to generate message schemas, just by a different means to v2? I.e. to generate a model-per-message. That’s what most of the use of v3 messages in the UK and Canada seems to be…

    On semantics: there are clearly ‘semantics’ all the way up the model stack. The RM class Observation (in either openEHR or HL7) expresses some semantic commitments – e.g. it is clearly about things in past time, not the future.

    An archetype for blood gases expresses further commitments, e.g. that some Element instance contains a value for PaO2 (arterial oxygen). Conversely, apps can ‘understand’ the data at the semantic level corresponding to what they were built with. If the app knows only about the reference model, then it can correctly draw a graph of complex values in time of an openEHR Observation of PaO2, PaCO2 etc. If it has access to the Blood gases archetype, it can potentially do something smart with the PaCO2 value, e.g. infer that the patient has impaired respiratory function and needs to be given oxygen (a Computerised guideline would enable such an inference).

    Templates in ADL 1.5 can in fact further refine meaning of an entity. E.g. an Element, defined by at0005|diagnosis| to e.g. at0005.1|initial diagnosis| in some template designed for some specific kind of ward or clinic; they might even add some data nodes specific to the local use. This could be viewed as ‘adding semantics’, but the question is: is it semantics that anyone outside the locality of use of the template cares about?

  4. Grahame says:

    1. yes that was the intention – the focus. But that’s not all
    2. what you say about observation is not true about HL7 – it can be in the future. And I suspect it’s only partially true about openEHR. Can you have an instruction for an observation?
    3. yes. It’s hard to define clearly whether these semantics are safe

  5. Lloyd McKenzie says:

    I don’t think the “intent” of v3 was a schema per message, that’s just been the easiest course to follow for many. Many implementers were only interested in the schemas. In practice with v3 the instance can be driven from the RIM level, the RMIM level, the template level, or anything along the path. Particular ITSs (or standards like CDA) tell you what your wire format is going to look like.

  6. Thomas Beale says:

    Grahame, re: your point 2 above: Observation in openEHR is only in the past. If you want to say that some observation should be done in the future, that is recorded with an Instruction. The observation, when it is performed is recorded as an Observation, and maybe an Action as well, if the details of observing are of use. Both of the latter are in the past (i.e. they are about things that already happened, and their detailed models reflect this). These are quite basic ontological commitments. I remember now that Observation in HL7 could have an intentional mood, which is really one of the confusions with HL7; it is no longer an ‘observation’ it is now an ‘intention to observe’ which is a different thing.

  7. Thomas Beale says:

    By the way, the acronym DBC or DbC normally means ‘design by contract’, as defined by Bertrand Meyer. I like the phrase ‘design by constraint’, but it will probably confuse some people if you use ‘DbC’ to indicate that.

  8. Grahame says:

    DbC: doesn’t appear in the posts, only the comments.
    Instructions that are observations? I assumed instructions would include observations. No? ok
    HL7 Intentional observations: there’s no confusion. You might argue it’s less than ideal, but it’s not confusing, since that’s what mood code does.

  9. John Madden says:

    Wonderful post, Grahame. Can you comment on what you think the relation is, if any, between “design by constraint” and the well-known anti-pattern in object design known as “derivation by restriction”?

    • Grahame Grieve says:

      Thanks John. I don’t know that derivation by restriction is a well known anti-pattern. I couldn’t find any direct hits on Google for it. Got a reference? I’m not really sure what derivation by restriction in object design is. Tom Beale defines FOPP, where you define properties that aren’t very relevant for some instantiations of the class. But he needs a much tighter definition of that, because there’s always properties that are ignored in some usages. I think you’d have to be pretty extreme to insist that all properties have to be relevant to all instances. On the other hand, I think that these notions are closely related. You can’t derive by restricting a property to an illegal value, only to an innocuous value. I think that design by constraint makes a virtue of this design pattern – fixing properties to innocuous values. Whether this is bad depends on how you use it. There is a perspective that says a UML class diagram is just a design by constraint on the metamodel – which suggests to me that design by constraint is neither good nor bad, the problem is what you do with it.

  10. Thomas Beale says:

    For reference, my main post that talks about the problems with derivation by restriction is here:

    This contains links to some other useful resources on the topic.

    Some of the key problems with derivation by restriction (or what I called ‘subtractive’ information modelling) in an OO framework:

    * the classes are very hard to reason about and therefore program, meaning they are error-prone and may cause serious bugs in behaviour and data;

    * many instances in the data may contain numerous Void fields which have to be dealt with in some way, and which unnecessarily complicate databases;

    * it breaks the extensible nature of normal object models, which requires properties to be added going down the inheritance hierarchy, and to the most specific class for which it makes sense;

    * it makes models and software brittle, since applying this principle in the extreme requires all possible properties of all descendant types to be included in the base class. This can never be successfully done, since no one can predict all future subtypes needed in a model.

    The FOPP (Fundamental Ontology Property Principle) probably does need tightening up, but I think it is basically right, and something HL7 needs to take into account.

    See the final problem in the list above for why derivation by restriction can never really work in OO information modelling.
