Gazelle ObjectsChecker

Abstract

Nonconformity of healthcare implementations to medical standards has become a real source of problems and loss of interoperability between systems. Healthcare documents frequently contain inconsistent requirements related to the standards they must conform to. Few standards and methodologies exist to deal with complex requirements, and they are often dedicated to specific kinds of healthcare standards, such as CDA, HL7 and DICOM. The complexity of standards and their constant evolution have made it difficult to implement robust checking methods and tools for healthcare documents.

References

You can refer to these links:

  1. paper of IHIC 2015 / presentation IHIC 2015
  2. paper of IHIC 2016 
  3. paper of IHIC 2017 / presentation of IHIC 2017
  4. paper of HEALTHINF 2014 : https://gazelle.ihe.net/files/HEALTHINF_2014_49_CR_2.pdf
  5. presentation of IHIC 2015 : https://gazelle.ihe.net/files/paper_ihic_presentation_0.pdf
  6. video presentation of IHIC 2015 : https://vimeo.com/119524890
  7. documentation of gazelle EVSClient validation : https://gazelle.ihe.net/content/cda-model-based-validation
  8. Blog in Ringholm : http://www.ringholm.com/column/HL7_CDA_Conformance_testing_tools_analysis.htm
  9. Eric Poiseau presentation in HL7 WGM of Paris, May 2015 : https://vimeo.com/127800129
  10. EVSClient : https://gazelle.ihe.net/EVSClient/
  11. eStandard Webinar, Nov 19, 2015, "Art-Decor and Gazelle Tools" : http://www.estandards-project.eu/index.cfm/tools/
  12. Presentation in Lisbon eHealth Week in December 2015 : https://gazelle.ihe.net/files/20151209-Art-decor_and_Gazelle_tools-Lisbon.pdf, here the context of the presentation.

Other links:

  1. HL7 Europe Newsletter http://www.hl7.eu/download/eun-05-2015.pdf
  2. Software Implementation of CDA http://wiki.hl7.org/index.php?title=Software_Implementation_of_CDA
  3. Requirements related to Validation of HL7 Templates Design for CDA : https://gazelle.ihe.net/files/cdatemplates_requirements_restriction.pdf
  4. Specification for HL7 Templates Converter Module https://gazelle.ihe.net/files/SpecificationsforHL7TemplatesConverterModule.pdf

 

Introduction

KEREVAL Health Lab, on behalf of IHE Europe, has developed a tool to create validators for XML documents related to IHE specifications.
The tool can be used to validate any kind of XML document, especially CDA documents.
Its aim is to transform the assertions of standards into a UML model containing constraints, and then to generate from this model the Java code that validates CDA documents, the documentation, and the unit tests of these constraints.

Gazelle Objects Checker is not a CDA authoring tool. For authoring CDA documents we recommend Art-Decor. Gazelle Objects Checker can be interfaced with authoring tools and can import the definition of a CDA document.

 

Principle


The principle of this project is to extract assertions from CDA specifications and to insert them into a UML model. This model is then processed to generate a validator, unit tests and documentation.

Purpose

Numerous regional and national healthcare initiatives require eHealth applications to be tested for conformance to profiled standards. Meaningful Use (Melissa Markey, 2012) in the USA, Elga (Georg Duftschmid, 2009) in Austria and ASIP (ASIP, 2012a) in France illustrate the desire of healthcare organizations to build an infrastructure for sharing PHR. Most of these initiatives profile the HL7 CDA (HL7, 2005a) standard and/or the IHE XDS (IHE, 2012a) profile specifications. CDA specifies how to structure and code health records, while XDS provides the mechanism for sharing these records. Testing the conformance of healthcare solutions therefore becomes an important issue in achieving interoperability between the different components that contribute to the sharing of such documents.

Since both XDS messages and CDA documents have an XML structure, we have developed a methodology for checking the conformance of XML documents in the context of hybrid healthcare standards. This methodology is based on UML modeling and the MDA approach to describe the specifications. Its main idea is to inject constraints into UML models instead of executing XPath rules against XML documents. The result is a validation report generated automatically from the models describing the healthcare standards. Following a review of existing solutions for validating XML documents in the field of healthcare, we present our methodology and its evaluation, based on its use in real-life projects such as epSOS (Thorp, 2010) (epSOS, 2013).

Objectives

Our goal is to create a method that formally describes all the requirements of an XML healthcare standard. This method should be generic and support inheritance between standards. Its performance shall be at least comparable to that of schematrons, and preferably better. The method shall provide documentation and coverage of the implemented requirements, and the maintainability of the tools and models shall be better than that of schematrons. The method described here also provides generated unit tests for each implemented requirement, which is a major advantage compared with schematrons. The outcome of the proposed methodology is a UML model and a set of methods used to convert it into documentation, validators and test procedures (Hans-Erik Erikson, 2004).

Standards inheritance

As an example, consider two standards that share the same parent standards and extend them with more rules and requirements: the Swiss CDA Laboratory documents (eHealth Suisse, 2013) and the French CDA Laboratory documents (ASIP, 2012b). These two kinds of documents are based on the standards shown in figure 1. The two standards have a common base and differ only at the top of the pyramid of standards. The method described here handles this kind of inheritance and allows shared standards to be described by shared models.

Example

Principle of the method

Principle

The principle of the method that we propose for validating XML documents based on a UML description is the following:

  1. From medical standards such as HL7, DICOM (Hongli Lin, 2010) and IHE, we extract all requirements and insert them into a UML model with a specific structure, described later. The UML model contains constraints written in OCL (Object Constraint Language) (OMG, 2012). The purpose of this language is to describe relationships between elements of the UML model that cannot be expressed by diagrammatic notation alone. Each OCL constraint represents a requirement of a medical standard. OCL is a powerful language that permits many variants of constraints, such as loops, search constraints and conditional constraints. In our experience with more than 50 validators of IHE documents, OCL can express any kind of rule related to the model. The UML model also contains the structure of the XML document, which allows each OCL constraint to be linked to its corresponding XML element.
  2. The OCL constraints are then translated into programming language code such as Java. Many OCL processors are available. The most popular one is DresdenOCL, a library developed and maintained by students and scientists of the Software Technology Group at Dresden University of Technology (Birgit Demuth, 2009) (Birgit Demuth and Zschale, 2004).
  3. The UML models and the processed OCL constraints are then used by a UML model-to-text (M2T) generator to generate a specific validator based on the UML contents (OMG, 2008). Many projects present themselves as UML model-to-text tools; the most popular one is Acceleo (OMG, 2008). Unlike the EMF generator, Acceleo is not a classic generator of code from UML models: to generate code, you have to provide M2T templates that describe the text to be generated from the model. The generated text can be of any kind: HTML, Java or C++, for example. The idea is to generate Java code that transforms the XML document to be validated into Java instances, and then to validate these Java instances using the code generated by the OCL processor. The M2T method also allows generating documentation of the UML model and unit tests for the constraints written in OCL. Each module is described by its M2T templates.

Healthcare requirements

Requirements on healthcare documents are generally written in natural language, not a formal one. A major maintainability problem with schematrons is that once a schematron is written, we do not know which requirements it covers and which it does not: we have no information about the coverage of the schematron rules with respect to the requirements of the healthcare specifications. Many tools, such as TestLink, offer the possibility to list requirements. We chose an OASIS standard, taml (OASIS, 2011), which defines a common structure for expressing requirements. We used this standard and restricted its structure to better fit requirements on healthcare documents. The standard allows specifying a list of predicates and, for each predicate, a list of tags that describe it. We restricted the tags to 'section' and 'page', which give the section and the page of the document that state the requirement. Each predicate is identified by a unique identifier (/taml:testAssertion/@id), and each list of requirements is identified by a unique identifier: /testAssertionSet/common/normativeSource/target/@idscheme.

We restricted the OASIS taml standard by requiring that the common element of the taml:testAssertionSet be present, and that it contain information about the healthcare document being processed, such as the document name, version, source name, and a URI to the original document. These properties are included in the element taml:common/taml:refSourceItem. We forced the taml descriptor to have a unique identifier by adding the element taml:common/taml:target. The following is an example of a taml test assertion taken from the taml document of the CDA PADV specification:

<testAssertion id="CONF-5">
    <predicate>The Pharmaceutical Advice section SHALL contain code.</predicate>
    <prescription level="mandatory"/>
    <tag tname="Section">6.3</tag>
    <tag tname="Page">19</tag>
</testAssertion>
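
As an illustration only, the fields of such an assertion could be read with the DOM parser that ships with the JDK. The class name TamlAssertionReader and its Assertion holder are hypothetical and are not part of Gazelle ObjectsChecker; the sketch also assumes a non-namespaced element, exactly as printed in the example above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of Gazelle ObjectsChecker.
class TamlAssertionReader {

    /** Plain holder for the fields of one testAssertion element. */
    static final class Assertion {
        final String id;
        final String predicate;
        final String level;
        final Map<String, String> tags;

        Assertion(String id, String predicate, String level, Map<String, String> tags) {
            this.id = id;
            this.predicate = predicate;
            this.level = level;
            this.tags = tags;
        }
    }

    /** Parses a single, non-namespaced testAssertion element. */
    static Assertion read(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            // The predicate text may be wrapped over several lines: collapse whitespace.
            String predicate = child(root, "predicate").getTextContent().trim().replaceAll("\\s+", " ");
            String level = child(root, "prescription").getAttribute("level");
            // Tags are kept in document order, keyed by their tname attribute.
            Map<String, String> tags = new LinkedHashMap<>();
            NodeList tagNodes = root.getElementsByTagName("tag");
            for (int i = 0; i < tagNodes.getLength(); i++) {
                Element tag = (Element) tagNodes.item(i);
                tags.put(tag.getAttribute("tname"), tag.getTextContent().trim());
            }
            return new Assertion(root.getAttribute("id"), predicate, level, tags);
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable testAssertion element", e);
        }
    }

    private static Element child(Element parent, String name) {
        return (Element) parent.getElementsByTagName(name).item(0);
    }
}
```

A real taml document declares namespaces and wraps its assertions in a testAssertionSet, so a production reader would need a namespace-aware parser.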

Each requirement is identified by a unique pair (target@idscheme, testAssertion@id). We therefore defined a stereotype, applied to UML constraint elements, that describes the relationship between constraints and requirements. It has two attributes: IDs and targetIDScheme, where IDs is the list of assertion identifiers in the taml document and targetIDScheme is the identifier of the taml document.

Relationship between rules and requirements

This correlation between constraints and requirements allows calculating the coverage of the validator according to the list of requirements, and then to identify requirements that are not implemented in the UML models.
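
The coverage computation described above can be sketched as a set difference. This is a minimal sketch under stated assumptions: the class RequirementCoverage and its ConstraintLink holder are hypothetical names that only mirror the two stereotype attributes (targetIDScheme and IDs), and the way the actual tool computes coverage is not described in this document.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names are illustrative, not taken from the tool.
class RequirementCoverage {

    /** Mirrors the stereotype attributes attached to one UML constraint. */
    static final class ConstraintLink {
        final String constraintName;
        final String targetIdScheme;
        final List<String> ids;

        ConstraintLink(String constraintName, String targetIdScheme, String... ids) {
            this.constraintName = constraintName;
            this.targetIdScheme = targetIdScheme;
            this.ids = Arrays.asList(ids);
        }
    }

    /** Assertion ids of one taml document that no constraint refers to. */
    static Set<String> uncovered(String targetIdScheme,
                                 Collection<String> assertionIds,
                                 Collection<ConstraintLink> links) {
        Set<String> remaining = new TreeSet<>(assertionIds);
        for (ConstraintLink link : links) {
            // Only links pointing at the same taml document count.
            if (link.targetIdScheme.equals(targetIdScheme)) {
                remaining.removeAll(link.ids);
            }
        }
        return remaining;
    }

    /** Fraction of assertions covered by at least one constraint (1.0 if there are none). */
    static double coverage(String targetIdScheme,
                           Collection<String> assertionIds,
                           Collection<ConstraintLink> links) {
        int total = new HashSet<>(assertionIds).size();
        if (total == 0) {
            return 1.0;
        }
        int missing = uncovered(targetIdScheme, assertionIds, links).size();
        return (total - missing) / (double) total;
    }
}
```

Matching on the pair (targetIDScheme, id) rather than on the id alone keeps assertions from different specifications apart even when they reuse identifiers such as CONF-5.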

UML models’ specification

Constraints’ specification

Each requirement extracted from the specifications of the XML document is translated into a UML constraint that contains an OpaqueExpression element with the attributes:

  • language: 'OCL'
  • body: the OCL constraint

The OCL constraint shall always evaluate to true when applied to a conforming UML instance specification. In healthcare standards, especially HL7 ones, there are three kinds of rules: requirements, warnings and notes, which are specified by the keywords SHALL, SHOULD and MAY (Bradner, 1997). We specified a stereotype, named ConstraintType, that is applied to UML constraint elements and indicates whether the constraint is an error, a warning or a note.
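
The mapping from keyword to constraint type can be pictured with a small Java sketch. It assumes, for illustration, that a constraint body is a Java predicate standing in for the processed OCL expression; the names ConstraintTypeSketch, Severity and Constraint are hypothetical and do not come from the generated code.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of the ConstraintType idea; not the generated code.
class ConstraintTypeSketch {

    enum Severity { ERROR, WARNING, NOTE }

    /** Maps the three keywords named in the text to a severity. */
    static Severity fromKeyword(String keyword) {
        switch (keyword.trim().toUpperCase(Locale.ROOT)) {
            case "SHALL":
                return Severity.ERROR;
            case "SHOULD":
                return Severity.WARNING;
            case "MAY":
                return Severity.NOTE;
            default:
                throw new IllegalArgumentException("Unsupported keyword: " + keyword);
        }
    }

    /** A named constraint whose body must evaluate to true on a conforming instance. */
    static final class Constraint<T> {
        final String name;
        final Severity severity;
        final Predicate<T> body;

        Constraint(String name, Severity severity, Predicate<T> body) {
            this.name = name;
            this.severity = severity;
            this.body = body;
        }

        /** Returns a report line when the body is false, and nothing when it holds. */
        Optional<String> check(T instance) {
            return body.test(instance)
                    ? Optional.empty()
                    : Optional.of(severity + ": " + name);
        }
    }
}
```

For example, a constraint built with fromKeyword("SHOULD") reports a WARNING line when its body is false, while the same body under SHALL would report an ERROR.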

 Stereotypes related to Constraint element

We also defined a stereotype to document a constraint when it is related to a valueSet: a dynamic list of values that can be provided by a CTS or an SVS provider (IHE, 2010) (HL7, 2005b) (Heymans S, 2011). Each created constraint element shall be related to a UML class element.

Classes’ specification

There are two kinds of UML classes in this methodology: classes used to describe the content of the XML document, and classes used to apply a list of rules on a kind of XML element.

  1. Classes used to describe the content of the XML document: These classes, called StructureClass (SC) in our specification, contain attributes with the same structure as described in the schema of the XML document. The profile used to describe the relation between classes and schema elements is the 'Ecore' profile (Dave Steinberg, 2008). From an XSD we can generate the UML model describing the content of the XML document. The principal stereotypes used from this profile to describe the XML structure are EPackage, EClass, EEnum, EAttribute and EReference. The generation of the code bound to XML is based on these stereotypes. Basic constraints can be included in this kind of class: if we are sure that a rule applies to any restriction of the standard and is not tied to a specific context, we can add it directly to the class that describes the element.
  2. Classes used to apply rules on a kind of XML element: In a specialized specification of a standard, or in an affinity domain that restricts the original standard, such as the epSOS CDA standard, the rules applied are not absolute, so we cannot attach them directly to the class that describes the element. We therefore defined the notion of a 'package of constraints'. Each package of constraints contains a list of classes of constraints. Each class of constraints contains a list of constraints, has a generalization to the parent UML StructureClass or to another class of constraints, and has a stereotype that defines its kind. We defined three stereotypes that describe the kind of a class of constraints:
      • TemplateSpec (TS): described by a pair (path, id). The list of constraints of this class is applied only when the value of the path in the XML instance equals the id.
      • ConstraintsSpec (CS): the list of constraints is applied automatically to any instance of the element described by the parent class.
      • AdvancedTemplate (AT): defined by an OCL rule. The list of constraints of this kind of class is applied only when the specified rule holds on an instance of the parent class.
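
The difference between the three stereotypes lies in when their constraints apply, which can be sketched as three applicability predicates. This is an illustrative sketch only: ConstraintClassKinds and Section are hypothetical names, the path is modelled as a Java function returning the values found on the instance (an assumption made because an element may carry several template identifiers), and the OCL rule of an AdvancedTemplate is replaced by a Java predicate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch; the real stereotypes live in the UML model, not in Java.
class ConstraintClassKinds {

    /** Stand-in for a generated StructureClass instance (illustrative fields). */
    static final class Section {
        final String code;
        final List<String> templateIds;

        Section(String code, String... templateIds) {
            this.code = code;
            this.templateIds = Arrays.asList(templateIds);
        }
    }

    /** ConstraintsSpec (CS): applies to every instance of the parent class. */
    static <T> Predicate<T> constraintsSpec() {
        return element -> true;
    }

    /** TemplateSpec (TS): applies when the values found at the path include the id. */
    static <T> Predicate<T> templateSpec(Function<T, List<String>> path, String id) {
        return element -> path.apply(element).contains(id);
    }

    /** AdvancedTemplate (AT): applies when an arbitrary rule holds on the instance. */
    static <T> Predicate<T> advancedTemplate(Predicate<T> rule) {
        return rule;
    }
}
```

A validator would evaluate the applicability predicate first and run the constraints of the class only when it returns true, so a non-matching template produces no report at all for that class.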

Stereotypes of classes of constraints

The UML classes that are described by these stereotypes are created manually in the UML model, in order to provide restrictions on the StructureClass (SC) classes, using UML constraints that describe requirements of the specifications. The StructureClass (SC) classes, however, are created automatically from the XSD schema that describes the structure of the XML documents to be validated (R. Bhuvaneswari, 2012). This schema is generally provided by the standard specification.

Mixing these kinds of classes of constraints through generalization from one kind to another can lead to illogical situations. We therefore defined rules of inheritance between TemplateSpec (TS), ConstraintsSpec (CS), AdvancedTemplate (AT) and StructureClass (SC).

When a class of constraints inherits from another class of constraints, the parent class rules are added to the child class rules, and the two lists of rules are executed only if both can be executed at the same time. If we allowed a TemplateSpec to inherit from another TemplateSpec, the rules of the child would be executed only if the element tested in the XML document satisfied the paths of both the parent and the child classes. If either path were not satisfied, the rules would not be executed, and no error would be reported to the author of the XML document to show that the XML element is missing one of the paths. For this reason, the table below specifies the allowed inheritance relationships between classes of constraints.

We also defined a stereotype to document classes of constraints, named DocumentationSpec. It allows specifying information about the standard which is the origin of constraints, and this provides a better documentation of the class of constraints when generating the documentation of the model of constraints.

M2T generation’s specification

XML binding

As explained in the principle of this method, we need to generate code that transforms XML elements into object instances. In the implementation of this methodology, we chose JAXB as the API to bind XML to objects (McLaughlin, 2002). Oracle provides a tool named xjc (McLaughlin, 2001), which generates, from an XML schema, Java classes containing a full description of the XSD elements based on JAXB annotations. In our implementation, the same generation of code bound to XSD elements is done by an M2T template. Creating our own generator of Java code gives us the possibility to add further methods and attributes that are not generated by xjc. For example, we have introduced the ability to validate XPath constraints as a validation method in the generated Java code.
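
How the generated classes evaluate XPath constraints is not detailed here. As a rough idea of what such a check involves, the following sketch evaluates a boolean XPath expression with the standard javax.xml.xpath API of the JDK. The class XPathCheckSketch and its method holds are hypothetical and do not reproduce the generated code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch; the generated code may work quite differently.
class XPathCheckSketch {

    /** Returns true when the boolean XPath expression holds on the given XML text. */
    static boolean holds(String xml, String booleanXPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // XPathConstants.BOOLEAN makes evaluate() return a java.lang.Boolean.
            return (Boolean) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(booleanXPath, doc, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot evaluate XPath: " + booleanXPath, e);
        }
    }
}
```

The sketch ignores XML namespaces; CDA documents are namespaced, so a real check would need a namespace-aware parser and a NamespaceContext on the XPath object.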

Validation code

The second kind of output from the UML models is the validation classes. This generated code combines the code produced by the OCL processor from the OCL constraints with the code generated from an M2T template that links the rules together. The result of executing the generated code is a list of verifications, which contains errors, warnings, notes and reports about the processing of rules. For each package of classes of constraints, we generate a validation class. This technique allows the generated code to be reused and simplifies the nesting of standards. For example, in the laboratory domain described in figure 1, we define a package of constraints for each block in the two pyramids. So for each of HL7 CDA, IHE PCC and XD-LAB, we define a single validation package. Then we define specific packages for CDA-FR, CDA-CH, LAB-FR and LAB-CH. The validator is the combination of the common packages and the specific packages. The M2T templates generate code based on the visitor pattern: for each element of the object structure generated to describe the XML elements, we pass an instance of the validation package to a method generated from the template for structure classes.

This method verifies the rules of the package on the current instances, and calls the same method on its attributes with the same package’s instance.
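
The recursive traversal described above can be sketched in a few lines of Java. All names here (VisitorValidationSketch, ConstraintPackage, Element, validateWith) are hypothetical; the sketch only assumes that each structure object receives the package of validation, lets it check the current instance, and then forwards the same package instance to its children.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the traversal; not the code produced by the M2T templates.
class VisitorValidationSketch {

    /** One class of validation per package of constraints. */
    interface ConstraintPackage {
        void validate(Element element, String location, List<String> notifications);
    }

    /** Stand-in for a generated structure class bound to an XML element. */
    static final class Element {
        final String name;
        final List<Element> children = new ArrayList<>();

        Element(String name, Element... children) {
            this.name = name;
            for (Element child : children) {
                this.children.add(child);
            }
        }

        /** Checks this instance, then calls the same method on its children. */
        void validateWith(ConstraintPackage pkg, String parentLocation, List<String> notifications) {
            String location = parentLocation + "/" + name;
            pkg.validate(this, location, notifications);
            for (Element child : children) {
                child.validateWith(pkg, location, notifications);
            }
        }
    }
}
```

Combining common and specific packages then amounts to running the traversal once per package, or passing a composite package that delegates to each of them, and collecting all notifications in the same list.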

State principle of the visitor pattern applied on XML elements validation

Unit testing generation

The generation of unit tests is managed by an M2T template to facilitate the unit testing process. This feature does not exist in the schematron process. For each constraint, the generated code provides two tests: one OK (the constraint is expected to pass) and one KO (the constraint is expected to fail). The principle of each test is to validate a whole document and to verify that the result of the constraint is what it is supposed to be. The XML documents to be verified cannot be produced automatically; it is the role of the tester to provide them.
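
The OK/KO principle can be sketched without any test framework. In this hedged sketch the validator is modelled as a function from a document to the names of the constraints that failed; this signature, and the names ConstraintUnitTestSketch, checkOk and checkKo, are assumptions for illustration, not the shape of the generated tests.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of an OK/KO test pair for one constraint.
class ConstraintUnitTestSketch {

    /** True when validating the whole document reports the constraint as failed. */
    static boolean fails(Function<String, List<String>> validator, String document, String constraint) {
        return validator.apply(document).contains(constraint);
    }

    /** OK test: the tester-provided document must not trigger the constraint. */
    static void checkOk(Function<String, List<String>> validator, String document, String constraint) {
        if (fails(validator, document, constraint)) {
            throw new AssertionError("OK document unexpectedly fails " + constraint);
        }
    }

    /** KO test: the tester-provided document must trigger the constraint. */
    static void checkKo(Function<String, List<String>> validator, String document, String constraint) {
        if (!fails(validator, document, constraint)) {
            throw new AssertionError("KO document unexpectedly passes " + constraint);
        }
    }
}
```

Because each test validates a whole document, other constraints may fail on the same sample; the pair only looks at the constraint under test, which keeps the OK and KO samples small and focused.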

Templates coverage

Because we define a structure of templates and advanced templates, the specified model of constraints provides an overview of the complexity of the XML document. This feature is very useful in the analysis of CDA documents and XDS metadata, as the specifications of these kinds of documents are based on a template structure. For example, in CDA documents, sections and entries have an attribute named templateId, which is a unique identifier of the kind of element and is referenced by the TemplateSpec stereotype (HL7, 2005a).

Documentation generation

The stereotypes defined to document constraints and classes of constraints are used for the generation of the documentation. The generation is performed using an M2T template, with HTML output. Two kinds of documentation can be generated:

  • the documentation of the structure of the XML document: this documentation is a description of the elements of the XML document. For each element we can document the cardinality, the type, the name, and the parent.
  • the documentation of classes of constraints: this documentation contains the relationship between constraints and classes, documentation of the kind of constraints and the kind of classes of constraints, and finally a documentation of the link between constraints and taml assertions.

Conclusions

We designed Gazelle ObjectsChecker, a methodology for validating XML documents against healthcare standards, based on a model-based architecture. This methodology has made it possible to overcome the weaknesses of schematron technology, especially problems of maintainability, reusability, unit testing, documentation and requirements coverage. It has also simplified the validation of pyramidal standards. The methodology was implemented using open source tools: Topcased as editor, DresdenOCL as OCL processor, and Acceleo as M2T generator. The generated output code is Java. The result of this implementation has covered the needs of developers and users of the generated validators. The use of this methodology to create validators for multiple kinds of healthcare standards in many domains, such as epSOS and IHE, has shown its effectiveness: any kind of constraint can be expressed. Another important result is that the model-based validators are faster than schematrons. Several improvements could be made to this methodology and its implementation, such as a dedicated editor for the UML model to simplify the creation and management of classes and constraints. Also, the concept of stereotypes to describe classes and constraints can evolve into a meta-model that describes this set of stereotypes, and the use of GMF can improve usability and minimize the risk of inconsistencies in UML models. Moreover, the model-based validation of XML-based healthcare standards could be adapted to other domains that use XML technology.