Rule-based XML validation

The advanced validation workflow component Rule-based XML validation validates documents in XML format against a set of rules that you define.

 

A suggestion for implementation can be found here.

 

The component has a single parameter:

 

[Image: NG2WorkflowRuleBasedXMLValidation0001]

 

The validation rules must be set up in a Schematron file (.sch) in the validation section of the Library.

 

When you insert this component, the matching conditions are inserted as well, as shown below:

 

[Image: NG2Schematron0002]

 

An example of a rule-based validation can be found here.

 

You can then insert workflow components in the valid and invalid subtrees to set up what should happen depending on the outcome of the validation. You can report the result of the validation with any of these workflow components:

 

1. Rule-based validation report to log.

2. Rule-based validation report to text attachment.

3. Rule-based validation report to XML attachment.

 

 

Validation rules

The rules must be described in the standard ISO Schematron format.

For the Schematron files to be accessible in InterFormNG2 workflows, they must be uploaded to the library under "Validation rules" and have the file extension .sch.

 

The basic structure of a Schematron file is this:

 

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

<schema xmlns="http://purl.oclc.org/dsdl/schematron">

 <pattern>

   <rule context="XPATH">

     <assert test="XPATH">ERROR_DESCRIPTION</assert>

   </rule>

 </pattern>

</schema>

 

You can add as many patterns as you like. The purpose of a pattern is simply to group a set of rules. Each pattern can have multiple rules and each rule can have multiple asserts.
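
As a sketch of that nesting, the skeleton below uses two patterns. The first holds two rules, and one of those rules has two asserts. The names XPATH_A, XPATH_1 and so on are placeholders in the same spirit as XPATH and ERROR_DESCRIPTION above, not real expressions.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <!-- First pattern: two rules, the first rule with two asserts -->
  <pattern>
    <rule context="XPATH_A">
      <assert test="XPATH_1">ERROR_DESCRIPTION_1</assert>
      <assert test="XPATH_2">ERROR_DESCRIPTION_2</assert>
    </rule>
    <rule context="XPATH_B">
      <assert test="XPATH_3">ERROR_DESCRIPTION_3</assert>
    </rule>
  </pattern>
  <!-- Second pattern: evaluated independently of the first -->
  <pattern>
    <rule context="XPATH_C">
      <assert test="XPATH_4">ERROR_DESCRIPTION_4</assert>
    </rule>
  </pattern>
</schema>
```

In ISO Schematron, a node is checked only by the first rule in a pattern whose context matches it. If two rules could match the same node and both should apply, place them in separate patterns.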

 

Let us try to validate this XML document:

 

<?xml version="1.0" encoding="UTF-8"?>

<Persons>

 <Person Title="Mr">

   <Name>John Doe</Name>

   <Gender>Male</Gender>

   <CustomerId>1000</CustomerId>

   <Email>jd@example.com</Email>

 </Person>

 <Person Title="Mr">

   <Name>Michael Smith</Name>

   <Gender>Male</Gender>

   <CustomerId>1001</CustomerId>

   <Initials>JS</Initials>

   <Email>js@example.com</Email>

 </Person>

 <Person Title="Mrs">

   <Name>Jane Doe</Name>

   <Gender>Female</Gender>

   <CustomerId>1099</CustomerId>

   <Initials>JD</Initials>

 </Person>

</Persons>

 

When creating a rule, the context attribute must be an XPath expression that identifies the node-set that we want to validate.

To validate every person element, the context must be "Persons/Person", like this:

 

<rule context="Persons/Person">

</rule>

 

Now we can add the rules for Person elements. In each assert, the test attribute must be an XPath expression that must be true when the person element is valid. The XPath expressions can use any XPath v2.0 functions that are valid in XSL. The text for the assert is the error message that should be displayed in the report, if the XPath expression evaluates to false.
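
As a minimal illustration, the rule below adds one assert to the Person context. The check for a Name element is only an example chosen to fit the sample document; it is not part of the complete ruleset shown later.

```xml
<rule context="Persons/Person">
  <!-- The test is evaluated relative to each Person element.
       It is true when the Person has at least one Name child. -->
  <assert test="Name">The element Person must have a Name element</assert>
</rule>
```

If a Person element has no Name child, the test evaluates to false and the text of the assert is written to the report.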

Common rule patterns

 

   Validate existence of an attribute. To check if an attribute exists, do:

   test="@ATTRIBUTE-NAME"

   Validate existence of an element. To check if an element exists, do:

   test="ELEMENT-NAME"

   To validate the number of characters in a string, use the XPath function "string-length" in a logical expression.

   Since < is a reserved character in XML, "less than" (<) can be written as "lt" and "less than or equal" (<=) can be written as "le".

   For instance, to check that the text in an element is 10 characters or fewer:

   test="string-length(ELEMENT-NAME) le 10"

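   To validate that the contents of an element are numeric, use the XPath function "number":

   test="number(ELEMENT-NAME)"

   Note that this test also fails when the value is 0, because the number 0 evaluates to false in the same way as the NaN result for non-numeric text.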

   To validate the boundaries of a numeric value, use a logic expression like this:

   test="ELEMENT-NAME >= 1000"
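
The sketch below applies two of these patterns to the Persons sample and shows the variations that are easy to get wrong. The 50-character limit on Name is an assumed value used only for illustration. The second numeric test is an alternative to the plain number() check: number() yields NaN for non-numeric text, and comparing its string form with 'NaN' also accepts the value 0.

```xml
<rule context="Persons/Person">
  <!-- Two equivalent length checks: "le" needs no escaping,
       while the operator form must escape < as &lt; -->
  <assert test="string-length(Name) le 50">Name must be no more than 50 characters</assert>
  <assert test="string-length(Name) &lt;= 50">Name must be no more than 50 characters</assert>

  <!-- Numeric check that also accepts 0: string(NaN) is 'NaN' -->
  <assert test="string(number(CustomerId)) != 'NaN'">The element CustomerId must be numeric</assert>
</rule>
```

In practice you would keep only one of the two length asserts. Both are shown here to compare the notations.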

 

Example

 

Below is an example of a complete ruleset for the Persons XML example above:

 

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

<schema xmlns="http://purl.oclc.org/dsdl/schematron">

 <pattern>

   <rule context="Persons/Person">

     <assert test="@Title">The element Person must have a Title attribute</assert>

     <assert test="number(CustomerId)">The element CustomerId must be numeric</assert>

     <assert test="number(CustomerId) ge 1000">The element CustomerId must be at least 1000</assert>

     <assert test="@Title='Mr' or @Title='Mrs' or @Title='Ms'">Title must be Mr, Mrs or Ms</assert>

     <assert test="string-length(Initials) le 3">Initials must be no more than 3 characters</assert>

     <assert test="not(Email) or contains(Email,'@')">Email address must contain a @</assert>

   </rule>

   <rule context="Persons">

     <assert test="count(Person) > 0">The document must contain at least one person</assert>

   </rule>

 </pattern>

</schema>

 

Tool support

If you need tool support to help with the authoring of Schematron files, some editors are available, for instance OxygenXML: https://www.oxygenxml.com/xml_editor.html

 

Error report

When a Schematron validation is executed, a report in XML format is generated. The report lists every failed assert together with the error message defined for that assert.

 

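This is an example of a report from a failed validation of a Persons XML file. The sample document shown earlier passes every rule, so this report stems from a variant in which the second and third Person elements contain invalid data.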

 

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl"

                       xmlns:iso="http://purl.oclc.org/dsdl/schematron"

                       xmlns:schold="http://www.ascc.net/xml/schematron"

                       xmlns:xhtml="http://www.w3.org/1999/xhtml"

                       xmlns:xs="http://www.w3.org/2001/XMLSchema"

                       schemaVersion=""

                       title="">

 <svrl:active-pattern document=""/>

 <svrl:fired-rule context="Persons"/>

 <svrl:fired-rule context="Persons/Person"/>

 <svrl:fired-rule context="Persons/Person"/>

 <svrl:failed-assert test="number(CustomerId)" location="/Persons/Person[2]">

   <svrl:text>The element CustomerId must be numeric</svrl:text>

 </svrl:failed-assert>

 <svrl:failed-assert test="number(CustomerId) ge 1000" location="/Persons/Person[2]">

   <svrl:text>The element CustomerId must be at least 1000</svrl:text>

 </svrl:failed-assert>

 <svrl:failed-assert test="@Title='Mr' or @Title='Mrs' or @Title='Ms'"

                     location="/Persons/Person[2]">

   <svrl:text>Title must be Mr, Mrs or Ms</svrl:text>

 </svrl:failed-assert>

 <svrl:failed-assert test="not(Email) or contains(Email,'@')" location="/Persons/Person[2]">

   <svrl:text>Email address must contain a @</svrl:text>

 </svrl:failed-assert>

 <svrl:fired-rule context="Persons/Person"/>

 <svrl:failed-assert test="@Title" location="/Persons/Person[3]">

   <svrl:text>The element Person must have a Title attribute</svrl:text>

 </svrl:failed-assert>

 <svrl:failed-assert test="number(CustomerId) ge 1000" location="/Persons/Person[3]">

   <svrl:text>The element CustomerId must be at least 1000</svrl:text>

 </svrl:failed-assert>

 <svrl:failed-assert test="@Title='Mr' or @Title='Mrs' or @Title='Ms'"

                     location="/Persons/Person[3]">

   <svrl:text>Title must be Mr, Mrs or Ms</svrl:text>

 </svrl:failed-assert>

 <svrl:failed-assert test="string-length(Initials) le 3" location="/Persons/Person[3]">

   <svrl:text>Initials must be no more than 3 characters</svrl:text>

 </svrl:failed-assert>

</svrl:schematron-output>
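
If you want to post-process such a report yourself, the failed asserts can be extracted with a small stylesheet. The following is only a sketch: it assumes an XSLT 1.0 processor that you run outside the workflow, and it is not a feature of InterFormNG2. It relies only on the svrl namespace, the failed-assert elements and the location attribute that appear in the report above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <!-- Produce plain text: one line per failed assert -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//svrl:failed-assert">
      <!-- Where the error occurred, e.g. /Persons/Person[2] -->
      <xsl:value-of select="@location"/>
      <xsl:text>: </xsl:text>
      <!-- The error message defined in the Schematron file -->
      <xsl:value-of select="normalize-space(svrl:text)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the report above, the first output line would be "/Persons/Person[2]: The element CustomerId must be numeric".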
