Schematron validation with lxml in Python: how to retrieve validation errors?

Asked: 2014-11-26 13:14:35

Tags: python validation lxml schematron

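I'm trying to do some Schematron validation using lxml. For the application I'm working on, it's important that any tests that fail validation are reported back. The lxml documentation mentions the existence of a validation_report property object. I think this should contain the information I'm looking for, but I can't figure out how to work with it. Here's some example code that demonstrates my problem (adapted from http://lxml.de/validation.html#id2; tested with Python 2.7.4):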

import StringIO
from lxml import isoschematron
from lxml import etree

def main():

    # Schema
    f = StringIO.StringIO('''\
    <schema xmlns="http://purl.oclc.org/dsdl/schematron" >
    <pattern id="sum_equals_100_percent">
    <title>Sum equals 100%.</title>
    <rule context="Total">
    <assert test="sum(//Percent)=100">Sum is not 100%.</assert>
    </rule>
    </pattern>
    </schema>
    ''')

    # Parse schema
    sct_doc = etree.parse(f)
    schematron = isoschematron.Schematron(sct_doc, store_report = True)

    # XML to validate - validation will fail because sum of numbers
    # not equal to 100 
    notValid = StringIO.StringIO('''\
        <Total>
        <Percent>30</Percent>
        <Percent>30</Percent>
        <Percent>50</Percent>
        </Total>
        ''')
    # Parse xml
    doc = etree.parse(notValid)

    # Validate against schema
    validationResult = schematron.validate(doc)

    # Validation report (assuming here this is where reason 
    # for validation failure is stored, but perhaps I'm wrong?)
    report = isoschematron.Schematron.validation_report

    print("is valid: " + str(validationResult))
    print(dir(report.__doc__))

main()

Now, from the value of validationResult I can see that the validation failed (as expected), so next I want to know why. The second print statement gives me:

['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__
format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__get
slice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mo
d__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook
__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center',
 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index
', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper',
'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', '
rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', '
strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

This is as far as I got, based on the documentation and this related question. Most likely I'm overlooking something very obvious?

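1 Answer:

Answer 0 (score: 7)

OK, so someone on Twitter gave me a suggestion that made me realise I had got the reference to the Schematron class all wrong: I was reading validation_report from the class instead of from my schematron instance. Since there don't seem to be any clear examples, I'll share my working solution below: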

import StringIO
from lxml import isoschematron
from lxml import etree

def main():
    # Example adapted from http://lxml.de/validation.html#id2

    # Schema
    f = StringIO.StringIO('''\
    <schema xmlns="http://purl.oclc.org/dsdl/schematron" >
    <pattern id="sum_equals_100_percent">
    <title>Sum equals 100%.</title>
    <rule context="Total">
    <assert test="sum(//Percent)=100">Sum is not 100%.</assert>
    </rule>
    </pattern>
    </schema>
    ''')

    # Parse schema
    sct_doc = etree.parse(f)
    schematron = isoschematron.Schematron(sct_doc, store_report = True)

    # XML to validate - validation will fail because sum of numbers 
    # not equal to 100 
    notValid = StringIO.StringIO('''\
        <Total>
        <Percent>30</Percent>
        <Percent>30</Percent>
        <Percent>50</Percent>
        </Total>
        ''')
    # Parse xml
    doc = etree.parse(notValid)

    # Validate against schema
    validationResult = schematron.validate(doc)

    # Validation report 
    report = schematron.validation_report

    print("is valid: " + str(validationResult))
    print(type(report))
    print(report)

main()
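
As for why the original attempt printed a list of string methods: reading validation_report from the class, rather than from the instance, presumably returns the property object itself, and its __doc__ is just a docstring, so dir() lists the methods of a string. Below is a minimal sketch of the same effect. Validator and report are hypothetical stand-in names for illustration and are not part of lxml.

```python
class Validator(object):
    """Hypothetical stand-in for a validator class; not part of lxml."""

    def __init__(self):
        self._report = None

    def validate(self, value):
        # Store a per-instance report, loosely mimicking store_report=True.
        self._report = "checked %r" % (value,)
        return False

    @property
    def report(self):
        """Report of the last validation run."""
        return self._report


v = Validator()
v.validate(42)

class_level = Validator.report   # the property object itself, no instance data
instance_level = v.report        # the value stored on this particular instance

print(type(class_level))             # a property object
print(type(class_level.__doc__))     # the docstring, i.e. a plain string
print(instance_level)                # the report stored by validate()
```

The same distinction applies to the code above: the report only exists on the schematron object that actually ran validate().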

With the corrected code, the print statement on the report now produces the following output:

<?xml version="1.0" standalone="yes"?>
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:schold="http://www.ascc.net/xml/schematron" xmlns:sch="http://www.ascc.net/xml/schematron" xmlns:iso="http://purl.oclc.org/dsdl/schematron" title="" schemaVersion="">
  <!--   
           
           
         -->
  <svrl:active-pattern id="sum_equals_100_percent" name="Sum equals 100%."/>
  <svrl:fired-rule context="Total"/>
  <svrl:failed-assert test="sum(//Percent)=100" location="/Total">
    <svrl:text>Sum is not 100%.</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>

This is exactly what I was after!
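
For reporting purposes it may be handier to pull the individual failures out of the SVRL report than to print the whole document. The sketch below assumes Python 3 and an lxml build that ships isoschematron. It queries the report for svrl:failed-assert elements, using the namespace URI that appears in the output above. failed_assertions is a hypothetical helper name, not an lxml function.

```python
from io import BytesIO

from lxml import etree, isoschematron

# Namespace URI taken from the svrl:schematron-output element of the report.
SVRL_NS = {"svrl": "http://purl.oclc.org/dsdl/svrl"}

SCHEMA = b"""\
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern id="sum_equals_100_percent">
    <title>Sum equals 100%.</title>
    <rule context="Total">
      <assert test="sum(//Percent)=100">Sum is not 100%.</assert>
    </rule>
  </pattern>
</schema>
"""

DOCUMENT = b"""\
<Total>
  <Percent>30</Percent>
  <Percent>30</Percent>
  <Percent>50</Percent>
</Total>
"""


def failed_assertions(schematron, doc):
    """Hypothetical helper: one (location, test, message) tuple per failure."""
    schematron.validate(doc)
    report = schematron.validation_report  # read from the instance
    failures = []
    for node in report.xpath("//svrl:failed-assert", namespaces=SVRL_NS):
        message = node.findtext("svrl:text", default="", namespaces=SVRL_NS)
        failures.append((node.get("location"), node.get("test"), message.strip()))
    return failures


schematron = isoschematron.Schematron(etree.parse(BytesIO(SCHEMA)),
                                      store_report=True)
failures = failed_assertions(schematron, etree.parse(BytesIO(DOCUMENT)))

for location, test, message in failures:
    print("%s: %s (test: %s)" % (location, message, test))
```

Each tuple carries the location, test and message of one failed assertion, so the results can be logged or collected across many documents without parsing the printed XML by hand.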