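I'm trying to validate a document against an XSD, and lxml complains about whitespace in a dateTime value (even though it should collapse it). I'm not sure whether this is broken behavior or whether I've simply specified something wrong in the XSD. I've spent an hour trying to debug this, so I'm hoping someone else has run into similar behavior before.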
======================================================================
ERROR [0.076s]: test_exports (disqus.importer.tests.tests.SchemaValidation)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/dcramer/Development/disqus/disqus/importer/tests/tests.py", line 1098, in test_exports
xsd.assertValid(export)
File "lxml.etree.pyx", line 2659, in lxml.etree._Validator.assertValid (src/lxml/lxml.etree.c:99498)
DocumentInvalid: Element '{http://disqus.com}createdAt': '
2008-06-10T01:32:08
' is not a valid value of the atomic type 'xs:dateTime'., line 8
Sample XML:
<?xml version="1.0" encoding="utf-8"?>
<disqus xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://disqus.com/api/schemas/1.0/disqus.xsd http://disqus.com/api/schemas/1.0/disqus-internals.xsd">
<post dsq:id="1">
<id />
<message>
<![CDATA["We want happy paintings. Happy paintings. If you want sad things, watch the news."]]>
</message>
<createdAt>
2008-06-10T01:32:08
</createdAt>
<author>
<email>
bob@ross.com
</email>
<name>
bobross
</name>
<isAnonymous>
true
</isAnonymous>
<username>
bobross
</username>
</author>
<ipAddress>
127.0.0.1
</ipAddress>
<thread dsq:id="1"/>
</post>
</disqus>
disqus.xsd:
<?xml version="1.0"?>
<xs:schema targetNamespace="http://disqus.com"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:dsq="http://disqus.com/disqus-internals"
xmlns="http://disqus.com"
elementFormDefault="qualified"
>
<!-- import the dsq namespace -->
<xs:import namespace="http://disqus.com/disqus-internals"
schemaLocation="internals.xsd"/>
<!-- misc types -->
<xs:simpleType name="identifier">
<xs:restriction base="xs:string">
<xs:maxLength value="200"/>
</xs:restriction>
</xs:simpleType>
<!-- root disqus element -->
<xs:element name="disqus">
<xs:complexType>
<xs:sequence>
<xs:element name="category" type="category" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="thread" type="thread" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="post" type="post" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- category element -->
<xs:complexType name="category">
<xs:all minOccurs="0">
<xs:element name="forum" type="xs:string">
<xs:unique name="categoryID">
<xs:selector xpath="category"/>
<xs:field xpath="@title"/>
</xs:unique>
</xs:element>
<xs:element name="title" type="xs:string"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
<!-- thread element -->
<xs:complexType name="thread">
<xs:all minOccurs="0">
<xs:element name="id" type="identifier" minOccurs="0">
<xs:unique name="threadID">
<xs:selector xpath="thread"/>
<xs:field xpath="@id"/>
</xs:unique>
</xs:element>
<xs:element name="forum" type="xs:string"/>
<xs:element name="category">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute ref="dsq:id"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="link" type="xs:anyURI"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="message" type="xs:string" minOccurs="0"/>
<xs:element name="author" type="author" minOccurs="0"/>
<xs:element name="createdAt" type="xs:dateTime"/>
<xs:element name="isClosed" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
<!-- post element -->
<xs:complexType name="post">
<xs:all minOccurs="0">
<xs:element name="id" type="identifier" minOccurs="0">
<xs:unique name="postID">
<xs:selector xpath="post"/>
<xs:field xpath="@id"/>
</xs:unique>
</xs:element>
<xs:element name="parent" minOccurs="0">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="identifier">
<xs:attribute ref="dsq:id"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="thread">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="identifier">
<xs:attribute ref="dsq:id"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="author" type="author" minOccurs="0"/>
<xs:element name="message" type="xs:string"/>
<xs:element name="ipAddress" type="xs:string" minOccurs="0"/>
<xs:element name="createdAt" type="xs:dateTime"/>
<!-- post boolean states -->
<xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isApproved" type="xs:boolean" default="true" minOccurs="0"/>
<xs:element name="isFlagged" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isSpam" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isHighlighted" type="xs:boolean" default="false" minOccurs="0"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
<!-- author element -->
<xs:complexType name="author">
<xs:all minOccurs="0">
<xs:element name="name" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
<xs:element name="link" type="xs:anyURI" minOccurs="0"/>
<xs:element name="username" type="xs:string" minOccurs="0"/>
<xs:element name="isAnonymous" type="xs:boolean" default="true" minOccurs="0"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
</xs:schema>
Answer 0 (score: 1)
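It looks like the whitespace is what's causing the problem. Can you remove the leading and trailing whitespace from createdAt, so that it becomes
<createdAt>2008-06-10T01:32:08</createdAt>
and see what happens? If that fixes it and you are the one generating the XML, change the XML generation so it doesn't emit the surrounding whitespace. Otherwise, if you control the schema, try setting the xs:whiteSpace facet to "collapse" and see whether that resolves it (for xs:dateTime the XML Schema spec already fixes this facet at "collapse", so this may not change anything).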
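If you can't change the generator right away, one workaround is to collapse the whitespace yourself before handing the document to the validator. This is only a rough sketch using the standard library's xml.etree.ElementTree, not lxml; the helper name collapse_text and the choice of which elements to touch are my own assumptions. It deliberately leaves xs:string elements such as message alone, since whitespace is significant there.

```python
import xml.etree.ElementTree as ET

NS = "{http://disqus.com}"  # default namespace used by the sample document


def collapse_text(root, local_names):
    """Collapse whitespace in the text of the named leaf elements.

    collapse_text is a hypothetical helper, not part of lxml or the
    standard library. Runs of whitespace become a single space and
    leading/trailing whitespace is dropped.
    """
    wanted = {NS + name for name in local_names}
    for elem in root.iter():
        if elem.tag in wanted and len(elem) == 0 and elem.text is not None:
            elem.text = " ".join(elem.text.split())
    return root


sample = """<disqus xmlns="http://disqus.com">
  <post>
    <message>
      keep   my   spacing
    </message>
    <createdAt>
      2008-06-10T01:32:08
    </createdAt>
  </post>
</disqus>"""

root = collapse_text(ET.fromstring(sample), ["createdAt", "isAnonymous"])
created_at = root.find(".//" + NS + "createdAt")
message = root.find(".//" + NS + "message")
print(repr(created_at.text))
```

Only createdAt is rewritten here; the message text keeps its original spacing. You would then serialize the tree again and validate that output as before.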
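Another possibility is that it wants a time zone. The value should match [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm], so the time zone is optional, but try adding a 'Z' at the end to see whether that fixes the problem. That is what this post suggests.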
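To rule out the value itself, you can check it against that lexical form outside of the validator. The following is a rough sketch, not what lxml or libxml2 actually does: the regex is my own approximation of the pattern above (plus optional fractional seconds, which xs:dateTime also allows), and it checks only the shape of the string, not whether the month, day, or hour are in range.

```python
import re

# Approximation of the xs:dateTime lexical form:
# [-]CCYY-MM-DDThh:mm:ss[.fraction][Z|(+|-)hh:mm]
DATETIME_RE = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)


def looks_like_xs_datetime(value):
    """Return True if value has the shape of an xs:dateTime.

    Whitespace is collapsed first, which is what the whiteSpace facet of
    xs:dateTime asks a validator to do. Shape check only, no range check.
    """
    collapsed = " ".join(value.split())
    return DATETIME_RE.fullmatch(collapsed) is not None


for candidate in (
    "\n2008-06-10T01:32:08\n",    # value from the question, with whitespace
    "2008-06-10T01:32:08Z",       # same value with an explicit UTC marker
    "2008-06-10T01:32:08+02:00",  # same value with a numeric offset
    "2008-06-10 01:32:08",        # space instead of 'T'
):
    print(repr(candidate), looks_like_xs_datetime(candidate))
```

If the value from the question passes this check with and without the 'Z', the lexical form is fine and the surrounding whitespace is the more likely culprit.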