dateTime complains about whiteSpace during XSD validation (lxml)

Date: 2011-01-08 01:50:54

Tags: python xml

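I'm trying to validate a document against an XSD, and lxml complains about whitespace in a dateTime value (even though it should collapse it). I'm not sure whether this is broken behavior or whether I've simply specified something wrong in the XSD. I've spent an hour trying to debug this, so I'm hoping someone else has run into similar behavior.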

======================================================================
ERROR [0.076s]: test_exports (disqus.importer.tests.tests.SchemaValidation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dcramer/Development/disqus/disqus/importer/tests/tests.py", line 1098, in test_exports
    xsd.assertValid(export)
  File "lxml.etree.pyx", line 2659, in lxml.etree._Validator.assertValid (src/lxml/lxml.etree.c:99498)
DocumentInvalid: Element '{http://disqus.com}createdAt': '
      2008-06-10T01:32:08
    ' is not a valid value of the atomic type 'xs:dateTime'., line 8

Sample XML:

<?xml version="1.0" encoding="utf-8"?>
<disqus xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://disqus.com/api/schemas/1.0/disqus.xsd http://disqus.com/api/schemas/1.0/disqus-internals.xsd">
  <post dsq:id="1">
    <id />
    <message>
      <![CDATA["We want happy paintings. Happy paintings. If you want sad things, watch the news."]]>
    </message>
    <createdAt>
      2008-06-10T01:32:08
    </createdAt>
    <author>
      <email>
        bob@ross.com
      </email>
      <name>
        bobross
      </name>
      <isAnonymous>
        true
      </isAnonymous>
      <username>
        bobross
      </username>
    </author>
    <ipAddress>
      127.0.0.1
    </ipAddress>
    <thread dsq:id="1"/>
  </post>
</disqus>

disqus.xsd:

<?xml version="1.0"?>
<xs:schema targetNamespace="http://disqus.com"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dsq="http://disqus.com/disqus-internals"
           xmlns="http://disqus.com"
           elementFormDefault="qualified"
>
  <!-- import the dsq namespace -->
  <xs:import namespace="http://disqus.com/disqus-internals"
             schemaLocation="internals.xsd"/>

  <!-- misc types -->
  <xs:simpleType name="identifier">
    <xs:restriction base="xs:string">
      <xs:maxLength value="200"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- root disqus element -->
  <xs:element name="disqus">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="category" type="category" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="thread" type="thread" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="post" type="post" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- category element -->
  <xs:complexType name="category">
    <xs:all minOccurs="0">
      <xs:element name="forum" type="xs:string">
        <xs:unique name="categoryID">
          <xs:selector xpath="category"/>
          <xs:field xpath="@title"/>
        </xs:unique>
      </xs:element>
      <xs:element name="title" type="xs:string"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>

  <!-- thread element -->
  <xs:complexType name="thread">
    <xs:all minOccurs="0">
      <xs:element name="id" type="identifier" minOccurs="0">
        <xs:unique name="threadID">
          <xs:selector xpath="thread"/>
          <xs:field xpath="@id"/>
        </xs:unique>
      </xs:element>
      <xs:element name="forum" type="xs:string"/>
      <xs:element name="category">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute ref="dsq:id"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="link" type="xs:anyURI"/>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="message" type="xs:string" minOccurs="0"/>
      <xs:element name="author" type="author" minOccurs="0"/>
      <xs:element name="createdAt" type="xs:dateTime"/>
      <xs:element name="isClosed" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>

  <!-- post element -->
  <xs:complexType name="post">
    <xs:all minOccurs="0">
      <xs:element name="id" type="identifier" minOccurs="0">
        <xs:unique name="postID">
          <xs:selector xpath="post"/>
          <xs:field xpath="@id"/>
        </xs:unique>
      </xs:element>
      <xs:element name="parent" minOccurs="0">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="identifier">
              <xs:attribute ref="dsq:id"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="thread">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="identifier">
              <xs:attribute ref="dsq:id"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="author" type="author" minOccurs="0"/>
      <xs:element name="message" type="xs:string"/>
      <xs:element name="ipAddress" type="xs:string" minOccurs="0"/>
      <xs:element name="createdAt" type="xs:dateTime"/>

      <!-- post boolean states -->
      <xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isApproved" type="xs:boolean" default="true" minOccurs="0"/>
      <xs:element name="isFlagged" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isSpam" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isHighlighted" type="xs:boolean" default="false" minOccurs="0"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>

  <!-- author element -->
  <xs:complexType name="author">
    <xs:all minOccurs="0">
      <xs:element name="name" type="xs:string"/>
      <xs:element name="email" type="xs:string"/>
      <xs:element name="link" type="xs:anyURI" minOccurs="0"/>
      <xs:element name="username" type="xs:string" minOccurs="0"/>
      <xs:element name="isAnonymous" type="xs:boolean" default="true" minOccurs="0"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>
</xs:schema>

1 Answer:

Answer 0 (score: 1)

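It looks like the whitespace is what's causing the problem. Can you remove the leading and trailing whitespace from createdAt so that it becomes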

<createdAt>2008-06-10T01:32:08</createdAt>

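and see what happens? If that fixes it and you are the one generating the XML, change the XML generation so that it emits no surrounding whitespace. Otherwise, if you control the schema, try setting the xsd:whiteSpace facet to "collapse" and see whether that resolves it.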
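If you can't change the code that generates the XML, one workaround is to normalize the affected elements yourself before validating. The following is a minimal sketch using only the standard library; `collapse_whitespace` and `collapse_elements` are hypothetical helper names, and limiting the cleanup to `createdAt` is an assumption so that free-text fields such as `message` are left untouched.

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://disqus.com}"  # default namespace used by the sample document


def collapse_whitespace(value):
    """Mimic whiteSpace="collapse": turn tab/LF/CR into spaces,
    squeeze runs of spaces into one, and trim both ends."""
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")


def collapse_elements(root, tags):
    """Collapse the text of every element whose qualified tag is in `tags`."""
    for elem in root.iter():
        if elem.tag in tags and elem.text is not None:
            elem.text = collapse_whitespace(elem.text)
    return root


# A trimmed-down version of the sample document from the question.
doc = ET.fromstring(
    '<disqus xmlns="http://disqus.com"><post>\n'
    "    <message>\n      Happy paintings.\n    </message>\n"
    "    <createdAt>\n      2008-06-10T01:32:08\n    </createdAt>\n"
    "</post></disqus>"
)
collapse_elements(doc, {NS + "createdAt"})

created = doc.find(".//" + NS + "createdAt")
message = doc.find(".//" + NS + "message")
print(repr(created.text))  # the padded value is now a bare timestamp
```

The same loop should work on a tree parsed with lxml, since its elements expose the same `iter()`, `tag`, and `text` interface; the cleaned tree can then be passed to `assertValid` as before.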

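Another possibility is that it might want a time zone. The value should match [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm], so the time zone is optional, but try appending a 'Z' to see whether that fixes the problem. That is what this post suggests.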
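To see which of the two explanations applies to a given value, the lexical pattern quoted above can be checked directly. This is a rough sketch: `DATETIME_RE` and `looks_like_datetime` are hypothetical names, and the regular expression only approximates the shape of an xs:dateTime (it does not check that the month, day, or hour are in range).

```python
import re

# Approximation of [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]; shape only, no range checks.
DATETIME_RE = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}"        # [-]CCYY-MM-DD
    r"T\d{2}:\d{2}:\d{2}(\.\d+)?"  # Thh:mm:ss with optional fractional seconds
    r"(Z|[+-]\d{2}:\d{2})?",       # optional time zone
    re.ASCII,
)


def looks_like_datetime(value):
    """Return True if the whole string has the xs:dateTime lexical shape."""
    return DATETIME_RE.fullmatch(value) is not None


samples = [
    "2008-06-10T01:32:08",               # no time zone
    "2008-06-10T01:32:08Z",              # UTC marker
    "2008-06-10T01:32:08-07:00",         # explicit offset
    "\n      2008-06-10T01:32:08\n    ",  # padded value from the question
]
for value in samples:
    print(repr(value), looks_like_datetime(value))
```

Under this check the value without a time zone is accepted and only the padded form is rejected, which is consistent with the surrounding whitespace, rather than the missing 'Z', being what the validator objects to.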