Question

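I have an XML document similar to the one below, containing two types of element. The first type may contain only one ordered set of nodes; the second type may contain only a different ordered set of nodes. The two are intermixed under the root element. For example: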

<root>
  <!-- any number of Type One and Type Two -->
  <item>
    <type>Type One</type>
    <a />
    <b />
  </item>
  <item>
    <type>Type Two</type>
    <d />
    <e />
  </item>
</root>

I would like to describe this document with a schema. Is there some form of <xs:choice>, or something similar, that allows a choice between <xs:complexType>s?

For example, the following describes what I want to do, but it is not valid XSD because it violates the unique particle attribution (UPA) rule:

<xs:element name="root">
  <xs:complexType>
    <xs:choice maxOccurs="unbounded">
      <!-- failure: violates UPA here -->
      <xs:element name="item" type="typeOne" />
      <xs:element name="item" type="typeTwo" />
    </xs:choice>
  </xs:complexType>
</xs:element>

<xs:complexType name="typeOne">
  <xs:sequence>
    <xs:element name="type" type="typeOneId" />
    <xs:element name="a" type="xs:string" />
    <xs:element name="b" type="xs:string" />
  </xs:sequence>
</xs:complexType>

<xs:simpleType name="typeOneId">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Type One" />
  </xs:restriction>
</xs:simpleType>

<xs:complexType name="typeTwo">
  <xs:sequence>
    <xs:element name="type" type="typeTwoId" />
    <xs:element name="d" type="xs:string" />
    <xs:element name="e" type="xs:string" />
  </xs:sequence>
</xs:complexType>

<xs:simpleType name="typeTwoId">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Type Two" />
  </xs:restriction>
</xs:simpleType>

Answer 1

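The simplest and technically best approach is to change the form of the XML you show. You have two different kinds of element; give them two different names. Your sample would then take this form: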

<root>
  <!-- any number of TypeOne and TypeTwo elements -->
  <typeOne>
    <a />
    <b />
  </typeOne>
  <typeTwo>
    <d />
    <e />
  </typeTwo>
</root>

This is easy to define using xsd:choice, but you say you don't know how, so perhaps I should give a simple example (untested):

<element name="root" type="tns:root"/>
<element name="typeOne" type="tns:typeOne"/>
<element name="typeTwo" type="tns:typeTwo"/>
<complexType name="root">
  <choice minOccurs="0" maxOccurs="unbounded">
    <element ref="tns:typeOne"/>
    <element ref="tns:typeTwo"/>
  </choice>
</complexType>
<complexType name="typeOne">
  <sequence>
    <element ref="tns:a"/>
    <element ref="tns:b"/>
  </sequence>
</complexType>
<complexType name="typeTwo">
  <sequence>
    <element ref="tns:d"/>
    <element ref="tns:e"/>
  </sequence>
</complexType>

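If we assume you really want to describe exactly the document you show (one typeOne followed by one typeTwo), then the root type can of course be constrained more tightly: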

<complexType name="root">
  <sequence>
    <element ref="tns:typeOne"/>
    <element ref="tns:typeTwo"/>
  </sequence>
</complexType>

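(Of course, if all you care about is that the single document you show be valid, then a schema consisting solely of the declaration <element name="root"/> would do. I'm guessing that is not what you meant.)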
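
The first fragment above leaves out the enclosing schema element and the declarations of a, b, d, and e. As a rough, untested sketch of how the pieces might fit together: the namespace http://example.com/tns is an assumption chosen for illustration, and a, b, d, and e are declared without a type, so their content is unconstrained.

<?xml version="1.0" encoding="UTF-8"?>
<schema
  xmlns="http://www.w3.org/2001/XMLSchema"
  xmlns:tns="http://example.com/tns"
  targetNamespace="http://example.com/tns"
  elementFormDefault="qualified">

  <!-- Assumed namespace; replace with your own. -->
  <element name="root" type="tns:root"/>
  <element name="typeOne" type="tns:typeOne"/>
  <element name="typeTwo" type="tns:typeTwo"/>

  <!-- No type given, so these default to anyType in this sketch. -->
  <element name="a"/>
  <element name="b"/>
  <element name="d"/>
  <element name="e"/>

  <!-- Any number of typeOne and typeTwo children, in any order. -->
  <complexType name="root">
    <choice minOccurs="0" maxOccurs="unbounded">
      <element ref="tns:typeOne"/>
      <element ref="tns:typeTwo"/>
    </choice>
  </complexType>

  <complexType name="typeOne">
    <sequence>
      <element ref="tns:a"/>
      <element ref="tns:b"/>
    </sequence>
  </complexType>

  <complexType name="typeTwo">
    <sequence>
      <element ref="tns:d"/>
      <element ref="tns:e"/>
    </sequence>
  </complexType>
</schema>

Because this sketch uses a target namespace, an instance document would need to declare it, for example <root xmlns="http://example.com/tns">.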

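If this is too simple and straightforward for the system you want to build, and you want to call all the children of root by the same name just to make things more challenging, then another approach is to provide a simple flag. There are two types, so we distinguish them with two empty flag elements; call them flag1 and flag2. The input becomes: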

<root>
  <!-- any number of TypeOne and TypeTwo elements,
       but we call them all by the same name -->
  <item>
    <flag1/>
    <a />
    <b />
  </item>
  <item>
    <flag2/>
    <d />
    <e />
  </item>
</root>

The relevant part of the schema document might look like this:

<element name="root" type="tns:root"/>
<element name="item" type="tns:item"/>
<complexType name="root">
  <choice minOccurs="0" maxOccurs="unbounded">
    <element ref="tns:item"/>
  </choice>
</complexType>
<complexType name="item">
  <choice>
    <sequence>
      <element ref="tns:flag1"/>
      <element ref="tns:a"/>
      <element ref="tns:b"/>
    </sequence>
    <sequence>
      <element ref="tns:flag2"/>
      <element ref="tns:d"/>
      <element ref="tns:e"/>
    </sequence>
  </choice>
</complexType>

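In fact, in the sample data you give, no flag elements are needed: type-one elements always begin with a, and type-two elements always begin with d. So the content model for item is a choice between the sequence a, b and the sequence d, e. Flag elements may still be useful in more complex scenarios. The input becomes: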

<root>
  <!-- any number of TypeOne and TypeTwo elements,
       but we call them all by the same name -->
  <item>
    <a />
    <b />
  </item>
  <item>
    <d />
    <e />
  </item>
</root>
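
For this flag-free variant, the declaration of item might be sketched as follows. It is untested and written in the same fragment style as the earlier examples, so the tns prefix and the declarations of a, b, d, and e are assumed to exist elsewhere in the schema.

<element name="item" type="tns:item"/>
<complexType name="item">
  <!-- Either the sequence a, b or the sequence d, e. -->
  <choice>
    <sequence>
      <element ref="tns:a"/>
      <element ref="tns:b"/>
    </sequence>
    <sequence>
      <element ref="tns:d"/>
      <element ref="tns:e"/>
    </sequence>
  </choice>
</complexType>

The two branches begin with different elements (a versus d), so a validator can tell which branch applies from the first child alone, and the unique particle attribution rule is not violated.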

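In your comment, you add that you are trying to match the XML of an existing data stream, which looks essentially like the XML you show in the question. What I said above still holds: the simplest and best approach is to fix the broken design of the XML. But if we assume you cannot or will not persuade the producer of that data stream to emit a less perverse structure, then the best you can do is use XSD 1.1 assertions to constrain the structure. This makes life harder for other XSD tools (such as data-binding tools), because working out what an assertion means is a lot like proving theorems in a poorly understood formal logic, and most data-binding tools and editors will not be able to do much with the schema.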

But you can validate your input with the following:

<?xml version="1.0" encoding="UTF-8"?>
<schema 
  xmlns="http://www.w3.org/2001/XMLSchema"
  xmlns:tns="http://example.com/tns"
  targetNamespace="http://example.com/tns"
  elementFormDefault="qualified"> 

  <element name="root" type="tns:root"/>
  <element name="item" type="tns:typeOne-or-typeTwo"/>
  <element name="type" type="tns:typestring"/>
  <element name="a"/>
  <element name="b"/>
  <element name="d"/>
  <element name="e"/>

  <simpleType name="typestring">
    <restriction base="string">
      <enumeration value="Type One"/>
      <enumeration value="Type Two"/>
    </restriction>
  </simpleType>

  <complexType name="root">
    <sequence>
      <element ref="tns:item" maxOccurs="unbounded"/>
    </sequence>
  </complexType>

  <complexType name="typeOne-or-typeTwo">
    <sequence>
      <element ref="tns:type"/>
      <choice>
        <sequence>
          <element ref="tns:a"/>
          <element ref="tns:b"/>
        </sequence>
        <sequence>
          <element ref="tns:d"/>
          <element ref="tns:e"/>
        </sequence>
      </choice>      
    </sequence>
    <assert test="(string(./tns:type) = 'Type One' 
                       and child::*[2]/self::tns:a)
                   or
                  (string(./tns:type) = 'Type Two' 
                       and child::*[2]/self::tns:d)">
      <annotation>
        <documentation>
          If the initial 'type' element contains the string 
          'Type One', then the next child should be 
          element 'a' (followed in any valid instance
          by a 'b'); if the initial 'type' element contains 
          the string 'Type Two', then the next child should 
          be element 'd'.  Etc.
        </documentation>
      </annotation>
    </assert>    
  </complexType>
</schema>

This requires XSD 1.1 support, which is available in Saxon, Xerces, and XML Spy.
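
As an illustration, here is an instance the schema above is meant to accept. It is a sketch and has not been run through a validator. The schema declares a target namespace with elementFormDefault="qualified", so unlike the sample in the question, the instance has to put its elements in http://example.com/tns:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://example.com/tns">
  <!-- 'Type One' followed by a, b: first branch of the assertion holds. -->
  <item>
    <type>Type One</type>
    <a/>
    <b/>
  </item>
  <!-- 'Type Two' followed by d, e: second branch of the assertion holds. -->
  <item>
    <type>Type Two</type>
    <d/>
    <e/>
  </item>
</root>

An item whose type is 'Type One' but whose remaining children are d and e would still match the content model, since the choice allows either sequence. It should be rejected only by the assertion, because neither branch of the test is true for it.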
