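I am trying to validate PDF documents with XSD: I convert a given PDF to XML, then parse and validate that XML against an XSD schema. Suppose a heading has two subheadings. How should I change the XSD so that a heading of a particular type must have at least two subheadings containing specific text (words or sentences)? More generally, how can I add conditions to an XSD file to validate documents that follow a specific design?
Here is the XSD: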
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="elements">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="element"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="element">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="pageno"/>
      </xs:sequence>
      <xs:attribute name="level" use="required" type="xs:integer"/>
      <xs:attribute name="title" use="required"/>
      <xs:attribute name="type" use="required"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="pageno" type="xs:integer"/>
</xs:schema>
Here is the XML I used to generate the XSD:
<elements>
  <element type="Introduction" level="1" title="Introduction">
    <pageno>4</pageno>
  </element>
  <element type="Introduction" level="2" title="Enhancements to the HP CSA vCenter Simple Compute">
    <pageno>4</pageno>
  </element>
  <element type="System requirements" level="1" title="System requirements">
    <pageno>5</pageno>
  </element>
  <element type="System requirements" level="2" title="Software components">
    <pageno>5</pageno>
  </element>
  <element type="Configuration requirements" level="1" title="Configuration requirements">
    <pageno>7</pageno>
  </element>
  <element type="Configuration requirements" level="2" title="Installing content capsule">
    <pageno>7</pageno>
  </element>
  <element type="Configuring offerings in HP CSA" level="1" title="Configuring offerings in HP CSA">
    <pageno>8</pageno>
  </element>
  <element type="Configuring offerings in HP CSA" level="2" title="Configuring subscriber options">
    <pageno>8</pageno>
  </element>
  <element type="Configuring subscriber options" level="2" title="Adding providers">
    <pageno>8</pageno>
  </element>
  <element type="Adding providers" level="2" title="Associating resource offerings with providers">
    <pageno>9</pageno>
  </element>
  <element type="Associating resource offerings with providers" level="2" title="Changing component properties">
    <pageno>10</pageno>
  </element>
  <element type="Changing component properties" level="2" title="Creating the service offering">
    <pageno>12</pageno>
  </element>
  <element type="Creating the service offering" level="2" title="Publishing the service offering">
    <pageno>13</pageno>
  </element>
  <element type="Publishing the service offering" level="3" title="Publishing service offering to a Catalog">
    <pageno>13</pageno>
  </element>
  <element type="Subscribing to the service" level="1" title="Subscribing to the service">
    <pageno>14</pageno>
  </element>
  <element type="Subscribing to the service" level="2" title="Canceling a subscription">
    <pageno>14</pageno>
  </element>
  <!-- <element type ="adasdasd" level = "5" title= "dasdsad">
  </element> -->
  <element type="Limitations" level="1" title="Limitations">
    <pageno>16</pageno>
  </element>
  <element type="Appendix A: HP Operations Orchestration flows" level="1" title="Appendix A: HP Operations Orchestration flows">
    <pageno>17</pageno>
  </element>
  <element type="Appendix B: Integrating with IP Address Management solutions" level="1" title="Appendix B: Integrating with IP Address Management solutions">
    <pageno>19</pageno>
  </element>
  <element type="Additional resources" level="1" title="Additional resources">
    <pageno>20</pageno>
  </element>
  <element type="Send Documentation Feedback" level="1" title="Send Documentation Feedback">
    <pageno>21</pageno>
  </element>
</elements>
If anything here is unclear, please let me know and I will answer any questions. Thanks.
Answer 0 (score: 0)
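You can define your own schema type for a heading that must contain specific text and, if your implementation supports it, use the conditional type assignment feature of XSD 1.1.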
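As a rough illustration of an XSD 1.1 rule for the flat XML in the question, the sketch below uses an assertion (`xs:assert`) instead of conditional type assignment. This is a swapped-in XSD 1.1 feature: a type alternative's test sees only the element's own attributes, so it cannot count sibling entries. Two things here are assumptions, not taken from the question: that a "subheading" means a level-2 entry appearing before the next level-1 entry, and that 'Configuring offerings in HP CSA' and 'Adding providers' are the heading and the required text. It needs an XSD 1.1 processor; an XSD 1.0 validator will reject `xs:assert`.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="elements">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="element"/>
      </xs:sequence>
      <!-- Assumed rule 1: the chosen level-1 heading is followed by at least
           two level-2 entries whose nearest preceding level-1 entry is that heading. -->
      <xs:assert test="every $h in element[@level = 1][@title = 'Configuring offerings in HP CSA']
                       satisfies count($h/following-sibling::element[@level = 2]
                                 [preceding-sibling::element[@level = 1][1] is $h]) ge 2"/>
      <!-- Assumed rule 2: one of those subheadings carries a specific title. -->
      <xs:assert test="every $h in element[@level = 1][@title = 'Configuring offerings in HP CSA']
                       satisfies exists($h/following-sibling::element[@level = 2]
                                 [@title = 'Adding providers']
                                 [preceding-sibling::element[@level = 1][1] is $h])"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="element">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="pageno"/>
      </xs:sequence>
      <xs:attribute name="level" use="required" type="xs:integer"/>
      <xs:attribute name="title" use="required"/>
      <xs:attribute name="type" use="required"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="pageno" type="xs:integer"/>
</xs:schema>
```

Note that both assertions pass trivially when the heading is missing altogether; if the heading itself is mandatory, that would need a separate check.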
A more widely supported approach is to embed Schematron rules in the schema to do what you want.
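A minimal Schematron sketch of such a rule might look like the following. It is written as a standalone schema for readability; the same `sch:pattern` could instead be placed inside an `xs:annotation`/`xs:appinfo` of the XSD, provided your tooling extracts embedded rules. The heading title, the required subheading text, and the reading of "subheading" as a level-2 entry before the next level-1 entry are all assumptions made for illustration, and the rule assumes level-1 titles are unique.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- Fires once for the chosen level-1 heading (hypothetical choice). -->
    <sch:rule context="element[@level = '1'][@title = 'Configuring offerings in HP CSA']">
      <!-- At least two level-2 entries whose nearest preceding level-1 entry is this heading. -->
      <sch:assert test="count(following-sibling::element[@level = '2']
                        [preceding-sibling::element[@level = '1'][1]/@title = 'Configuring offerings in HP CSA']) &gt;= 2">
        The heading "Configuring offerings in HP CSA" must have at least two subheadings.
      </sch:assert>
      <!-- One of those subheadings must carry a specific title. -->
      <sch:assert test="following-sibling::element[@level = '2'][@title = 'Adding providers']
                        [preceding-sibling::element[@level = '1'][1]/@title = 'Configuring offerings in HP CSA']">
        The subheading "Adding providers" is missing under "Configuring offerings in HP CSA".
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Because the rules use only XPath 1.0, they should work with the default XSLT-based Schematron implementations, and the XSD itself can stay at version 1.0 for the structural checks.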
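Keep in mind that if you over-constrain the input, the schema can become brittle; that is, it becomes hard to maintain when the documents change.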