Invalid .dtd file: [Fatal Error] :32:43: The attribute type is required in the declaration of attribute "pagenumber" for element "meterdocument"

Date: 2015-03-17 13:41:38

Tags: java xml dtd

I can't figure out why this doesn't work. I am trying to parse some XML files that pull in a .dtd file. Unfortunately parsing fails with an org.xml.sax.SAXParseException:

    try {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

        dBuilder.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId)
                    throws SAXException, IOException {

                if (!systemId.contains("meter.dtd")) {
                    return null;
                }

                String path = null;
                try {
                    File dtd = Resource.getFileFromResource("meter_corpus/sgml_dtds/meter.dtd");
                    path = dtd.getAbsolutePath();
                } catch (Exception e) {
                    e.printStackTrace();
                }   

                if(path == null) {
                    return null;
                }
                return new InputSource(new FileReader(path));
            }
        });

        xmlDocument = dBuilder.parse(xmlFile);
        xmlDocument.getDocumentElement().normalize();
    }catch (Exception e) {
        e.printStackTrace();
    }

The meter.dtd file:

<!ELEMENT meterdocument  - - (title?,body)>
<!ATTLIST meterdocument 
                         classification CDATA    #IMPLIED
                         pagenumber     NUMBER   #IMPLIED
                         filename       CDATA    #REQUIRED 
                         newspaper      CDATA    #REQUIRED 
                         domain         CDATA    #REQUIRED
                         date           CDATA    #REQUIRED
                         catchline      CDATA    #REQUIRED >
<!ELEMENT title          - - (#PCDATA)>
<!ELEMENT body           - - (((verbatim | rewrite | new)+) | unclassified)>
<!ELEMENT verbatim       - - (#PCDATA)>
<!ATTLIST verbatim       PAsource   CDATA #IMPLIED>
<!ELEMENT rewrite        - - (#PCDATA)>
<!ATTLIST rewrite        PAsource   CDATA #IMPLIED>
<!ELEMENT new            - - (#PCDATA)>
<!ATTLIST new            PAsource   CDATA #IMPLIED>
<!ELEMENT unclassified   - - (#PCDATA)>

The file that should be parsed:

<!DOCTYPE meterdocument SYSTEM "meter.dtd" [
]>

<meterdocument  filename="/meter_corpus/newspapers/annotated/courts/01.03.00/football/football382_star.sgml" newspaper="star" domain="courts" classification="wholly-derived" pagenumber="12" date="01.03.00" catchline="football">

<body>
<Verbatim PAsource="" >SIX football fans will</Verbatim>
<Rewrite PAsource="" > find out </Rewrite>
<Rewrite PAsource="" >today </Rewrite>
<Rewrite PAsource="" >whether they have won their fight to stop </Rewrite>
<Verbatim PAsource="" >Newcastle United </Verbatim>
<Rewrite PAsource="" >moving </Rewrite>
<Verbatim PAsource="" >their seats. </Verbatim>
<Verbatim PAsource="" >Mr Justice Blackburne, sitting at Newcastle High Court, </Verbatim>
<Rewrite PAsource="" >will reveal </Rewrite>
<Verbatim PAsource="" >his </Verbatim>
<Rewrite PAsource="" >decision over the </Rewrite>
<Verbatim PAsource="" >season ticket holders' </Verbatim>
<Rewrite PAsource="" >battle </Rewrite>
<Verbatim PAsource="" >at noon.
</Verbatim>
</body>
</meterdocument>

And the full stack trace, including the error line:

[Fatal Error] :32:43: The attribute type is required in the declaration of attribute "pagenumber" for element "meterdocument".
org.xml.sax.SAXParseException; lineNumber: 32; columnNumber: 43; The attribute type is required in the declaration of attribute "pagenumber" for element "meterdocument".
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:348)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
    at eval.meter.METERDocument.<init>(METERDocument.java:68)
    at eval.meter.METERCorpus.runMeterCorpusTest(METERCorpus.java:194)
    at eval.meter.METERCorpus.main(METERCorpus.java:92)
java.lang.NullPointerException
    at eval.meter.METERDocument.<init>(METERDocument.java:74)
    at eval.meter.METERCorpus.runMeterCorpusTest(METERCorpus.java:194)
    at eval.meter.METERCorpus.main(METERCorpus.java:92)

What do I have to do to parse this file correctly?

1 answer:

Answer 0 (score: 0)

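Try NMTOKEN (name token) or another XML attribute type in place of NUMBER. NUMBER is an SGML declared value; an XML DTD accepts only CDATA, ID, IDREF, IDREFS, ENTITY, ENTITIES, NMTOKEN, NMTOKENS, NOTATION, or an enumerated list of values, which is why the parser reports that the attribute type is missing. The #IMPLIED default is fine as it stands and does not need a value.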
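
To illustrate the suggestion, here is a minimal sketch that parses a shortened copy of the document against an XML-syntax version of the DTD, with pagenumber declared as NMTOKEN. Several things here are assumptions and not part of the original post: the class name MeterDtdDemo, the in-memory DTD and sample strings, and the placeholder system ID file:///virtual/sample.xml (used only so the relative "meter.dtd" reference has a base to resolve against). The sketch also drops the "- -" flags from the element declarations, since those are SGML tag-omission indicators and not XML DTD syntax, and it lower-cases the element names in the sample because XML names are case-sensitive. Whether your real meter.dtd needs exactly these changes depends on its full contents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original project.
final class MeterDtdDemo {

    // XML-syntax variant of meter.dtd: NUMBER (SGML) replaced by NMTOKEN,
    // and the SGML "- -" tag-omission flags removed.
    static final String FIXED_DTD = String.join("\n",
        "<!ELEMENT meterdocument (title?,body)>",
        "<!ATTLIST meterdocument",
        "    classification CDATA   #IMPLIED",
        "    pagenumber     NMTOKEN #IMPLIED",
        "    filename       CDATA   #REQUIRED",
        "    newspaper      CDATA   #REQUIRED",
        "    domain         CDATA   #REQUIRED",
        "    date           CDATA   #REQUIRED",
        "    catchline      CDATA   #REQUIRED>",
        "<!ELEMENT title (#PCDATA)>",
        "<!ELEMENT body ((verbatim | rewrite | new)+ | unclassified)>",
        "<!ELEMENT verbatim (#PCDATA)>",
        "<!ATTLIST verbatim PAsource CDATA #IMPLIED>",
        "<!ELEMENT rewrite (#PCDATA)>",
        "<!ATTLIST rewrite PAsource CDATA #IMPLIED>",
        "<!ELEMENT new (#PCDATA)>",
        "<!ATTLIST new PAsource CDATA #IMPLIED>",
        "<!ELEMENT unclassified (#PCDATA)>");

    // Shortened sample document with lower-case element names.
    static final String SAMPLE_XML = String.join("\n",
        "<!DOCTYPE meterdocument SYSTEM \"meter.dtd\">",
        "<meterdocument filename=\"football382_star.sgml\" newspaper=\"star\" domain=\"courts\"",
        "    classification=\"wholly-derived\" pagenumber=\"12\" date=\"01.03.00\" catchline=\"football\">",
        "<body>",
        "<verbatim PAsource=\"\">SIX football fans will</verbatim>",
        "<rewrite PAsource=\"\"> find out </rewrite>",
        "</body>",
        "</meterdocument>");

    // Parses xml, serving the given DTD text whenever "meter.dtd" is requested.
    static Document parse(String xml, String dtd) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            if (systemId == null || !systemId.endsWith("meter.dtd")) {
                return null; // let the parser resolve anything else itself
            }
            InputSource dtdSource = new InputSource(new StringReader(dtd));
            dtdSource.setSystemId(systemId);
            return dtdSource;
        });
        InputSource in = new InputSource(new StringReader(xml));
        in.setSystemId("file:///virtual/sample.xml"); // placeholder base URI (assumption)
        Document doc = builder.parse(in);
        doc.getDocumentElement().normalize();
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE_XML, FIXED_DTD);
        System.out.println(doc.getDocumentElement().getAttribute("pagenumber")); // prints 12
    }
}
```

Passing the same DTD with NMTOKEN swapped back to NUMBER should reproduce the SAXParseException from the question. Note that NMTOKEN does not restrict the value to digits; if that check matters, it has to be done in code after parsing.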