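This is a follow-up to the question "Is there some equivalent in Java to Ruby's Nokogiri::XML::EntityDecl?"
I have a simple DAISY DTBook XML file. (The particular DTD is not important to my question, but it is an actual standard used in older books.) It contains XML from both the DTBook and MathML namespaces.
Note that the DOCTYPE declaration follows a convention I copied from the specification for MathML in DAISY: it uses a combined DTD, referencing the external DTD of the DTBook standard while adding some internal ENTITY definitions for the MathML standard.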
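The file is shown below. I read the document and print it back out with the Java code that follows it. I first used JDOM 1.1.3 (because of the constraints of this larger project), but I have also tried JDOM 2.0.6.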
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dtbook PUBLIC "-//NISO//DTD dtbook 2005-2//EN"
  "http://www.daisy.org/z3986/2005/dtbook-2005-2.dtd"
  [
    <!ENTITY % MATHML.prefixed "INCLUDE" >
    <!ENTITY % MATHML.prefix "m">
    <!ENTITY % MATHML.Common.attrib
      "xlink:href CDATA #IMPLIED
      xlink:type CDATA #IMPLIED
      class CDATA #IMPLIED
      style CDATA #IMPLIED
      id ID #IMPLIED
      xref IDREF #IMPLIED
      other CDATA #IMPLIED
      xmlns:dtbook CDATA #FIXED 'http://www.daisy.org/z3986/2005/dtbook/'
      dtbook:smilref CDATA #IMPLIED"
    >
    <!ENTITY % mathML2 PUBLIC "-//W3C//DTD MathML 2.0//EN"
      "http://www.w3.org/Math/DTD/mathml2/mathml2.dtd"
    >
    %mathML2;
    <!ENTITY % externalFlow "| m:math">
    <!ENTITY % externalNamespaces "xmlns:m CDATA #FIXED
      'http://www.w3.org/1998/Math/MathML'">
  ]
>
<dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" xmlns:m="http://www.w3.org/1998/Math/MathML"
    version="2005-2" xml:lang="eng">
  <head></head>
  <book>
    <frontmatter><doctitle></doctitle></frontmatter>
    <bodymatter>
      <level1>
        <p>Test</p>
        <m:math xmlns:dtbook="http://www.daisy.org/z3986/2005/dtbook/"
            id="math0001" dtbook:smilref="nativemathml.smil#math0001" altimg="nativemathml0001.png"
            alttext="sigma-summation UnderScript i equals zero OverScript infinity EndScripts x Subscript i">
          <m:mrow>
            <m:mstyle displaystyle='true'>
              <m:munderover>
                <m:mo>∑</m:mo>
                <m:mrow>
                  <m:mi>i</m:mi>
                  <m:mo>=</m:mo>
                  <m:mn>0</m:mn>
                </m:mrow>
                <m:mi>∞</m:mi>
              </m:munderover>
              <m:mrow>
                <m:msub>
                  <m:mi>x</m:mi>
                  <m:mi>i</m:mi>
                </m:msub>
              </m:mrow>
            </m:mstyle>
          </m:mrow>
        </m:math>
      </level1>
    </bodymatter>
    <rearmatter><level1><p></p></level1></rearmatter>
  </book>
</dtbook>
The Java code:
@Test
public void buildDTD2()
        throws IOException, JDOMException
{
    // Locate the test document on the classpath and derive a system ID from it
    final PathMatchingResourcePatternResolver pmrpr = new PathMatchingResourcePatternResolver();
    final File file = pmrpr.getResource("daisy/mathmldtdtemplate.xml").getFile();
    final String uri = file.toURI().toString();
    final InputStream stream = new BufferedInputStream(new FileInputStream(file));
    final SAXBuilder saxBuilder = new SAXBuilder();
    saxBuilder.setValidation(true);
    saxBuilder.setFeature("http://apache.org/xml/features/validation/schema", true);
    final InputSource source = new InputSource(new BufferedInputStream(stream));
    source.setSystemId(uri);
    final Document doc = saxBuilder.build(source);
    // Print the whole document, then the DOCTYPE's internal subset on its own
    String xml2 = new XMLOutputter().outputString(doc);
    System.out.println(xml2);
    System.out.println("Internal Subset: " + doc.getDocType().getInternalSubset());
}
When I call getInternalSubset() in the final System.out.println, nothing is printed. When I print out the whole document, the ENTITY definitions are gone! Did I miss some option that would let me keep them? How can I preserve them? As we process these files, we may need to read them in and write them back out several times without losing this DTD.
Answer 0 (score: 0)
After further research, I found a solution on the jdom-interest list.
Adding the statement saxBuilder.setExpandEntities(false); will, according to Laurent Bihanic, force a DeclHandler to be registered:
@Test
public void buildDTD2()
        throws IOException, JDOMException
{
    final PathMatchingResourcePatternResolver pmrpr = new PathMatchingResourcePatternResolver();
    final File file = pmrpr.getResource("daisy/mathmldtdtemplate.xml").getFile();
    final String uri = file.toURI().toString();
    final InputStream stream = new BufferedInputStream(new FileInputStream(file));
    final SAXBuilder saxBuilder = new SAXBuilder();
    saxBuilder.setValidation(true);
    saxBuilder.setFeature("http://apache.org/xml/features/validation/schema", true);
    // The only change from the question's code: leave entities unexpanded,
    // which (per the jdom-interest post) also gets a DeclHandler registered
    saxBuilder.setExpandEntities(false);
    final InputSource source = new InputSource(new BufferedInputStream(stream));
    source.setSystemId(uri);
    final Document doc = saxBuilder.build(source);
    String xml2 = new XMLOutputter().outputString(doc);
    System.out.println(xml2);
    System.out.println("Internal Subset: " + doc.getDocType().getInternalSubset());
}
This works; the internal subset is now read in and printed after "Internal Subset:".
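For anyone wondering what "registering a DeclHandler" actually buys you: in plain SAX2, the parser reports DTD declarations (including ENTITY declarations in the internal subset) only to a handler set through the declaration-handler property. The sketch below uses only the JDK's built-in SAX parser, not JDOM, and is not taken from the jdom-interest post. The class name EntityDeclDemo, the method collectEntityDecls, and the tiny SAMPLE document are all made up for illustration; the sample deliberately has no external DTD so nothing needs to be fetched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical demo class; not part of JDOM or of the original post. */
final class EntityDeclDemo {

    /** Stand-in document with an internal subset only (assumed, much smaller than the DTBook file). */
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE doc [\n"
        + "  <!ENTITY % prefix \"m\">\n"
        + "  <!ENTITY greeting \"hello\">\n"
        + "]>\n"
        + "<doc>&greeting;</doc>\n";

    /** Parses the XML and returns each internal entity declaration as "name=value". */
    static List<String> collectEntityDecls(String xml) {
        final List<String> decls = new ArrayList<>();
        // DefaultHandler2 implements DeclHandler with no-op methods; override the one we need.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void internalEntityDecl(String name, String value) {
                // Parameter entity names are reported with a leading '%'.
                decls.add(name + "=" + value);
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 extension property; this is the "registration" step.
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return decls;
    }

    public static void main(String[] args) {
        for (String decl : collectEntityDecls(SAMPLE)) {
            System.out.println(decl);
        }
    }
}
```

If no handler is set for that property, the declarations are still parsed but never reported to the application. That is consistent with the empty getInternalSubset() in the question, though I have not traced JDOM's own code path beyond what the list post says.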