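I am writing a RESTful web service in Java. The idea is to "cut down" an XML document by removing all unneeded content (~98%) and keeping only the tags we are interested in, while preserving the document's structure, which looks as follows (I cannot provide the actual XML content for confidentiality reasons):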
<sear:SEGMENTS xmlns="http://www.exlibrisgroup.com/xsd/primo/primo_nm_bib" xmlns:sear="http://www.exlibrisgroup.com/xsd/jaguar/search">
<sear:JAGROOT>
<sear:RESULT>
<sear:DOCSET IS_LOCAL="true" TOTAL_TIME="176" LASTHIT="9" FIRSTHIT="0" TOTALHITS="262" HIT_TIME="11">
<sear:DOC SEARCH_ENGINE_TYPE="Local Search Engine" SEARCH_ENGINE="Local Search Engine" NO="1" RANK="0.086826384" ID="2347460">
[
<PrimoNMBib>
<record>
<display>
<title></title>
</display>
<sort>
<author></author>
</sort>
</record>
</PrimoNMBib>
]
</sear:DOC>
</sear:DOCSET>
</sear:RESULT>
</sear:JAGROOT>
</sear:SEGMENTS>
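Of course, this is only the structure of the tags we are interested in; there are hundreds more tags, but they are irrelevant.
The square brackets ([]) are not part of the XML. They indicate that the enclosed element is an item of a child list and occurs more than once, once for each hit returned by the RESTful service.
That said, my Java code, including the XSLT stylesheet, is as follows: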
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
public String cutXML() throws TransformerFactoryConfigurationError, TransformerException
{
String xmlSourceResource = this.xml; // where this.xml is the full XML string of structure as presented above
String xsltResource =
"<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" xmlns:sear=\"http://www.exlibrisgroup.com/xsd/jaguar/search\">" +
" <xsl:output method=\"xml\" version=\"1.0\" omit-xml-declaration=\"no\" encoding=\"UTF-8\" indent=\"yes\"/>" +
" <xsl:strip-space elements=\"*\"/>" +
" <sear:WhiteList>" +
" <name>title</name>" +
" <name>author</name>" +
" </sear:WhiteList>" +
" <xsl:template match=\"node()|@*\">" +
" <xsl:copy>" +
" <xsl:apply-templates select=\"node()|@*\"/>" +
" </xsl:copy>" +
" </xsl:template>" +
" <xsl:template match=\"*[not(descendant-or-self::*[name()=document('')/*/sear:WhiteList/*])]\"/>" +
"</xsl:stylesheet>";
StringWriter xmlResultResource = new StringWriter(); // where the transformed/stripped-down XML will be written
Transformer xmlTransformer = TransformerFactory.newInstance().newTransformer(new StreamSource(new StringReader(xsltResource))); // create transformer object with XSLT given
xmlTransformer.transform(new StreamSource(new StringReader(xmlSourceResource)), new StreamResult(xmlResultResource)); // transform XML with transformer and write into result StringWriter
return xmlResultResource.getBuffer().toString(); // return transformed XML string
}
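Unfortunately, when I run this on the server, all I get is an empty page source, as if the result of the transformation were an empty string.
The server's log file initially showed the following: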
[#|2012-04-26T18:26:24.967+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.PackagesResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|Scanning for root resource and provider classes in the packages: dk.kb.mobileservice|#]
[#|2012-04-26T18:26:24.969+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.ScanningResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|Root resource classes found: class dk.kb.mobileservice.Middle|#]
[#|2012-04-26T18:26:24.970+0000|INFO|glassfish3.1.2|com.sun.jersey.api.core.ScanningResourceConfig|_ThreadID=23;_ThreadName=Thread-2;|No provider classes found.|#]
[#|2012-04-26T18:26:24.978+0000|INFO|glassfish3.1.2|com.sun.jersey.server.impl.application.WebApplicationImpl|_ThreadID=23;_ThreadName=Thread-2;|Initiating Jersey application, version 'Jersey: 1.11 12/09/2011 10:27 AM'|#]
[#|2012-04-26T18:26:25.192+0000|INFO|glassfish3.1.2|javax.enterprise.system.container.web.com.sun.enterprise.web|_ThreadID=23;_ThreadName=Thread-2;|WEB0671: Loading application [kb2] at [/kb2]|#]
[#|2012-04-26T18:26:25.200+0000|INFO|glassfish3.1.2|javax.enterprise.system.tools.admin.org.glassfish.deployment.admin|_ThreadID=23;_ThreadName=Thread-2;|kb2 was successfully deployed in 2,293 milliseconds.|#]
[#|2012-04-26T18:26:46.263+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=20;_ThreadName=Thread-2;|SystemId Unknown; Line #0; Column #0; java.lang.NullPointerException |#]
[#|2012-04-26T18:31:09.772+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|SystemId Unknown; Line #0; Column #0; java.lang.NullPointerException |#]
Now it reports the following problem:
[#|2012-04-27T00:05:07.731+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|Error on line 1 column 1 of file:/root/webglassfish3/glassfish/domains/domain1/config/: SXXP0003: Error reported by XML parser: Content is not allowed in prolog.|#]
[#|2012-04-27T00:05:07.732+0000|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|Recoverable error on line 1 SXXP0003: org.xml.sax.SAXParseException: Content is not allowed in prolog.|#]
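I have tested the XML file and transformed it in a browser, and it worked, so I don't think the fault lies in the XML or in the XSLT stylesheet... this appears to be a Java problem.
When I run the above Java code on the whole XML outside GlassFish, I get the following error: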
Exception in thread "main" java.lang.VerifyError: (class: GregorSamsa$0, method: test signature: (IIIILcom/sun/org/apache/xalan/internal/xsltc/runtime/AbstractTranslet;Lcom/sun/org/apache/xml/internal/dtm/DTMAxisIterator;)Z) Incompatible type for getting or setting field
at GregorSamsa.applyTemplates()
at GregorSamsa.applyTemplates()
at GregorSamsa.transform()
at com.sun.org.apache.xalan.internal.xsltc.runtime.AbstractTranslet.transform(AbstractTranslet.java:609)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:729)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:340)
at XML2JSON.cutXML(XML2JSON.java:105)
at XML2JSON.main(XML2JSON.java:31)
Answer 0 (score: 0)
Content is not allowed in prolog.
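usually means that there is content before the start of the XML. The XML parser expects to see the XML declaration, <?xml version="1.0"?>, or, if that is omitted, just the start of the document element (i.e. <sear:SEGMENTS>).
Print or log the contents of this.xml and confirm that there is no leading whitespace or other content before the XML declaration or the document element.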
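As a rough illustration of that check, the following sketch lists the code points of whatever precedes the first '<' in the string, which makes invisible characters such as a byte-order mark (U+FEFF) visible in a log. The class name PrologCheck, both method names, and the sample input are hypothetical and not taken from the question's code. Dropping the leading characters is only a diagnostic workaround under these assumptions; the real fix is to find where the stray content enters this.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class PrologCheck {

    // Lists the code units found before the first '<', so they can be logged.
    public static String describeLeadingContent(String xml) {
        int start = xml.indexOf('<');
        if (start < 0) {
            return "no '<' found at all";
        }
        if (start == 0) {
            return "nothing before the first '<'";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < start; i++) {
            sb.append(String.format("U+%04X ", (int) xml.charAt(i)));
        }
        return sb.toString().trim();
    }

    // Drops everything before the first '<' (diagnostic workaround only).
    public static String dropLeadingContent(String xml) {
        int start = xml.indexOf('<');
        return start > 0 ? xml.substring(start) : xml;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input: a byte-order mark and whitespace before the root element.
        String xml = "\uFEFF \n<root><title>t</title></root>";
        System.out.println("Leading content: " + describeLeadingContent(xml));

        // If the cleaned string parses, the leading content was the culprit.
        String cleaned = dropLeadingContent(xml);
        DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(cleaned)));
        System.out.println("Cleaned string parsed without errors");
    }
}
```

In the question's cutXML method, the same idea would amount to logging describeLeadingContent(this.xml) before the call to transform.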