XSLT runs out of memory even though I am using streaming

Time: 2015-12-10 22:46:13

Tags: xml xslt saxon xslt-3.0

I have the following stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:mode streamable="yes"/>
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:value-of select="//w" separator="&#10;"/>
    </xsl:template>

</xsl:stylesheet>

I run the following command from the command line:

java -jar saxon9he.jar korpus.xml xslt.xml > korpus.txt

Even though I request streaming in the stylesheet, I get an out-of-memory error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at net.sf.saxon.tree.tiny.TinyTree.ensureAttributeCapacity(TinyTree.java:277)
    at net.sf.saxon.tree.tiny.TinyTree.addAttribute(TinyTree.java:757)
    at net.sf.saxon.tree.tiny.TinyBuilder.attribute(TinyBuilder.java:302)
    at net.sf.saxon.event.ReceivingContentHandler.startElement(ReceivingContentHandler.java:366)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:504)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:401)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2763)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:647)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:513)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:815)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:744)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:128)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1208)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:543)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:444)
    at net.sf.saxon.event.Sender.send(Sender.java:177)
    at net.sf.saxon.Controller.transform(Controller.java:1872)
    at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:553)
    at net.sf.saxon.Transform.processFile(Transform.java:1178)
    at net.sf.saxon.Transform.doTransform(Transform.java:765)
    at net.sf.saxon.Transform.main(Transform.java:77)

My korpus.xml file is 3.61 GB.

What am I doing wrong?

1 answer:

Answer 0 (score: 3)

You are using Saxon-HE, which does not support streaming. You need Saxon-EE.

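It is also a good idea to set version="3.0" on the xsl:stylesheet element and to pass -xsltversion:3.0 on the command line. Because 3.0 is not yet a W3C Recommendation, Saxon runs as an XSLT 2.0 processor unless asked to do otherwise. Unfortunately, an XSLT 2.0 processor ignores the xsl:mode declaration under the "forwards compatibility" rules, so the stylesheet in the question is processed without streaming and the whole document is built in memory (hence the TinyTree frames in the stack trace).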
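
As a rough illustration of the advice above, the question's stylesheet with only the version attribute changed might look like the sketch below. It is not taken from the original answer, and it makes no claim about whether a given Saxon-EE release accepts the //w expression as streamable; it only shows where the version change goes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Same stylesheet as in the question; only the version attribute differs. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
    <!-- With version="3.0" this declaration is no longer skipped
         under the forwards-compatibility rules. -->
    <xsl:mode streamable="yes"/>
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:value-of select="//w" separator="&#10;"/>
    </xsl:template>

</xsl:stylesheet>
```

It could then be run with something like java -jar saxon9ee.jar -xsltversion:3.0 -s:korpus.xml -xsl:xslt.xml -o:korpus.txt. The jar name saxon9ee.jar is an assumption about a Saxon-EE 9.x installation, and Saxon-EE also expects a valid license to be installed.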