Question

I am using the following simple StAX code to iterate over all the tags in an XML file. input.xml is larger than 100 MB.

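import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

XMLInputFactory xif = XMLInputFactory.newInstance();
FileInputStream in = new FileInputStream("input.xml");
XMLStreamReader xsr = xif.createXMLStreamReader(in);

// Advance the cursor one event at a time and print every element name.
while (xsr.hasNext()) {
    xsr.next();
    if (xsr.isStartElement() || xsr.isEndElement()) {
        System.out.println(xsr.getLocalName());
    }
}
xsr.close();
in.close();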

I get this error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Please tell me how to fix this. I have read that StAX handles large XML files well, but I get the same error as with DOM parsers.

Answer 1

Increase the VM's maximum heap size with the -Xmx parameter.

java -Xmx512m ....

Answer 2

Define the heap size when launching the JVM:

-Xms    initial java heap size
-Xmx    maximum java heap size
-Xmn    the size of the heap for the young generation

Example:

bin/java.exe -Xmn100M -Xms500M -Xmx500M
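
To check that the heap flags actually took effect, a small program can print the limit the JVM reports. This is only a minimal sketch; the class name HeapInfo and the helper maxHeapMiB are made up for illustration.

```java
class HeapInfo {
    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        // Runtime.maxMemory() reflects the effective -Xmx setting.
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it with and without -Xmx500M should show different values; the reported number is usually slightly below the requested size.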

Answer 3

From Wikipedia: Traditionally, XML APIs are either:

tree based - the entire document is read into memory as a tree structure for random 
access by the calling application
event based - the application registers to receive events as entities are encountered 
within the source document.

StAX was designed as a median between these two opposites. In the StAX metaphor,
the programmatic entry point is a cursor that represents a point within the 
document. The application moves the cursor forward - 'pulling' the information from 
the parser as it needs. This is different from an event based API - such as SAX - 
which 'pushes' data to the application - requiring the application to maintain state 
between events as necessary to keep track of location within the document.

So for 100 MB or more, I would prefer SAX, if it is possible to use it instead of StAX.
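
For comparison, here is a rough sketch of the same "print every element name" task done push-style with SAX. The class name SaxElementNames and the method forEachElementName are hypothetical; only the standard javax.xml.parsers and org.xml.sax APIs are assumed.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxElementNames {
    /** Passes the local name of every start and end tag to the given sink. */
    static void forEachElementName(InputStream in, Consumer<String> sink) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so that localName is populated
        SAXParser parser = factory.newSAXParser();
        // The parser "pushes" events into these callbacks as it reads the stream.
        parser.parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                sink.accept(localName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                sink.accept(localName);
            }
        });
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "input.xml";
        try (InputStream in = new FileInputStream(path)) {
            forEachElementName(in, System.out::println);
        }
    }
}
```

The names are handed to a callback rather than collected in a list, so nothing accumulates in memory while the file is read.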

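But I tried your code on a 64-bit JVM with a 2.6 GB file, and it ran without problems. So I think the problem is not the size of the file but possibly the data in it.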
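
If the data is the suspect, one way to look into it is to stream through the file and record a few shape statistics, for example the deepest nesting and the longest single text chunk the parser hands back. The sketch below is only a diagnostic idea, not a confirmed explanation of the error. The class name StaxDataProbe and the method probe are invented for illustration, and the probe may itself need a larger heap (-Xmx) to get through a problematic file.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxDataProbe {
    /** Returns {element count, maximum nesting depth, longest text chunk in chars}. */
    static long[] probe(InputStream in) throws XMLStreamException {
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(in);
        long elements = 0, depth = 0, maxDepth = 0, longestText = 0;
        try {
            while (xsr.hasNext()) {
                switch (xsr.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        elements++;
                        depth++;
                        maxDepth = Math.max(maxDepth, depth);
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        depth--;
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        // Length of the text chunk delivered with this event.
                        longestText = Math.max(longestText, xsr.getTextLength());
                        break;
                    default:
                        break;
                }
            }
        } finally {
            xsr.close();
        }
        return new long[] {elements, maxDepth, longestText};
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "input.xml";
        try (InputStream in = new FileInputStream(path)) {
            long[] r = probe(in);
            System.out.println("elements=" + r[0] + " maxDepth=" + r[1] + " longestText=" + r[2]);
        }
    }
}
```

An unusually large longestText or maxDepth would at least hint that a single node, rather than the overall file size, is what the parser has to hold in memory.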

OutOfMemoryError in StAX
