Question

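I'm getting a java.lang.OutOfMemoryError: Java heap space.

I'm parsing an XML file, storing the data, and writing an XML file out once parsing is complete.

I'm a bit surprised to get such an error, because the original XML file is not long at all.

Code: http://d.pr/RSzp File: http://d.pr/PjrE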

Answer 1

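You can try setting the -Xms and -Xmx values higher in your eclipse.ini file (I'm assuming you're using Eclipse).

For example:

-vmargs

-Xms128m // (initial heap size)

-Xmx256m // (maximum heap size)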
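
To check that a higher limit actually reached the JVM running your program, a small sketch like the following may help. The class name `HeapSettings` is made up for illustration; it only uses the standard `Runtime` API. Note that if the program is launched from inside Eclipse, the heap flags usually have to go into the run configuration's VM arguments, because eclipse.ini configures the IDE's own JVM.

```java
class HeapSettings {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Upper bound the JVM will try to use for the heap (roughly the -Xmx value). */
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MiB):       " + maxHeapMiB());
        System.out.println("currently reserved:   " + toMiB(rt.totalMemory()));
        System.out.println("free within reserved: " + toMiB(rt.freeMemory()));
    }
}
```

Running it with, say, `java -Xmx256m HeapSettings` should report a maximum close to 256 MiB; the exact figure can differ slightly between JVMs.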

Answer 2

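If this is a one-off thing that you just want to get done, I'd try Jason's suggestion of increasing the memory available to Java.

You are building a very large list of objects, then looping over that list to build a String, then writing that String to a file. The list and the String are probably the reasons for your high memory usage. You could reorganize your code in a more stream-oriented way: open your output file at the start, then write the XML for each Centroid as it is parsed. Then you wouldn't need to keep a big list of them, and you wouldn't need to hold a big String representing all the XML.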
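
A minimal sketch of that stream-oriented shape, under some assumptions: the class `StreamingDocWriter` and its `writeDoc` method are hypothetical names, only two fields are written, and no XML escaping is done. The point is only that each record goes to the `Writer` as soon as it is known, so neither a big list nor a big String is held in memory.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Writes each <doc> immediately instead of collecting everything first. */
class StreamingDocWriter implements AutoCloseable {
    private final Writer out;

    StreamingDocWriter(Writer out) {
        this.out = out;
        write("<collection>\n"); // opening tag goes out before any record
    }

    /** Called once per parsed record; nothing is retained afterwards. */
    void writeDoc(String title, String description) {
        // NOTE: values are written verbatim; real code should escape &, < and >.
        write("<doc><title>" + title + "</title><description>"
                + description + "</description></doc>\n");
    }

    @Override
    public void close() {
        write("</collection>\n");
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void write(String text) {
        try {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter target = new StringWriter(); // stand-in for a FileWriter
        try (StreamingDocWriter writer = new StreamingDocWriter(target)) {
            writer.writeDoc("first", "written as soon as it is parsed");
            writer.writeDoc("second", "no list of earlier docs is kept");
        }
        System.out.print(target);
    }
}
```

This only fits if each output record can be produced from one input record. If records have to be merged first (as with centroids that aggregate several documents), the aggregation still needs to be held somewhere, but the final String can still be avoided.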

Answer 3

Dump the heap and analyze it. You can configure an automatic heap dump on out-of-memory errors with the -XX:+HeapDumpOnOutOfMemoryError JVM option.

http://www.oracle.com/technetwork/java/javase/index-137495.html

https://www.infoq.com/news/2015/12/OpenJDK-9-removal-of-HPROF-jhat

http://blogs.oracle.com/alanb/entry/heap_dumps_are_back_with
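
If you are not sure the flag was picked up, one way to check from inside the running program is sketched below. It assumes a HotSpot-based JVM, since `com.sun.management.HotSpotDiagnosticMXBean` is HotSpot-specific; the class name `HeapDumpFlagCheck` is made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class HeapDumpFlagCheck {

    /** Returns "true" or "false" depending on whether the JVM flag is enabled. */
    static String heapDumpOnOomSetting() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = " + heapDumpOnOomSetting());
    }
}
```

Started with `java -XX:+HeapDumpOnOutOfMemoryError HeapDumpFlagCheck`, this should report the flag as enabled.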

Answer 4

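The short answer to why you get an OutOfMemoryError: for every centroid found in the file, you loop over the already "registered" centroids to check whether it is already known (so that you either add a new one or update the registered one). But for every failed comparison you add a new copy of the new centroid. So each new centroid is added as many times as there are centroids already in the list; then you reach the first copy you just added, you update it, and you leave the loop...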
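
To make the growth concrete, here is a small simulation of that pattern. It is a hypothetical reconstruction, not the original code (which is only available behind the link): it assumes an index-based loop over a list, a special case for the empty list, and plain strings standing in for centroids.

```java
import java.util.ArrayList;
import java.util.List;

class DuplicateGrowthDemo {

    /** Assumed shape of the buggy "add or update" loop. */
    static void buggyAdd(List<String> registered, String event) {
        if (registered.isEmpty()) {
            registered.add(event);
            return;
        }
        // size() is re-read on every pass, so the loop also visits the copies added below.
        for (int i = 0; i < registered.size(); i++) {
            if (registered.get(i).equals(event)) {
                break; // known event: the real code would update the centroid here
            } else {
                registered.add(event); // bug: one extra copy per failed comparison
            }
        }
    }

    /** Feeds the given number of distinct events through the buggy loop. */
    static int sizeAfter(int distinctEvents) {
        List<String> registered = new ArrayList<>();
        for (int n = 0; n < distinctEvents; n++) {
            buggyAdd(registered, "event-" + n);
        }
        return registered.size();
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 20; n++) {
            System.out.println(n + " distinct events -> " + sizeAfter(n) + " list entries");
        }
    }
}
```

Under these assumptions the list doubles with every new distinct event (1, 2, 4, 8, ...), so even a short input file with a few dozen distinct events would exhaust the heap.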

Here is some refactored code:

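import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.SAXException;

// The Digester import is omitted here because the Digester setup in main is elided.
public class CentroidGenerator {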

    final Map<String, Centroid> centroids = new HashMap<String, Centroid>();

    public Collection<Centroid> getCentroids() {
        return centroids.values();
    }

    public void nextItem(FlickrDoc flickrDoc) {

        final String event = flickrDoc.getEvent();
        final Centroid existingCentroid = centroids.get(event);

        if (existingCentroid != null) {
            existingCentroid.update(flickrDoc);
        } else {
            final Centroid newCentroid = new Centroid(flickrDoc);
            centroids.put(event, newCentroid);
        }
    }


    public static void main(String[] args) throws IOException, SAXException {

        // instantiate Digester and disable XML validation
        [...]


        // now that rules and actions are configured, start the parsing process
        CentroidGenerator abp = (CentroidGenerator) digester.parse(new File("PjrE.data.xml"));

        // try-with-resources closes the writer even if writing fails
        File fileOutput = new File("centroids.xml");
        try (Writer writer = new BufferedWriter(new FileWriter(fileOutput))) {
            writeOutput(writer, abp.getCentroids());
        } catch (IOException e) {
            // also covers FileNotFoundException, which is a subclass of IOException
            e.printStackTrace();
        }

    }

    private static void writeOutput(Writer writer, Collection<Centroid> centroids) throws IOException {

        writer.append("<?xml version='1.0' encoding='utf-8'?>" + System.getProperty("line.separator"));
        writer.append("<collection>").append(System.getProperty("line.separator"));

        for (Centroid centroid : centroids) {
            writer.append("<doc>" + System.getProperty("line.separator"));

            writer.append("<title>" + System.getProperty("line.separator"));
            writer.append(centroid.getTitle());
            writer.append("</title>" + System.getProperty("line.separator"));

            writer.append("<description>" + System.getProperty("line.separator"));
            writer.append(centroid.getDescription());
            writer.append("</description>" + System.getProperty("line.separator"));

            writer.append("<time>" + System.getProperty("line.separator"));
            writer.append(centroid.getTime());
            writer.append("</time>" + System.getProperty("line.separator"));

            writer.append("<tags>" + System.getProperty("line.separator"));
            writer.append(centroid.getTags());
            writer.append("</tags>" + System.getProperty("line.separator"));

            writer.append("<geo>" + System.getProperty("line.separator"));

            writer.append("<lat>" + System.getProperty("line.separator"));
            writer.append(centroid.getLat());
            writer.append("</lat>" + System.getProperty("line.separator"));

            writer.append("<lng>" + System.getProperty("line.separator"));
            writer.append(centroid.getLng());
            writer.append("</lng>" + System.getProperty("line.separator"));

            writer.append("</geo>" + System.getProperty("line.separator"));

            writer.append("</doc>" + System.getProperty("line.separator"));

        }
        writer.append("</collection>" + System.getProperty("line.separator") + System.getProperty("line.separator"));

    }

    /**
     * JavaBean class that holds properties of each Document entry. It is important that this class be public and
     * static, in order for Digester to be able to instantiate it.
     */
    public static class FlickrDoc {
        private String id;
        private String title;
        private String description;
        private String time;
        private String tags;
        private String latitude;
        private String longitude;
        private String event;

        public void setId(String newId) {
            id = newId;
        }

        public String getId() {
            return id;
        }

        public void setTitle(String newTitle) {
            title = newTitle;
        }

        public String getTitle() {
            return title;
        }

        public void setDescription(String newDescription) {
            description = newDescription;
        }

        public String getDescription() {
            return description;
        }

        public void setTime(String newTime) {
            time = newTime;
        }

        public String getTime() {
            return time;
        }

        public void setTags(String newTags) {
            tags = newTags;
        }

        public String getTags() {
            return tags;
        }

        public void setLatitude(String newLatitude) {
            latitude = newLatitude;
        }

        public String getLatitude() {
            return latitude;
        }

        public void setLongitude(String newLongitude) {
            longitude = newLongitude;
        }

        public String getLongitude() {
            return longitude;
        }

        public void setEvent(String newEvent) {
            event = newEvent;
        }

        public String getEvent() {
            return event;
        }
    }

    public static class Centroid {
        private final String event;
        private String title;
        private String description;

        private String tags;

        private Integer time;
        private int nbTimeValues = 0; // needed to calculate the average later

        private Float latitude;
        private int nbLatitudeValues = 0; // needed to calculate the average later
        private Float longitude;
        private int nbLongitudeValues = 0; // needed to calculate the average later

        public Centroid(FlickrDoc flickrDoc) {
            event = flickrDoc.event;
            title = flickrDoc.title;
            description = flickrDoc.description;
            tags = flickrDoc.tags;
            if (flickrDoc.time != null) {
                time = Integer.valueOf(flickrDoc.time.trim());
                nbTimeValues = 1; // time is the sum of one value
            }            
            if (flickrDoc.latitude != null) {
                latitude = Float.valueOf(flickrDoc.latitude.trim());
                nbLatitudeValues = 1; // latitude is the sum of one value
            }
            if (flickrDoc.longitude != null) {
                longitude = Float.valueOf(flickrDoc.longitude.trim());
                nbLongitudeValues = 1; // longitude is the sum of one value
            }
        }

        public void update(FlickrDoc newData) {
            title = title + " " + newData.title;
            description = description + " " + newData.description;
            tags = tags + " " + newData.tags;
            if (newData.time != null) {
                nbTimeValues++;
                if (time == null) {
                    time = 0;
                }
                time += Integer.valueOf(newData.time.trim());
            }
            if (newData.latitude != null) {
                nbLatitudeValues++;
                if (latitude == null) {
                    latitude = 0F;
                }
                latitude += Float.valueOf(newData.latitude.trim());
            }
            if (newData.longitude != null) {
                nbLongitudeValues++;
                if (longitude == null) {
                    longitude = 0F;
                }
                longitude += Float.valueOf(newData.longitude.trim());
            }
        }

        public String getTitle() {
            return title;
        }

        public String getDescription() {
            return description;
        }

        public String getTime() {
            if (nbTimeValues == 0) {
                return null;
            } else {
                return Integer.toString(time / nbTimeValues);
            }
        }

        public String getTags() {
            return tags;
        }

        public String getLat() {
            if (nbLatitudeValues == 0) {
                return null;
            } else {
                return Float.toString(latitude / nbLatitudeValues);
            }
        }

        public String getLng() {
            if (nbLongitudeValues == 0) {
                return null;
            } else {
                return Float.toString(longitude / nbLongitudeValues);
            }
        }

        public String getEvent() {
            return event;
        }
    }
}

Answer 5

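Answering the "how to debug" part of the question

Start by gathering the information that's missing from your post. Information that might help future people who run into the same problem.

First, the complete stack trace. An out-of-memory exception thrown from within the XML parser is very different from one thrown from your own code.

Second, the size of the XML file, because "not long at all" is completely useless. Is it 1K, 1M, or 1G? How many elements?

Third, how are you parsing? SAX, DOM, StAX, something completely different?

Fourth, how are you using the data? Are you processing one file or multiple files? Are you accidentally holding onto data after parsing? A code sample would help here (and a link to some third-party site isn't particularly useful for future SO users).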
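
For the "how many elements" question, a streaming count is cheap to get. The sketch below uses the JDK's StAX API; the class name `ElementCounter` is made up, and the `doc` element name is taken from the XML structure shown elsewhere in this thread.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementCounter {

    /** Counts start tags with the given local name without building a document tree. */
    static int countElements(Reader xml, String elementName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            try {
                int count = 0;
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(reader.getLocalName())) {
                        count++;
                    }
                }
                return count;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<collection>"
                + "<doc><title>a</title></doc>"
                + "<doc><title>b</title></doc>"
                + "</collection>";
        System.out.println(countElements(new StringReader(sample), "doc")); // prints 2
    }
}
```

For a real file, pass a `java.io.FileReader` (or an `InputStream`-based reader with the right encoding) instead of the `StringReader`.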

Answer 6

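OK, I'll admit I'm sidestepping your direct question by offering a possible alternative. You might want to consider parsing with XStream instead, and let it handle the bulk of the work with less code. My rough example below parses your XML with a 64MB heap. Note that it also requires Apache Commons IO, just to read the input easily so that the hack of turning <collection> into <list> is possible.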

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.commons.io.FileUtils;

import com.thoughtworks.xstream.XStream;
import com.thoughtworks.xstream.annotations.XStreamAlias;

public class CentroidGenerator {
    public static void main(String[] args) throws IOException {
        for (Centroid centroid : getCentroids(new File("PjrE.data.xml"))) {
            System.out.println(centroid.title + " - " + centroid.description);
        }
    }

    @SuppressWarnings("unchecked")
    public static List<Centroid> getCentroids(File file) throws IOException {
        String input = FileUtils.readFileToString(file, "UTF-8");
        input = input.replaceAll("collection>", "list>");

        XStream xstream = new XStream();
        xstream.processAnnotations(Centroid.class);

        Object output = xstream.fromXML(input);
        return (List<Centroid>) output;
    }

    @XStreamAlias("doc")
    @SuppressWarnings("unused")
    public static class Centroid {
        private String id;
        private String title;
        private String description;
        private String time;
        private String tags;
        private String latitude;
        private String longitude;
        private String event;
        private String geo;
    }
}

Answer 7

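I downloaded your code, which is something I almost never do. And I can say with 99% certainty that the bug is in your code: an incorrect "if" inside a loop. It has nothing whatsoever to do with Digester or XML. Either you've made a logic error, or you didn't fully think through how many objects you would create.

But guess what: I'm not going to tell you what your bug is.

If you can't figure it out from the few hints I've given above, well, too bad. It's the same situation you put all the other respondents in by not providing enough information (in the original post) to actually start debugging.

Perhaps you should read, actually read, my previous post, and update your question with the information it asks for. Or, if you can't be bothered to do that, accept your F.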

OutOfMemoryError exception: Java heap space, how to debug...?