Question

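I have an efficiency problem. I am developing an enterprise application that is deployed as an EAR archive on a JBoss EAP 6.1 server. In a while loop I create new objects from entities and write them to a file. I fetch the entities in limited batches through an EJB DAO (for example, 2000 per step). The problem is that I need to process millions of objects: the first million goes quite smoothly, but the further the loop gets, the slower it runs. Can anyone tell me why this loop gets slower and slower as it progresses? How can I make it run smoothly for a long time? Here are the key parts of the code: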

    public void createFullIndex(int stepSize) {
       int logsNumber = systemLogDao.getSystemLogsNumber();
       int counter = 0;
       while (counter < logsNumber) {
           for (SystemLogEntity systemLogEntity : systemLogDao.getLimitedSystemLogs(counter, stepSize)) {
               addDocument(systemLogEntity);
           }
           counter = counter + stepSize;
       }
       commitIndex();
    }

    public void addDocument(SystemLogEntity systemLogEntity) {
       try {
        Document document = new Document();
        document.add(new NumericField("id", Field.Store.YES, true).setIntValue(systemLogEntity.getId()));
        document.add(new Field("resource", (systemLogEntity.getResource() == null ? "" : systemLogEntity
                .getResource().getResourceCode()), Field.Store.YES, Field.Index.ANALYZED));
        document.add(new Field("operationType", (systemLogEntity.getOperationType() == null ? "" : systemLogEntity
        document.add(new Field("comment",
                (systemLogEntity.getComment() == null ? "" : systemLogEntity.getComment()), Field.Store.YES,
                Field.Index.ANALYZED));
        indexWriter.addDocument(document);
       } catch (CorruptIndexException e) {
           LOGGER.error("Failed to add the following log to Lucene index:\n" + systemLogEntity.toString(), e);
       } catch (IOException e) {
           LOGGER.error("Failed to add the following log to Lucene index:\n" + systemLogEntity.toString(), e);
       }
    }

Thanks for your help!

Answer 1

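As far as I can see, you are not writing your data to the file as you go. Instead, you build the complete DOM object and only then flush it to the file. That strategy works for a limited number of objects. When you need to process millions of them (as you said), you should not use DOM. Instead, you should create XML fragments as you receive the data and write them to the file straight away. This will reduce your memory consumption and hopefully improve performance.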
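The question does not show any XML code, so the following is only a minimal sketch of the streaming idea described above, assuming the output really is XML. It uses the JDK's StAX `XMLStreamWriter`; the class name `StreamingXmlExport`, the element names `logs` and `log`, and the sample strings are hypothetical.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StreamingXmlExport {

    // Writes one <log> element per entry straight to the underlying writer,
    // so no complete document tree is ever built in memory.
    static void writeLogs(Iterable<String> comments, Writer out) throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("logs");      // hypothetical root element
        int id = 0;
        for (String comment : comments) {
            xml.writeStartElement("log");   // hypothetical per-entry element
            xml.writeAttribute("id", Integer.toString(id++));
            xml.writeCharacters(comment);   // special characters are escaped by the writer
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.close();
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        writeLogs(Arrays.asList("first", "a < b"), out);
        System.out.println(out);
    }
}
```

In a real export the `Writer` would wrap a file stream rather than a `StringWriter`, and the entries would come from the paged DAO calls instead of a fixed list.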

Answer 2

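I would try reusing the Document object. I have had garbage collection problems in loops where the loop ran too fast for the GC to reasonably keep up, and reusing objects solved all of them. I have not personally tried reusing the Document object, but if it is possible, it might work for you.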

Answer 3

Logging should be easy. Appending to a text file with Guava looks like this:

    import com.google.common.base.Charsets;
    import com.google.common.io.Files;
    import java.io.File;

    File to = new File("C:/Logs/log.txt");
    CharSequence from = "Your data as string\n";
    Files.append(from, to, Charsets.UTF_8);

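A few notes:

I am not sure whether your log entities are being garbage collected
It is not clear whether the file content is kept in memory
If the log is in XML format, adding a new element may require parsing the whole XML DOM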
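On the second note, here is a minimal sketch, assuming plain-text output, of keeping one writer open for the whole run so that each batch is streamed out rather than accumulated in memory. It uses only the JDK (`java.nio.file.Files`), not Guava; the class name `BatchedFileAppend`, the method `appendBatches`, and the sample lines are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

class BatchedFileAppend {

    // Appends every batch through one open writer; only the writer's buffer
    // and the batch currently being processed are held in memory.
    static long appendBatches(Path target, Iterable<List<String>> batches) throws IOException {
        long written = 0;
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (List<String> batch : batches) {
                for (String line : batch) {
                    out.write(line);
                    out.newLine();
                    written++;
                }
                out.flush(); // hand the finished batch to the operating system
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("log", ".txt");
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("entry 1", "entry 2"),
                Arrays.asList("entry 3"));
        System.out.println(appendBatches(target, batches) + " lines appended to " + target);
    }
}
```

Opening the file once avoids reopening it for every line, which a per-call append helper would otherwise do.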

Lucene: creating documents in a while loop gets slower and slower
