How can I avoid repeating code after a while loop when processing data in chunks?

Date: 2017-08-17 13:23:16

Tags: java

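I am reading a file, collecting the processed lines, and writing them out in bulk (for example, to a file or a database) after each chunk has been collected.

When the loop terminates (that is, the file has been read completely), I have to call the writer once more. Otherwise I would miss the last chunk.

Question: can I improve the code somehow so that the extra write() call after the loop is not repeated?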

List<String> collect = new ArrayList<>();

String line;
while ((line = reader.read()) != null) {
    String processed = processline(line);
    collect.add(processed);

    //write each x chunks to file
    if (collect.size() % 1000 == 0) {
        writer.write(collect);
        collect = new ArrayList<>();
    }
}

//can I prevent repetition here?
if (!collect.isEmpty()) {
    writer.write(collect);
}

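5 answers:

Answer 0 (score: 5)

Encapsulate the buffering logic in a separate class (because that is what you are doing: buffering). You will still always have to write in two situations, though: when the buffer gets too big, and when you have finished reading.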

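import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class BufferingWriter implements Closeable {
    private static final int CHUNK_SIZE = 1000;

    private final List<String> buffer = new ArrayList<>(CHUNK_SIZE);
    private final MyWriter writer;

    public BufferingWriter(MyWriter writer) {
        this.writer = writer;
    }

    public void write(String line) {
        buffer.add(line);
        if (buffer.size() >= CHUNK_SIZE) {
            flush();
        }
    }

    public void flush() {
        // Skip the write when nothing is buffered (e.g. empty input)
        if (!buffer.isEmpty()) {
            writer.write(buffer);
            buffer.clear();
        }
    }

    @Override
    public void close() throws IOException {
        flush();
        // TBD: Pass the close call onto MyWriter if that is possible
        // or otherwise flag this writer as closed
    }
}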

// Usage: the buffer is flushed automatically when the try block exits
try (BufferingWriter bwriter = new BufferingWriter(writer)) {
    String line;
    while ((line = reader.read()) != null) {
        String processed = processline(line);
        bwriter.write(processed);
    }
}
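
The answer's MyWriter type is not shown, so the class above cannot be run as it stands. As a minimal self-contained sketch of the same buffering idea, the version below swaps the writer for a java.util.function.Consumer. The names ChunkBuffer, ChunkBufferDemo and chunkSizes are hypothetical and only serve the illustration; they are not part of the original answer.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: the Consumer stands in for the question's writer.
final class ChunkBuffer implements Closeable {
    private final int chunkSize;
    private final Consumer<List<String>> sink;
    private List<String> buffer = new ArrayList<>();

    ChunkBuffer(int chunkSize, Consumer<List<String>> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.sink = sink;
    }

    void add(String item) {
        buffer.add(item);
        if (buffer.size() >= chunkSize) {
            flush();
        }
    }

    void flush() {
        if (buffer.isEmpty()) {
            return; // nothing to write, e.g. empty input
        }
        // Hand over the current list and start a fresh one, so the sink
        // may keep a reference to the chunk it received.
        List<String> chunk = buffer;
        buffer = new ArrayList<>();
        sink.accept(chunk);
    }

    @Override
    public void close() {
        flush(); // writes the last, partial chunk
    }
}

class ChunkBufferDemo {
    // Feeds `count` items through a buffer and returns the size of each chunk written.
    static List<Integer> chunkSizes(int count, int chunkSize) {
        List<Integer> sizes = new ArrayList<>();
        try (ChunkBuffer buf = new ChunkBuffer(chunkSize, chunk -> sizes.add(chunk.size()))) {
            for (int i = 0; i < count; i++) {
                buf.add("line " + i);
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(2500, 1000)); // prints [1000, 1000, 500]
    }
}
```

The reading loop itself never mentions chunks here: the final flush happens in close(), which try-with-resources calls automatically, so the second write call disappears from the calling code rather than from the program as a whole.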

Answer 1 (score: 1)

Here is my suggestion:

List<String> collect = new ArrayList<>();

String line = reader.read();

while (line != null) {
    // Read one line ahead so we know whether the current line is the last one
    String nextLine = reader.read();

    String processed = processline(line);
    collect.add(processed);

    //write each x chunks to file, and the final partial chunk at end of input
    if (collect.size() % 1000 == 0 || nextLine == null) {
        writer.write(collect);
        collect = new ArrayList<>();
    }

    line = nextLine;
}
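
The same read-ahead idea can be tried in isolation when the input is available as a java.util.Iterator, because hasNext() already answers the question "is there another item?". The sketch below makes that assumption; the class name LookaheadChunks and the method chunk are hypothetical, and the returned list of chunks merely stands in for the calls to writer.write(collect).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class LookaheadChunks {
    // Splits the items into chunks of at most chunkSize elements.
    // A chunk ends when it is full or when no more items remain.
    static <T> List<List<T>> chunk(Iterator<T> items, int chunkSize) {
        List<List<T>> written = new ArrayList<>(); // stands in for writer.write(...)
        List<T> collect = new ArrayList<>();
        while (items.hasNext()) {
            collect.add(items.next());
            // hasNext() plays the role of the nextLine == null check
            if (collect.size() == chunkSize || !items.hasNext()) {
                written.add(collect);
                collect = new ArrayList<>();
            }
        }
        return written;
    }
}
```

For example, chunking the five integers 1 to 5 with a chunk size of 2 yields the chunks [1, 2], [3, 4] and [5], and an empty iterator yields no chunks at all.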

Answer 2 (score: 0)

How about a do-while loop?

List<String> collect = new ArrayList<>();

String line;
do {
   line = reader.read();
   if(line != null) {
      String processed = processline(line);
      collect.add(processed);
   }

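   // Write a full chunk, or the remaining partial chunk at end of file;
   // never write an empty list (empty input, or a line count divisible by 1000)
   if (!collect.isEmpty()
       && (collect.size() % 1000 == 0 || line == null)) {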
      writer.write(collect);
      collect = new ArrayList<>();
   } 

} while(line != null);

Answer 3 (score: 0)

Maybe something like this? I'm not sure it is an improvement...

List<String> collect = new ArrayList<>();

for (;;) {
    String line = reader.read();
    boolean done = line == null;
    if (!done) {
        String processed = processline(line);
        collect.add(processed);
    }

    //write each x chunks to file
    if ((collect.size() % 1000 == 0) || done) {
        if (!collect.isEmpty()) {
            writer.write(collect);
        }
        if (done) {
            break;
        }
        collect = new ArrayList<>();
    }
}

Answer 4 (score: 0)

You could do it like this, but you would still need an extra line to catch the last chunk.

List<String> collect = new ArrayList<>();

String line;
do {
    line = reader.read();
    if (line != null) {
        String processed = processline(line);
        collect.add(processed);

        //write each x chunks to file
        if (collect.size() % 1000 == 0) {
            writer.write(collect);
            collect = new ArrayList<>();
        }
    } else if (!collect.isEmpty()) {
        // End of input: write the last, partial chunk
        writer.write(collect);
    }
} while (line != null);