How to read a file block by block

Time: 2018-09-25 13:58:33

Tags: java io iterator bufferedreader

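I need to process a fairly large text file. Because there are always several lines whose information depends on each other, I would like to read the file block by block, rather than only storing certain features from a few lines above.

The first line of each block is marked with a unique symbol.

Is it possible to use some kind of iterator and then check each line for my symbol? I don't really have a good idea of how to approach this, so any help is much appreciated.

Example: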

a1    $    12    20    namea1
b1    x    12    15    namea1,nameb1
c1    x    13    17    namea1,namec1
d1    x    18    20    namea1,named1
a2    $    36    55    namea2
b2    x    38    40    namea2,nameb2
c2    x    46    54    namea2,namec2

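As you can see, every line after a line with the $ symbol refers to that line in some way: the numbers fall within the range given on the a1 line, and the names are always combined with its name. I think it would be better to read such a file block by block rather than line by line.

1 Answer:

Answer 0: (score: 0)

I'm not quite sure what you mean by "block by block", but even so, your text file structure looks well suited to line-by-line analysis. Given that structure, you could simply parse it in a basic while loop. Pseudocode: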

While not end of file
    Read line into a String
    Split this String on whitespace, "\\s+", into a String array
    Check the String held by the 2nd item in the String array, item[1]
    Do action with line (create a certain object) based on this String
End while
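The loop above can be sketched in runnable Java roughly as follows. This is only a minimal illustration, assuming the layout from the question where the symbol is the second column; the class name LineLoopSketch and the helper countBySymbol are hypothetical, and counting lines per symbol merely stands in for "do action with line".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

public class LineLoopSketch {

    // Hypothetical helper: counts how many lines carry each symbol.
    // Assumption: the symbol is the 2nd whitespace-separated item, items[1].
    public static Map<String, Integer> countBySymbol(Scanner scanner) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        while (scanner.hasNextLine()) {                   // While not end of file
            String line = scanner.nextLine();             // Read line into a String
            String[] items = line.trim().split("\\s+");   // Split on whitespace
            if (items.length < 2) {
                continue;                                 // Skip blank or malformed lines
            }
            counts.merge(items[1], 1, Integer::sum);      // Placeholder action per symbol
        }
        return counts;
    }

    public static void main(String[] args) {
        // Inline sample data instead of a real file, so the sketch is self-contained.
        String sample = String.join("\n",
                "a1    $    12    20    namea1",
                "b1    x    12    15    namea1,nameb1",
                "c1    x    13    17    namea1,namec1",
                "a2    $    36    55    namea2");
        try (Scanner scanner = new Scanner(sample)) {
            System.out.println(countBySymbol(scanner));
        }
    }
}
```

For a real file, the Scanner could be constructed from a File or an InputStream instead of a String.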

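Now, if one of the symbols represents some kind of header, and that is what you mean by "block by block", then all you need to do is change your parsing strategy to handle your objects in a state-dependent way, similar to SAX parsing. So, for example, if "$" marks a new "block", then create a new block and, inside the while loop, create the objects that go into that block until a new block is encountered.

So, assuming the text file looks like this: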

$    12    20    namea1
x    12    15    namea1,nameb1
x    13    17    namea1,namec1
x    18    20    namea1,named1
$    36    55    namea2
x    38    40    namea2,nameb2
x    46    54    namea2,namec2

I'm assuming that the first token you show is not actually in the file.

And assuming that you have a class called Line, like this:

public class Line {
    private int x;
    private int y;
    private List<String> names  = new ArrayList<>();

    public Line(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public void addName(String name) {
        names.add(name);
    }

    @Override
    public String toString() {
        return "Line [x=" + x + ", y=" + y + ", names=" + names + "]";
    }

}

And a Block class, ...

public class Block {
    private String name;
    private int x;
    private int y;
    private List<Line> lines = new ArrayList<>();

    public Block(String name, int x, int y) {
        this.name = name;
        this.x = x;
        this.y = y;
    }

    public void addLine(Line line) {
        lines.add(line);
    }

    @Override
    public String toString() {
        return "Block [name=" + name + ", x=" + x + ", y=" + y + ", lines=" + lines + "]";
    }

}

You could parse it like this:

Scanner blockScanner = new Scanner(resource);

Block currentBlock = null;
while (blockScanner.hasNextLine()) {
    String line = blockScanner.nextLine();
    String[] tokens = line.split("\\s+");

    // NEW_BLOCK == "$"
    if (tokens[0].equals(NEW_BLOCK)) {
        currentBlock = createBlockFromTokens(tokens);
        blocks.add(currentBlock);
    } else if (currentBlock != null) {
        currentBlock.addLine(createLineFromTokens(tokens));
    }
}

where createXxxxFromTokens(tokens) creates a new Line or a new Block from the String array.

For example, the whole thing as an MCVE:

import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ReadBlocks {
    private static final String RESOURCE_PATH = "blocks.txt";
    private static final String NEW_BLOCK = "$";

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();

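        InputStream resource = ReadBlocks.class.getResourceAsStream(RESOURCE_PATH);
        if (resource == null) {
            // getResourceAsStream returns null when the resource is missing
            throw new IllegalStateException("Resource not found: " + RESOURCE_PATH);
        }

        // try-with-resources closes the Scanner and the underlying stream
        try (Scanner blockScanner = new Scanner(resource)) {
            Block currentBlock = null;
            while (blockScanner.hasNextLine()) {
                String line = blockScanner.nextLine().trim();
                if (line.isEmpty()) {
                    continue; // ignore blank lines
                }
                String[] tokens = line.split("\\s+");
                if (tokens[0].equals(NEW_BLOCK)) {
                    // a "$" line starts a new block
                    currentBlock = createBlockFromTokens(tokens);
                    blocks.add(currentBlock);
                } else if (currentBlock != null) {
                    // any other line belongs to the current block
                    currentBlock.addLine(createLineFromTokens(tokens));
                }
            }
        }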

        for (Block block : blocks) {
            System.out.println(block);
        }
    }

    private static Line createLineFromTokens(String[] tokens) {
        if (tokens.length < 4) {
            throw new IllegalArgumentException("Expected at least 4 tokens for a line, got " + tokens.length);
        }
        int x = Integer.parseInt(tokens[1]);
        int y = Integer.parseInt(tokens[2]);

        Line line = new Line(x, y);
        String[] names = tokens[3].split(",");
        for (String name : names) {
            line.addName(name);
        }
        return line;
    }

    private static Block createBlockFromTokens(String[] tokens) {
        if (tokens.length < 4) {
            throw new IllegalArgumentException("Expected at least 4 tokens for a block, got " + tokens.length);
        }
        int x = Integer.parseInt(tokens[1]);
        int y = Integer.parseInt(tokens[2]);
        String name = tokens[3];
        return new Block(name, x, y);
    }
}

class Block {
    private String name;
    private int x;
    private int y;
    private List<Line> lines = new ArrayList<>();

    public Block(String name, int x, int y) {
        this.name = name;
        this.x = x;
        this.y = y;
    }

    public void addLine(Line line) {
        lines.add(line);
    }

    @Override
    public String toString() {
        return "Block [name=" + name + ", x=" + x + ", y=" + y + ", lines=" + lines + "]";
    }

}

class Line {
    private int x;
    private int y;
    private List<String> names = new ArrayList<>();

    public Line(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public void addName(String name) {
        names.add(name);
    }

    @Override
    public String toString() {
        return "Line [x=" + x + ", y=" + y + ", names=" + names + "]";
    }

}
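
Since the question asks specifically about an iterator, the same state-dependent idea could also be wrapped in an Iterator that hands back one block at a time. The following is only a sketch under stated assumptions: the class name BlockIterator is hypothetical, each block is returned as a plain List of its raw lines, and the "$" symbol is taken to be the second column, as in the question's original sample (not the simplified file assumed above).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class BlockIterator implements Iterator<List<String>> {
    private static final String NEW_BLOCK = "$";

    private final BufferedReader reader;
    private String pending; // header line of the next block, already read but not yet returned

    public BlockIterator(BufferedReader reader) {
        this.reader = reader;
        // Skip anything that precedes the first header line.
        pending = readLine();
        while (pending != null && !isHeader(pending)) {
            pending = readLine();
        }
    }

    // Wraps the checked IOException so the Iterator methods stay unchecked.
    private String readLine() {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumption: the symbol is the second whitespace-separated column.
    private static boolean isHeader(String line) {
        String[] tokens = line.trim().split("\\s+");
        return tokens.length > 1 && tokens[1].equals(NEW_BLOCK);
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    @Override
    public List<String> next() {
        if (pending == null) {
            throw new NoSuchElementException();
        }
        List<String> block = new ArrayList<>();
        block.add(pending); // the header is always the first element
        String line;
        while ((line = readLine()) != null && !isHeader(line)) {
            if (!line.trim().isEmpty()) {
                block.add(line); // keep non-blank lines that belong to this block
            }
        }
        pending = line; // either the next header, or null at end of input
        return block;
    }

    public static void main(String[] args) throws IOException {
        // Inline sample data instead of a real file, so the sketch is self-contained.
        String sample = String.join("\n",
                "a1    $    12    20    namea1",
                "b1    x    12    15    namea1,nameb1",
                "c1    x    13    17    namea1,namec1",
                "a2    $    36    55    namea2",
                "b2    x    38    40    namea2,nameb2");
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            Iterator<List<String>> blocks = new BlockIterator(reader);
            while (blocks.hasNext()) {
                List<String> block = blocks.next();
                System.out.println(block.size() + " line(s), header: " + block.get(0));
            }
        }
    }
}
```

Only one block is held in memory at a time, which may matter for a large file; each returned list could then be turned into a Block with the createXxxxFromTokens helpers, adjusted for the extra leading column.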