Question

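I want to write a program that reads a file (a .txt file) and stores its lines in an array (I have already done this part). Then I want a two-dimensional array that holds only the words of each line.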

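For example, if the file contains two lines with two words each, I want array[0][0] to hold the first word of the first line, array[0][1] the second word of the first line, and so on.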

I have the following code:

for (int i=0; i < aryLines.length; i++) {
    String[] channels = aryLines[i].split(" ");

    System.out.println("line " + (i+1) + ": ");

    for (int j=0; j < channels.length; j++){
        System.out.println("word " + (j+1) + ": ");
        System.out.println(channels[j]);
    }

    System.out.println();
}

where aryLines contains all the lines, but I haven't found a way to do what I described.

Answer 1

Let this be your 1-D array:

String[] lines = new String[10];  // stands in for your aryLines; fill it with the file's lines before splitting

First, you need to declare an array of arrays:

String[][] words = new String[lines.length][];

Then iterate over it and, for each line, split the line and assign the result to the inner array:

for (int i = 0; i < words.length; i++) {
    words[i] = lines[i].split("\\s+");
}
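Put together, the steps above can be sketched as a small self-contained program. The class name LineWords and the method name splitLines are made up for illustration, and trimming each line and treating blank lines as empty arrays are assumptions added here, since split("\\s+") would otherwise return an empty first element for a line with leading whitespace:

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the original answer.
class LineWords {
    // Splits each line on runs of whitespace; result[i][j] is word j of line i.
    static String[][] splitLines(String[] lines) {
        String[][] words = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            // Assumption: leading/trailing whitespace should not produce empty words.
            String line = lines[i].trim();
            // "".split("\\s+") returns one empty string, so handle blank lines separately.
            words[i] = line.isEmpty() ? new String[0] : line.split("\\s+");
        }
        return words;
    }

    public static void main(String[] args) {
        String[] lines = { "hello world", "  foo   bar baz " };
        String[][] words = splitLines(lines);
        System.out.println(words[0][0]);                // hello
        System.out.println(words[0][1]);                // world
        System.out.println(Arrays.deepToString(words)); // [[hello, world], [foo, bar, baz]]
    }
}
```

In your program you would pass aryLines to splitLines instead of the sample array.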

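Now, the problem is that not all of your words will be separated by spaces. There is also a lot of punctuation to take into account. I'll leave it to you to split on all the punctuation.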

For example:

"This line: - has word separated by, : and -"

Now, you need to find all the punctuation marks used in your sentences.
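To make the problem concrete, here is a minimal sketch of what whitespace-only splitting does to the sample line. The class name PunctuationDemo and the method name splitOnWhitespace are invented for this illustration:

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class PunctuationDemo {
    // Splits on runs of whitespace only, leaving punctuation untouched.
    static String[] splitOnWhitespace(String line) {
        return line.split("\\s+");
    }

    public static void main(String[] args) {
        String str = "This line: - has word separated by, : and -";
        // Punctuation stays glued to words ("line:", "by,") or becomes a "word" of its own ("-", ":").
        System.out.println(Arrays.toString(splitOnWhitespace(str)));
        // [This, line:, -, has, word, separated, by,, :, and, -]
    }
}
```

The array ends up with ten entries, three of which are pure punctuation, which is why splitting on whitespace alone is not enough here.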

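One thing you can do, if you are not sure which punctuation marks appear in your lines, is use a regex that matches the pattern of a word, and add each matched word to an ArrayList.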

"\\w+"  // this regex matches one or more word characters (letters, digits, underscore)

Let's see it work on the example above:

    String str = "This line: - has word separated by, : and -";
    List<String> words = new ArrayList<String>();

    Matcher matcher = Pattern.compile("\\w+").matcher(str);

    while (matcher.find()) {
        words.add(matcher.group());
    }

    System.out.println(words);

Output:

[This, line, has, word, separated, by, and]

You can use this approach inside the loop I posted above.
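One way this combination might look is sketched below. The class name RegexLineWords and the method name extractWords are hypothetical; the sketch simply runs the "\\w+" matcher once per line and converts each list of matches into an inner array:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class RegexLineWords {
    // Compiled once and reused for every line.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // result[i][j] is the j-th run of word characters found in line i.
    static String[][] extractWords(String[] lines) {
        String[][] words = new String[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            List<String> found = new ArrayList<>();
            Matcher matcher = WORD.matcher(lines[i]);
            while (matcher.find()) {
                found.add(matcher.group());
            }
            words[i] = found.toArray(new String[0]);
        }
        return words;
    }

    public static void main(String[] args) {
        String[] lines = { "This line: - has word separated by, : and -", "second, line!" };
        System.out.println(Arrays.deepToString(extractWords(lines)));
        // [[This, line, has, word, separated, by, and], [second, line]]
    }
}
```

Note that by default \w covers only ASCII letters, digits, and underscore, so a word like "don't" comes out as two entries ("don" and "t"); if your files contain such words you may need a different pattern.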
