Counting lines and tokenizing

Time: 2012-06-23 09:40:23

Tags: java count tokenize

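I want to count the number of lines in a file and split each line into tokens. I can't seem to get my code to work. Can anyone offer some advice? Thanks in advance.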

import java.util.*;
import java.io.*;

public class kup
{
    public static void main(String args[]) throws Exception
    {
        FileReader fileInput = new FileReader("C:\\save\\input.txt");
        BufferedReader readInput = new BufferedReader(fileInput);

        FileWriter fileOutput = new FileWriter("C:\\save\\output.txt");
        PrintWriter outFile = new PrintWriter(fileOutput);

        Scanner scanLine = new Scanner(readInput);
        String textInput = scanLine.nextLine();
        StringTokenizer stringtokenizer = new StringTokenizer(textInput);

        int tokenCount = stringtokenizer.countTokens();
        int lineCount = 0;

        while(scanLine.hasNextLine())
        {
            while(stringtokenizer.hasMoreTokens())
            {
                String string = stringtokenizer.nextToken();
                outFile.println(string);
            }
                lineCount++;
        }

        outFile.println("Number of words: " +tokenCount);
        outFile.println("Number of lines: " +lineCount);

        readInput.close();
        outFile.close();
    }

}

1 answer:

Answer 0 (score: 5)

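You're only initializing the tokenizer with the first line. The outer loop also never calls nextLine() again, so hasNextLine() keeps returning the same result and no further lines are read. I suspect you want something like: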

int tokenCount = 0;
int lineCount = 0;

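while (scanLine.hasNextLine())
{
    // Read the next line inside the loop so the scanner actually advances.
    String line = scanLine.nextLine();
    lineCount++;

    // Build a fresh tokenizer for every line, not just the first one.
    StringTokenizer tokenizer = new StringTokenizer(line);

    while (tokenizer.hasMoreTokens())
    {
        String string = tokenizer.nextToken();
        outFile.println(string);
        tokenCount++;
    }
}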

Personally, I'd probably stick with the BufferedReader for reading line by line; you're not really using anything specific to Scanner. So:

String line;
// readLine() returns null once the end of the file is reached.
while ((line = readInput.readLine()) != null)
{
    lineCount++;
    StringTokenizer tokenizer = new StringTokenizer(line);

    while (tokenizer.hasMoreTokens())
    {
        String string = tokenizer.nextToken();
        outFile.println(string);
        tokenCount++;
    }
}

Note that you should be closing your readers/writers/streams in finally blocks, or, if you're using Java 7, using try-with-resources statements.
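
To make the last point concrete, here is a minimal sketch of the BufferedReader version wrapped in a try-with-resources statement (Java 7 or later). The class name LineTokenCounter and the helper method count are hypothetical names chosen for illustration, and UTF-8 is an assumed encoding; the file paths and the output messages are taken from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.StringTokenizer;

public class LineTokenCounter
{
    // Hypothetical helper: writes each token on its own line of the output
    // file, appends the two totals, and returns { lineCount, tokenCount }.
    public static int[] count(Path input, Path output) throws IOException
    {
        int lineCount = 0;
        int tokenCount = 0;

        // Both resources are closed automatically when the block exits,
        // even if an exception is thrown part-way through.
        // UTF-8 is an assumption; the question does not state an encoding.
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             PrintWriter writer = new PrintWriter(
                     Files.newBufferedWriter(output, StandardCharsets.UTF_8)))
        {
            String line;
            while ((line = reader.readLine()) != null)
            {
                lineCount++;

                // Default delimiters: space, tab, newline, carriage return, form feed.
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens())
                {
                    writer.println(tokenizer.nextToken());
                    tokenCount++;
                }
            }

            writer.println("Number of words: " + tokenCount);
            writer.println("Number of lines: " + lineCount);
        }

        return new int[] { lineCount, tokenCount };
    }

    public static void main(String[] args) throws IOException
    {
        // Paths from the question; pass two arguments to override them.
        Path input = Paths.get(args.length > 0 ? args[0] : "C:\\save\\input.txt");
        Path output = Paths.get(args.length > 1 ? args[1] : "C:\\save\\output.txt");

        int[] counts = count(input, output);
        System.out.println("Lines: " + counts[0] + ", words: " + counts[1]);
    }
}
```

Declaring both the reader and the writer in the same try header means neither needs an explicit close() call, and an empty line still counts as a line while contributing no tokens.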