FileInputStream reads only the first word in the file

Asked: 2015-10-11 12:25:39

Tags: java eclipse fileinputstream

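I want to read the words in file.txt token by token, add a part-of-speech tag to each word, and write the result to file2.txt. The content of file.txt is already tokenized. Here is my code.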

public class PoSTagging {
@SuppressWarnings("resource")
public static void PoStagMethod() throws IOException {

FileInputStream fin= new FileInputStream("C:\\Users\\dell\\Desktop\\file.txt");
DataInputStream in = new DataInputStream(fin);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strline=br.readLine();
System.out.println(strline+"first");

try{
POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
POSTaggerME tagger = new POSTaggerME(model);

String input = strline;
@SuppressWarnings("deprecation")
ObjectStream<String> lineStream =new PlainTextByLineStream(new StringReader(input));

perfMon.start();
String line;
while ((line = lineStream.read()) != null) {

    String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
    String[] tags = tagger.tag(whitespaceTokenizerLine);

    POSSample sample = new POSSample(whitespaceTokenizerLine, tags);
    System.out.println(sample.toString()+"second");
    //String t=sample.toString();

    FileOutputStream fout=new FileOutputStream("C:\\Users\\dell\\Desktop\\file2.txt");
    //fout.write(t.getBytes());

    perfMon.incrementCounter();
    fout.close();
}
perfMon.stopAndPrintFinalResult();
}
catch (IOException e) {
    e.printStackTrace();
}
}
}

When I call PoStagMethod() from another class, only the first word of file.txt is written to file2.txt. Why are the other words in the file not read? What is wrong with my code?

1 answer:

Answer 0: (score: 1)

Your code calls br.readLine() only once, so only the first line of file.txt ever reaches the tagger; if the file holds one token per line, that is just the first word. Instead, you can read the file line by line with a BufferedReader, process each line however you need, and write the output to file2.txt with a BufferedWriter. The snippet below may help:

    // Uses java.io.* plus the OpenNLP classes already used in the question's code.
    // Load the model and create the tagger once, before reading the file.
    POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
    PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
    POSTaggerME tagger = new POSTaggerME(model);

    // Open the output file once, outside the loop, so earlier output is not overwritten.
    BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("C:\\Users\\dell\\Desktop\\file2.txt"));

    BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\dell\\Desktop\\file.txt"));
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        String[] whitespaceTokenizerLine = WhitespaceTokenizer.INSTANCE.tokenize(line);
        String[] tags = tagger.tag(whitespaceTokenizerLine);

        // Do your work with your tags and tokenized words, then write the result.
        // POSSample.toString(), as used in the question, is only one possible output format.
        bufferedWriter.write(new POSSample(whitespaceTokenizerLine, tags).toString());
        // for adding new-lines in the output file, uncomment the following line:
        //bufferedWriter.newLine();
    }

    // Do not forget to flush() and close() the streams after your job is done:
    bufferedWriter.flush();
    bufferedWriter.close();
    bufferedReader.close();

If you can, it is also a good idea to replace the old-style try-catch clause with try-with-resources, added in Java 1.7, so that the resources are closed automatically.
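As a rough illustration of the try-with-resources suggestion, here is a minimal sketch that uses only the standard library. The class name TagFileSketch and the helper tagLine are hypothetical and not part of OpenNLP: tagLine simply appends a dummy _TAG suffix to each token and stands in for the real tagging step, so the example runs without a model file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class TagFileSketch {

    // Hypothetical stand-in for the OpenNLP tagging step: appends a dummy
    // "_TAG" suffix to every whitespace-separated token of the line.
    static String tagLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (String token : trimmed.split("\\s+")) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(token).append("_TAG");
        }
        return sb.toString();
    }

    // Reads every line of the input, transforms it, and writes it to the output.
    static void tagFile(Path input, Path output) throws IOException {
        // Both resources are declared in the try header, so they are closed
        // automatically (writer first, then reader), even if an exception is thrown.
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(tagLine(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: TagFileSketch <input> <output>");
            return;
        }
        tagFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

With this shape there is no explicit flush() or close() call: closing the BufferedWriter at the end of the try block flushes it. In the real program, the body of tagLine would be replaced by the tokenizer and tagger calls.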

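Also, if you need to write each word and its tag on a separate line, you will probably want an inner loop that writes to the file. It would look like this: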

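        // Inside the while loop, after tags has been computed:
        for (int i = 0; i < whitespaceTokenizerLine.length; i++) {
            // One word/tag pair per line; the "_" separator is only an example.
            bufferedWriter.write(whitespaceTokenizerLine[i] + "_" + tags[i]);
            bufferedWriter.newLine();
        }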

Hope this helps,

Good luck.