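Java txt parser newline problem

Time: 2017-09-01 12:20:18

Tags: java parsing

Good morning. I'm having trouble with a parser that uses the split method. The goal is to read a txt file, extract the "should" statements, and write those statements to a new txt file. I have it working when the text is on one continuous line. If the txt file contains a new line, the output file is rewritten with only the last line. Could it be my loop structure? Also, any suggestions for saving the new file in the directory the original was opened from? Thanks.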

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;

import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

import javax.swing.JFileChooser;
import javax.swing.JOptionPane;

/*This Program Will launch a File Explorer.
User will then chose a .txt file to be parsed.
A new file will be created labeled "Parsed_(Document Name)".*/

public class Parser {

    @SuppressWarnings("resource")
    public static void main(String[] args) {

        JFileChooser chooser = new JFileChooser();
        Scanner userFile = new Scanner(System.in);

        int returnVal = chooser.showOpenDialog(null);
        if (returnVal == JFileChooser.APPROVE_OPTION) {

            try {
                System.out.println("You chose to open this file: " + chooser.getSelectedFile().getName() + "\n");

                File file = new File(chooser.getSelectedFile().getName());
                String newFile = ("Parsed_" + file);

                userFile = new Scanner(file);

                while (userFile.hasNextLine()) {

                    String document = userFile.nextLine();
                    // Line breaks used by Parser
                    String[] sentences = document.split("\\.|\\?|\\!|\\r");

                    List<String> ShouldArray = new ArrayList<String>();

                    for (String shouldStatements : sentences) {

                        if (shouldStatements.contains("Should") || shouldStatements.contains("should"))
                            ShouldArray.add(shouldStatements);

                    }

                    FileWriter writer = new FileWriter(newFile);
                    BufferedWriter bw = new BufferedWriter(writer);

                    for (String shallStatements : ShouldArray) {

                        System.out.println(shallStatements);

                        bw.append(shallStatements);
                        bw.newLine();

                    }

                    System.out.println("\nParsed Document Created: " + newFile);

                    JOptionPane.showMessageDialog(null, "Parsed Document Created: " + newFile);
                    bw.close();

                    writer.close();

                }

                userFile.close();

            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }
}

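Test file 1 (works!)

Hello all. Here is a packing list. You should have a toothbrush. You should have a phone charger. And you definitely should have your wallet!

Test file 1 output:

You should have a toothbrush
You should have a phone charger
And you definitely should have your wallet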

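Test file 2 (only prints the last line)

Hello all. Here is a packing list. You should have a toothbrush. You should have a phone charger.
Here is some random text to show the parser will not include this.
You definitely should have your wallet!

Test file 2 output:

You definitely should have your wallet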

3 answers:

Answer 0: (score: 2)

You need to create the result list outside the loop:

/** Placed here **/
List<String> ShouldArray = new ArrayList<String>();
while (userFile.hasNextLine()) {

    String document = userFile.nextLine();
    // Line breaks used by Parser
    String[] sentences = document.split("\\.|\\?|\\!|\\r");

    /** REMOVED HERE **/

    for (String shouldStatements : sentences) {

        if (shouldStatements.contains("Should") || shouldStatements.contains("should"))
            ShouldArray.add(shouldStatements);

    }
    ......

Otherwise you only collect the results of the last loop iteration.

Basically, this is what your code is doing:

cut up file in lines
take each line
    take next line
     make a result board.
     write results on board
    take next line
     erase board
     write results on board
    take next line
     erase board
     write results on board

So in the end your board holds only a limited result set: the results from the last line.
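
To make the "board" picture concrete, here is a minimal sketch in which the list is created once, before the loop, so matches from every line are kept. The class name `ShouldCollector` and the method `collectShouldStatements` are hypothetical and not part of the question's code; the input is passed as a string instead of a file to keep the example self-contained, `[.?!]` is used as shorthand for the question's sentence delimiters, and the `trim()` call is an addition that drops the leading space left by the split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ShouldCollector {

    // Hypothetical helper: returns every sentence containing "Should" or "should",
    // across all lines of the given text.
    static List<String> collectShouldStatements(String text) {
        // Created once, before the loop, so earlier results are never discarded.
        List<String> shouldStatements = new ArrayList<>();

        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                // Split the line into sentences on '.', '?' or '!'.
                for (String sentence : line.split("[.?!]")) {
                    if (sentence.contains("Should") || sentence.contains("should")) {
                        shouldStatements.add(sentence.trim());
                    }
                }
            }
        }
        return shouldStatements;
    }

    public static void main(String[] args) {
        String text = "Hello all. You should have a toothbrush.\n"
                + "Here is some random text.\n"
                + "You definitely should have your wallet!";
        for (String statement : collectShouldStatements(text)) {
            System.out.println(statement);
        }
    }
}
```

Writing the output file once, after the loop has finished, then avoids reopening and truncating it for every input line.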

Answer 1: (score: 1)

You are overwriting your ArrayList inside the loop, but you don't actually need it.

Answer 2: (score: 1)

I refactored the code and removed "ShouldArray", which is not needed.

Pseudocode

While there are lines to read in the In file
    Read each line
    Split each line into Array of sentences

    Loop through each sentence
        If each sentence contains Should or should Then
          Write sentence to Out file
        End If
    End Loop
End While

Close Out file
Close In file

The following code works for:

Multiple lines:

Hello all. Here is a a packing list.
You Should have a toothbrush. You Should have a Phone charger.
Here is some random text to show the parser will not include this.
You definitely should have your wallet!

Single line:

Hello all. Here is a a packing list. You Should have a toothbrush. You should have a Phone charger. And you definitely should have your wallet!

import java.util.Scanner;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.File;

public class ShouldStringsParser {

    public ShouldStringsParser(String inFile, String outFile) throws IOException {
        // try-with-resources closes the scanner and the writer even if an exception is thrown
        try (Scanner userFile = new Scanner(new File(inFile));
             BufferedWriter bw = new BufferedWriter(new FileWriter(outFile))) {

            while (userFile.hasNextLine()) {
                String line = userFile.nextLine();
                System.out.println(line);

                // Split the current line into sentences
                String[] sentences = line.split("\\.|\\?|\\!|\\r");

                for (String shouldStatements : sentences) {
                    // Write matching sentences straight to the output file
                    if (shouldStatements.contains("Should") || shouldStatements.contains("should")) {
                        System.out.println(">>>" + shouldStatements);
                        bw.append(shouldStatements);
                        bw.newLine();
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        try {
            new ShouldStringsParser("inDataMultiLine.txt", "outDataMultiLine.txt");

            new ShouldStringsParser("inDataSingleLine.txt", "outDataSingleLine.txt");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
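
The question also asked how to save the new file in the directory the original was opened from, which none of the answers address. In the question's code, `new File(chooser.getSelectedFile().getName())` keeps only the file name, so both the input and the output are resolved against the working directory. A minimal sketch of one way around that, assuming the `File` returned by `chooser.getSelectedFile()` is used directly; the class name `ParsedFileLocation` and the method `parsedFileFor` are hypothetical:

```java
import java.io.File;

class ParsedFileLocation {

    // Hypothetical helper: builds "Parsed_<name>" in the same directory as the selected file.
    static File parsedFileFor(File selected) {
        // getAbsoluteFile() makes sure a parent directory is available even for a relative path.
        File parentDir = selected.getAbsoluteFile().getParentFile();
        return new File(parentDir, "Parsed_" + selected.getName());
    }

    public static void main(String[] args) {
        // Stand-in for chooser.getSelectedFile(); the path is made up for illustration.
        File selected = new File("docs", "packing.txt");
        System.out.println(parsedFileFor(selected));
    }
}
```

The resulting `File` can be passed to `new FileWriter(...)` in place of the bare name, and the selected `File` itself can be passed to `new Scanner(...)` so the input is also read from its real location.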