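Reading lines with specific characteristics from a file and excluding all others

Asked: 2014-01-16 14:38:53

Tags: java file-io

I am trying to write a method that reads only the lines of a file that meet a certain specification. For example,

my text file contains the following: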

12-01-01 13:26 San Jose 12.99 DVD
12-12-30 09:40 Miami 13.50 Music
14-08-30 10:20 Arizona 16.03 Scientist
11-07-10 09:10 New York 25.00 ColdPlay
14-08-30 10:20 Arizona 18.04 MeetYou
14-08-30 10:20 Arizona 50.03 Scientist
11-07-10 09:30 New York 25.00 ColdPlay
11-07-10 09:20 New York 25.00 ColdPlay

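The values in the different columns are tab-separated, and these are the lines I want the method to read. Now suppose the file also contains blank lines or stray entries, as shown below: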

12-01-01 13:26 San Jose 12.99 DVD
12-12-30 09:40 Miami 13.50 Music
14-08-30 10:20 Arizona 16.03 Scientist
11-07-10 09:10 New York 25.00 ColdPlay
14-08-30 10:20 Arizona 18.04 MeetYou
[new lines]
14-08-30 10:20 Arizona 50.03 Scientist
11-07-10 09:30 New York 25.00 ColdPlay
//This line should not be read
even this should not be read #$%^&
11-07-10 09:20 New York 25.00 ColdPlay

Those particular lines should be skipped. So far I have only handled the case where the file is correctly formatted, as shown below:

public static void main(String[] args) {
     BufferedReader br = null;
     String temp = null;
     List<String> arrayRead = new ArrayList<String>();
     try{
         br = new BufferedReader(new FileReader("D:\\testing\\SalesData.txt"));
         while((temp=br.readLine())!= null){
             arrayRead.add(temp);
         }
         int n = arrayRead.size();
         System.out.println("No. of Records in file "+n);
        //Add arrayList data to String Array
         String[] linesToRead = arrayRead.toArray(new String[arrayRead.size()]);

         String[] lineX = null;
         Hashtable<String, String> dataReq = new Hashtable<String, String>();
         for(int i=0; i<arrayRead.size(); i++){
             lineX = linesToRead[i].split("\\t");
             dataReq.put(lineX[2], lineX[3]);
         }

     }
     catch(FileNotFoundException f){
         f.printStackTrace();
     }
     catch(IOException e){
         e.printStackTrace();
     }
     finally{
         if(br!= null){
             try {
                br.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
         }
     }
}

1 answer:

Answer 0: (score: 0)

Why not use a regex? It might help.

import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.ArrayList;
import java.util.Scanner;
import java.io.IOException;
import java.io.File;
import java.io.FileInputStream;

public class MyLineReader {

    public static void main(String[] args) {
        File inputFile = new File("myfile.txt");

        // Create pattern object.
        String pattern = "^(\\d{2}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2})\\s([a-zA-Z\\s]*)\\s(\\d*\\.?\\d*)\\s(\\w*)$";
        ArrayList<String[]> collectedLines = new ArrayList<String[]>();
        Pattern r = Pattern.compile(pattern);

        // Match those.
        Matcher m;
        FileInputStream fis = null;

        try{
            fis = new FileInputStream(inputFile);
            Scanner fileScanner = new Scanner(fis);
            String line;
            String[] row;

            while (fileScanner.hasNextLine()){
                line = fileScanner.nextLine();
                m = r.matcher(line);

                if (m.matches()) {
                    // The regex has four groups: date and time, city, price, item.
                    row = new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };

                    collectedLines.add(row);
                    System.out.println(String.format("Date: %s, Name: %s, Decimal: %s, Last: %s", row[0], row[1], row[2], row[3]));
                }
            }
        }catch(IOException ex){
            System.err.println(ex.getMessage());
        }finally {
            if (fis != null){
                try{
                    fis.close();
                }catch(IOException ex){
                    // Ignored: nothing useful can be done if close() fails.
                }
            }
        }

    }
}
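
To tie this back to the question's `dataReq` table, here is a minimal sketch that keeps only the city and price of each matching line. The class name `SalesByCity` and the helper `cityToPrice` are hypothetical names chosen for illustration, and the sample lines assume the tab-separated layout described in the question. It takes a list of lines rather than a file so that it can be run on its own.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SalesByCity {

    // Same expression as in the answer's code.
    static final Pattern SALES_LINE = Pattern.compile(
        "^(\\d{2}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2})\\s([a-zA-Z\\s]*)\\s(\\d*\\.?\\d*)\\s(\\w*)$");

    // Hypothetical helper: maps city -> price for every line that matches.
    // A later line for the same city overwrites the earlier price,
    // which is also what put() does in the question's loop.
    static Map<String, String> cityToPrice(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (String line : lines) {
            Matcher m = SALES_LINE.matcher(line);
            if (m.matches()) {
                result.put(m.group(2), m.group(3));
            }
            // Lines that do not match (blank lines, comments, junk) are skipped.
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "14-08-30\t10:20\tArizona\t16.03\tScientist",
            "",
            "//This line should not be read",
            "14-08-30\t10:20\tArizona\t50.03\tScientist",
            "11-07-10\t09:30\tNew York\t25.00\tColdPlay");
        System.out.println(cityToPrice(lines));
    }
}
```

Because a map holds one value per key, only the last price seen for each city survives. If every record matters, collecting the rows in a list, as the answer's `collectedLines` does, is probably the better fit.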

The regular expression used is as follows:

String pattern = "^(\\d{2}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2})\\s([a-zA-Z\\s]*)\\s(\\d*\\.?\\d*)\\s(\\w*)$";
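
As a quick sanity check, the sketch below runs this expression against a few of the lines from the question. The class name `SalesLinePatternCheck` and the helper `isSalesLine` are made up for this example, and the first sample assumes tabs between the columns, as the question describes.

```java
import java.util.regex.Pattern;

class SalesLinePatternCheck {

    // Same expression as above, compiled once and reused.
    static final Pattern SALES_LINE = Pattern.compile(
        "^(\\d{2}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2})\\s([a-zA-Z\\s]*)\\s(\\d*\\.?\\d*)\\s(\\w*)$");

    // Hypothetical helper: true if the whole line has the expected shape.
    static boolean isSalesLine(String line) {
        return SALES_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "12-01-01\t13:26\tSan Jose\t12.99\tDVD",
            "",
            "//This line should not be read",
            "even this should not be read #$%^&"
        };
        for (String s : samples) {
            System.out.println(isSalesLine(s) + " <- [" + s + "]");
        }
    }
}
```

Only the first sample is accepted. Note that the `*` quantifiers also allow the city, price, and item fields to be empty, so a line with a valid date and time followed by three bare tabs would still match. Switching those quantifiers to `+` would be stricter, if that is what you need.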