Handling delimiters in a text file with tab-separated values

Time: 2016-03-23 16:10:58

Tags: java delimiter

My input file (my.txt) has the following format (tab-separated values):

"0" "0" "231"   "1193"

"0" "0" "74"    "457"

"0" "0" "530"   "387"

"0" "0" "1221"  "641"

"0" "0" "328"   "428"

"0" "0" "228"   "979"   

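I have written the following code to read this input. However, the delimiters are a problem. Is there a way in Java to ignore the delimiters and take only the values?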

try {
    FileReader reader = new FileReader("/home/brina/Desktop/my.txt");
    BufferedReader brReader = new BufferedReader(reader);

    String line;
    while ((line = brReader.readLine()) != null) {
        String[] data = line.split("\t");
        if ((Integer.parseInt(data[2]) > 200) && (Integer.parseInt(data[3]) > 1000)) {
            System.out.println("\tYes");
        } else {
            System.out.println("\tNo");
        }

    }
    brReader.close();
} catch (final FileNotFoundException e) {
    e.printStackTrace();
} catch (final IOException e) {
    e.printStackTrace();
}

4 answers:

Answer 0 (score: 2)

You can use a Matcher to extract only the integer values; whatever else is on the line does not matter. Something like this:

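// imports: java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
List<Integer> numbers = new ArrayList<>();
Matcher matcher = Pattern.compile("\\d+").matcher(line);
while (matcher.find()) {
    // group() returns a String, so parse it before adding it to the List<Integer>
    numbers.add(Integer.parseInt(matcher.group()));
}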
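
To make the idea concrete, here is a minimal self-contained sketch that applies the Matcher approach to the check from the question. The class name `NumberExtractor` and the helpers `extractNumbers` and `check` are hypothetical names introduced only for illustration; the thresholds 200 and 1000 come from the question. It assumes each line contains nothing but the four quoted integers, since `\d+` picks up every run of digits it finds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberExtractor {
    // Matches each run of digits; quotes and tabs are simply skipped over.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns every integer found on the line, in order.
    public static List<Integer> extractNumbers(String line) {
        List<Integer> numbers = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(line);
        while (matcher.find()) {
            numbers.add(Integer.parseInt(matcher.group()));
        }
        return numbers;
    }

    // Hypothetical helper: applies the question's test to the third and fourth values.
    // Lines with fewer than four numbers (for example blank lines) yield false.
    public static boolean check(String line) {
        List<Integer> n = extractNumbers(line);
        return n.size() >= 4 && n.get(2) > 200 && n.get(3) > 1000;
    }

    public static void main(String[] args) {
        String line = "\"0\"\t\"0\"\t\"231\"\t\"1193\"";
        System.out.println(extractNumbers(line));
        System.out.println(check(line) ? "\tYes" : "\tNo");
    }
}
```

Inside the question's read loop, `check(line)` could replace the `split` and `parseInt` calls.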

Answer 1 (score: 1)

I guess you can use a regular expression:

Pattern p = Pattern.compile("^\"\\d+\"\\t\"\\d+\"\\t\"(\\d+)\"\\t\"(\\d+)\"$");
while ((line = brReader.readLine()) != null) {
  Matcher m = p.matcher(line);
  // matches() must be called (and succeed) before group() can be used
  if (m.matches() && Integer.valueOf(m.group(1)) > 200 && Integer.valueOf(m.group(2)) > 1000)
  {
    System.out.println("\tYes");
  }
  else
  {
    System.out.println("\tNo");
  }
}
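
The following is a small runnable sketch of the same capture-group idea, mainly to show that `matches()` has to succeed before `group()` is read. The class name `LineClassifier`, the method `classify`, and the "Skipped" result for malformed lines are assumptions made for this example, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineClassifier {
    // Same pattern as in the answer: four quoted integers separated by tabs,
    // capturing the third and fourth.
    private static final Pattern ROW =
            Pattern.compile("^\"\\d+\"\\t\"\\d+\"\\t\"(\\d+)\"\\t\"(\\d+)\"$");

    // Hypothetical helper: returns "Yes", "No", or "Skipped" for lines that do not match.
    public static String classify(String line) {
        Matcher m = ROW.matcher(line);
        if (!m.matches()) {
            // Calling group() without a successful match throws IllegalStateException.
            return "Skipped";
        }
        int third = Integer.parseInt(m.group(1));
        int fourth = Integer.parseInt(m.group(2));
        return (third > 200 && fourth > 1000) ? "Yes" : "No";
    }

    public static void main(String[] args) {
        System.out.println(classify("\"0\"\t\"0\"\t\"231\"\t\"1193\""));
        System.out.println(classify("\"0\"\t\"0\"\t\"74\"\t\"457\""));
        System.out.println(classify(""));
    }
}
```

Because the pattern is anchored and strict, a line with trailing whitespace or a missing field counts as malformed here; whether to skip such lines or report them is a design choice left to the caller.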

Answer 2 (score: 0)

You will have to take the substring between the quotes and convert it to an int; I would then suggest adding it to a list of ints. Like this:

String line;
while ((line = brReader.readLine()) != null)
{
  String[] data = line.split("\t");
  // Java generics need the wrapper type Integer; List is an interface, so use ArrayList
  List<Integer> listInt = new ArrayList<>();
  for (int i = 0; i < data.length; i++)
  {
    // strip the surrounding quotes; adjust these values if needed
    String intOnly = data[i].substring(1, data[i].length() - 1);
    int add = Integer.parseInt(intOnly);
    listInt.add(add);
  }
  if (listInt.get(2) > 200 && listInt.get(3) > 1000)
  {
    System.out.println("\tYes");
  }
  else
  {
    System.out.println("\tNo");
  }
}
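
As a slightly more defensive variation on the substring idea, here is a self-contained sketch. The class name `QuotedFields` and the method `parseQuotedInts` are hypothetical. It assumes each field is either a bare integer or an integer wrapped in one pair of double quotes, possibly with stray spaces around it; a blank line would still cause `Integer.parseInt` to throw a NumberFormatException, so the caller should skip those.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedFields {
    // Hypothetical helper: splits a tab-separated line and parses each field as an int.
    public static List<Integer> parseQuotedInts(String line) {
        List<Integer> values = new ArrayList<>();
        for (String field : line.split("\t")) {
            String trimmed = field.trim();
            // Drop the surrounding quotes only when both are present.
            if (trimmed.length() >= 2 && trimmed.startsWith("\"") && trimmed.endsWith("\"")) {
                trimmed = trimmed.substring(1, trimmed.length() - 1);
            }
            values.add(Integer.parseInt(trimmed));
        }
        return values;
    }

    public static void main(String[] args) {
        List<Integer> values = parseQuotedInts("\"0\"\t\"0\"\t\"530\"\t\"387\"");
        System.out.println(values.get(2) > 200 && values.get(3) > 1000 ? "\tYes" : "\tNo");
    }
}
```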

Answer 3 (score: 0)

Reading a tab-delimited file with Commons CSV is very simple:

final Charset utf = Charset.forName("UTF-8");
final Path path = Paths.get("/home/brina/Desktop/my.txt");

try (CSVParser p = new CSVParser(Files.newBufferedReader(path, utf), CSVFormat.TDF)) {
    for (CSVRecord r : p) {
        int v1 = Integer.parseInt(r.get(2));
        int v2 = Integer.parseInt(r.get(3));
        System.out.println(v1 > 200 && v2 > 1000 ? "\tYes" : "\tNo"); 
    }
} catch (IOException e) {
    // ...
}