Java regex: trying to read substrings from a text file

Asked: 2012-11-24 02:33:33

Tags: java text

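I am trying to read the useful information out of a text file that is full of garbage. The file holds readings from some sensors. With this code I lose a lot of the useful information,

such as 'NH3 level is: 9.9977'.

There must be something more effective than this. Can anyone help me?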

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class sensorclean {
    final static String Array[] = new String[1000];
    static int g = 0;

    public static void main(String[] args) {
        File file = new File("C:/Users/Omar/Desktop/datatest.txt");
        try {
            Scanner scanner = new Scanner(file);
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                String newline = ignoreComments(line);
                if (newline != null)
                    Array[g] = newline;
                g++;
            }

            for (int i = 0; i < Array.length; i++) {
                if (Array[i] != null) {
                    Array[i] = Array[i].trim();
                    System.out.println(Array[i]);
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    private static String ignoreComments(String line) {
        String result_line = null;

        int upto = line.indexOf('#');
        int upto1 = line.indexOf('ë');
        int upto2 = line.indexOf('~');
        int upto3 = line.indexOf('€');
        int upto4 = line.indexOf('?');
        if ((upto != 0 && upto > 0) && (upto1 != 0 && upto1 > 0)) {
            result_line = line.substring(1, upto4);
            System.out.println("here");
        } else {
            if (upto < 0 && upto1 < 0 && upto2 < 0 && upto3 < 0) {
                result_line = line;
            }/*else{
                result_line="";
            }*/
        }
        return result_line;
    }
}

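My sensors read temperature and store it in a .txt file, but garbage gets added along the way. I am working on a Java program to extract the useful information from it.

Here is a sample from the file: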
  

E~^€} 3¢@ iw4R#} 3¢@ IW               CO level is:101.0831,CO2 level is:375.2046,NH3 leve   l is:9.9977?~O€3¢@ isGR#                                  -mac:0013A20040691673,-time:Wednesday,12/11/14 -    14:06.56,E~G€3¢@ isGR#} 3¢@ is                                       temperature:51.9354,humidity is:9.6129,bat:63%

     

~_€} 3¢@ isGR#} 3¢@ is                            CO level is:106.1330,CO2 level is:374.7616,NH3 leve   l is:86.7625?~O€3¢@ if:R#                                   -mac:0013A20040691666,-time:Wednesday,12/11/14    - 14:09.20,é~I€3¢@ if:R#} 3¢@ if                                        temperature:280.0000,humidity is:17.7677,bat:   96%   I~^€} 3¢@ if,R#} 3¢@ if                            CO level is:128.8912,CO2 level is:375.6922,NH3 leve   l is:9.9977 E~O€3¢@ iw2R#                                  -mac:0013A20040691677,-time:Wednesday,12/11/14 -    14:12.11,?~H€} 3¢@ iw4R#} 3¢@ iw                                       temperature:20.3225,humidity is:19.3161,bat:87   %   ?~^€} 3¢@ iw1R#} 3¢@ IW                            CO level is:101.0831,CO2 level is:375.1160,NH3 leve   l is:9.9977?   

The garbage is not always the same; it varies from one reading to the next.

The code above is my current attempt.

1 answer:

Answer 0 (score: 0)

It is not regex-based, but still:

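import java.io.File;
import java.util.Scanner;

// The class name is arbitrary; it is only here so the snippet compiles.
public class SensorClean {
    public static void main(String[] args) throws Exception {
        StringBuilder tmp = new StringBuilder(); // current space-delimited token
        StringBuilder res = new StringBuilder(); // tokens that contained no rubbish
        try (Scanner sc = new Scanner(new File("test.txt"))) {
            while (sc.hasNextLine()) {
                // Append a space so the last token of each line is flushed too.
                String line = sc.nextLine() + ' ';
                boolean isRubbish = false;
                for (char c : line.toCharArray()) {
                    if (c == ' ') {
                        // End of token: keep it only if no rubbish character was seen.
                        if (!isRubbish) {
                            res.append(tmp).append(' ');
                        } else {
                            isRubbish = false;
                        }
                        tmp.setLength(0);
                    } else if (isRubbish(c)) {
                        isRubbish = true;
                    } else {
                        tmp.append(c);
                    }
                }
            }
        }
        System.out.println(res);
    }

    // A token counts as rubbish if it contains any of these marker characters.
    private static boolean isRubbish(char c) {
        return "#^}@?".indexOf(c) > -1;
    }
}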
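
Since the question asks about regex, here is a rough sketch of that route for comparison. Note that the token filter above drops any reading that is glued to rubbish, for example a value directly followed by '?'. Matching "label, colon, number" directly avoids that. This sketch makes several assumptions: the real file uses English labels (CO, CO2, NH3, temperature, humidity, bat), each label is followed within about 20 characters by a colon and a decimal number, and the names SensorRegexSketch, READING and extract are made up for illustration. The -mac and -time fields are not handled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SensorRegexSketch {
    // Assumed label names; adjust them to whatever the real file contains.
    // CO2 is listed before CO so the longer label is tried first.
    // [^:,]{0,20} skips filler such as " level is" without crossing a field boundary.
    private static final Pattern READING = Pattern.compile(
        "\\b(CO2|CO|NH3|temperature|humidity|bat)\\b[^:,]{0,20}:\\s*(\\d+(?:\\.\\d+)?)");

    // Returns every "label=value" pair found in the text, in order of appearance.
    static List<String> extract(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = READING.matcher(text);
        while (m.find()) {
            out.add(m.group(1) + "=" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened sample in the assumed format; \u20AC is the euro sign, \u00A2 the cent sign.
        String sample = "E~^\u20AC} 3\u00A2@ iw4R#} 3\u00A2@ IW   "
            + "CO level is:101.0831,CO2 level is:375.2046,NH3 leve   l is:9.9977?~O\u20AC3\u00A2@ isGR#   "
            + "temperature:51.9354,humidity is:9.6129,bat:63%";
        for (String reading : extract(sample)) {
            System.out.println(reading);
        }
    }
}
```

Because the pattern anchors on the labels rather than on the rubbish characters, it does not matter that the garbage changes between readings. The trade-off is that every label of interest has to be listed explicitly.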