Extracting certain patterns from a log using Java

Time: 2012-09-10 15:17:33

Tags: java extract file-read

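I want to extract a piece of information from a log file. The pattern I am using is the prompt, made up of the node name and the command. I want to extract the output of each command and compare the results. Consider the following sample output: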

    NodeName > command1

    this is the sample output 

    NodeName > command2  

    this is the sample output

I have tried the following code.

    public static void searchcommand(String strLineString) {
        String searchFor = "Nodename> command1";
        String endStr = "Nodename";
        String op = "";
        int end = 0;
        int len = searchFor.length();
        int result = 0;
        if (len > 0) {
            int start = strLineString.indexOf(searchFor);
            while (start != -1) {
                end = strLineString.indexOf(endStr, start + len);

                if (end != -1) {
                    op = strLineString.substring(start, end);
                } else {
                    op = strLineString.substring(start, strLineString.length());
                }
                String[] arr = op.split("%%%%%%%");
                for (String z : arr) {
                    System.out.println(z);
                }

                start = strLineString.indexOf(searchFor, start + len);
            }
        }
    }

The problem is that this code is too slow at extracting the data. Is there another way to do this?

Edit 1: It is a log file, which I read into a single string before passing it to the code above.

2 Answers:

Answer 0: (score: 0)

My suggestion:

    public static void main(String[] args) {
        String log = "NodeName > command1 \n" + "this is the sample output \n"
                + "NodeName > command2 \n" + "this is the sample output";

        String[] lines = log.split("\\r?\\n");
        boolean record = false;
        StringBuilder statements = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith("NodeName")) {
                if (record) {
                    // A new prompt ends the previous command's output: process it
                    System.out.print(statements);
                }
                // Record after every prompt (toggling would skip every second command)
                record = true;
                statements.setLength(0); // Reset for the next command
                continue;
            }
            if (record) {
                statements.append(line).append('\n');
            }
        }
        if (record) {
            // Flush the last command's output, which has no prompt after it
            System.out.print(statements);
        }
    }

Answer 1: (score: 0)

Here is my suggestion:

Use a regular expression. Here is one:

    final String input = "    NodeName > command1\n" +
            "\n" +
            "    this is the sample output1 \n" +
            "\n" +
            "    NodeName > command2  \n" +
            "\n" +
            "    this is the sample output2";

    final String regex = ".*?NodeName > command(\\d)(.*?)(?=NodeName|\\z)";

    final Matcher matcher = Pattern.compile(regex, Pattern.DOTALL).matcher(input);

    while(matcher.find()) {
        System.out.println(matcher.group(1));
        System.out.println(matcher.group(2).trim());
    }

Output:

    1
    this is the sample output1
    2
    this is the sample output2

So, breaking down the regex:

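First, it skips all characters until it finds the first "NodeName > command" followed by a digit. We keep that digit (group 1) so that we know which command produced the output. Next, we grab all the following characters (group 2) until a lookahead finds either another NodeName or the end of the input.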
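
Since the goal in the question is to compare the outputs of different commands, the same idea can be wrapped in a small helper that collects the results into a map. This is only a sketch under a few assumptions: the class name `LogCommandExtractor` and the method `extract` are made up for illustration, the pattern uses `\\d+` instead of `\\d` so that multi-digit command numbers are captured, and the leading `.*?` is dropped because `find()` already skips ahead to the next match. As with the regex above, it assumes the text "NodeName" never appears inside a command's output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogCommandExtractor {

    // Compiled once and reused. Group 1 is the command number; group 2 is
    // everything up to the next "NodeName" or the end of the input.
    private static final Pattern COMMAND_BLOCK = Pattern.compile(
            "NodeName > command(\\d+)(.*?)(?=NodeName|\\z)", Pattern.DOTALL);

    /** Maps each command number to its trimmed output, in order of appearance. */
    public static Map<String, String> extract(String log) {
        Map<String, String> outputs = new LinkedHashMap<>();
        Matcher matcher = COMMAND_BLOCK.matcher(log);
        while (matcher.find()) {
            outputs.put(matcher.group(1), matcher.group(2).trim());
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Hypothetical sample input mirroring the log layout from the question
        String log = "NodeName > command1\n\nthis is the sample output1\n\n"
                + "NodeName > command2\n\nthis is the sample output2";

        Map<String, String> outputs = extract(log);
        for (Map.Entry<String, String> entry : outputs.entrySet()) {
            System.out.println("command" + entry.getKey() + " -> " + entry.getValue());
        }

        // Compare the outputs of two commands
        boolean same = outputs.get("1").equals(outputs.get("2"));
        System.out.println("outputs identical: " + same);
    }
}
```

If the same command number can occur more than once in the log, a later match overwrites the earlier one in this sketch; a `Map<String, List<String>>` would be one way to keep them all.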