Apache Camel: grouping log entries by USERID

Date: 2017-02-03 06:47:43

Tags: java spring apache-camel

I have a simple log example:

2017-02-02 09:58:12,764 - INFO - PRC0XK - logged in
2017-02-02 09:58:13,766 - INFO - L3J5WW - logged in
2017-02-02 09:58:14,005 - INFO - 0NKCVZ - call s2
2017-02-02 09:58:14,767 - INFO - P0QIOW - logged in
2017-02-02 09:58:15,729 - INFO - E0MVFZ - call s2
2017-02-02 09:58:16,257 - INFO - L3J5WW - call s2
2017-02-02 09:58:17,750 - INFO - PRC0XK - call s2
2017-02-02 09:58:21,908 - INFO - P0QIOW - call s2
2017-02-02 09:58:30,479 - INFO - PRC0XK - get answer from s2
2017-02-02 09:58:30,479 - INFO - PRC0XK - logged out

"{timestamp} - {LogLevel} - {USERID} - {Action}"等字段组成。 我希望将它用作输入并通过USERID逐个形成动作。 稍后,我希望添加另一个以相同方式形成的日志文件,它也具有简单的修改USERID,并通过USERID通过两个日志收集所有操作。 我尝试使用聚合策略,但我有一些我没想到的。 我的骆驼路线是:

<route id="fileeater">
    <description>
        this route will eat log file and try to put guid through lot of log entry by some identifier
    </description>
    <from uri="file://data/in?charset=utf-8"/>
    <split streaming="true">
        <tokenize token="\n"/>
        <to uri="log:gotlogline"/>
        <aggregate strategyRef="SimpleAggregationStrategy" completionSize="4">
            <correlationExpression>
                <constant>true</constant>
            </correlationExpression>
            <log logName="LOGEater" message="this is logeater part"/>
            <to uri="file://data/out"/>
        </aggregate>
    </split>
</route>

where SimpleAggregationStrategy is:

import org.apache.camel.Exchange;
import org.apache.camel.processor.aggregate.AggregationStrategy;

public class SimpleAggregationStrategy implements AggregationStrategy {

    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        // First message of a group: nothing to merge with yet
        if (oldExchange == null) {
            return newExchange;
        }

        String oldBody = oldExchange.getIn().getBody(String.class);
        String newBody = newExchange.getIn().getBody(String.class);
        String body = oldBody;
        // Append the new line only when the USERID (third field) matches
        if (oldBody.split(" - ")[2].equalsIgnoreCase(newBody.split(" - ")[2])) {
            body = oldBody + "\n" + newBody;
        }

        oldExchange.getIn().setBody(body);

        return oldExchange;
    }
}

So I expected the log entries to be grouped by USERID:

...
2017-02-02 09:59:45,599 - INFO - NU7444 - logged in 
2017-02-02 09:59:51,229 - INFO - NU7444 - call s2
2017-02-02 10:00:09,818 - INFO - NU7444 - get answer from s2
2017-02-02 10:00:09,818 - INFO - NU7444 - logged out
...

But I got only two lines in the output file:

2017-02-02 10:00:09,818 - INFO - NU7444 - get answer from s2
2017-02-02 10:00:09,818 - INFO - NU7444 - logged out

My questions are about the correlationExpression in the aggregate:

  1. Can I use part of the log line (split(" - ")[2] as the USERID) in the aggregation to bind the entries together?

  2. I read http://www.catify.com/2012/07/09/parsing-large-files-with-apache-camel/ and found that aggregating by header is faster than simple aggregation. So can I put part of each line into a header after the split and then collect by that header? Should I use a processor to extract the USERID from the line and set it as a header?

1 Answer:

Answer 0 (score: 0)

Well, folks. Looks like I found a solution after playing with Camel some more. It uses a processor that sets headers on each log entry, as I mentioned in the comments:

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class UserIDProcessor implements Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        String input = exchange.getIn().getBody(String.class);
        // Expecting "{timestamp} - {LogLevel} - {USERID} - {Action}"
        String[] fields = input.split(" - ");
        if (fields.length > 2) {
            exchange.getIn().setHeader("LOGLEVEL", fields[1]);
            exchange.getIn().setHeader("USERID", fields[2]);
        }
        exchange.getIn().setBody(input);
    }
}

Then I use a simple aggregation strategy to concatenate the messages:

import org.apache.camel.Exchange;
import org.apache.camel.processor.aggregate.AggregationStrategy;

public class SimpleAggregationStrategy implements AggregationStrategy {
    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        // First message of a group becomes the aggregate seed
        if (oldExchange == null) {
            return newExchange;
        }
        // Concatenate every line of the group, one entry per line
        String oldBody = oldExchange.getIn().getBody(String.class);
        String newBody = newExchange.getIn().getBody(String.class);
        oldExchange.getIn().setBody(oldBody + "\r\n" + newBody);
        return oldExchange;
    }
}
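
For the route below to resolve ref="UIDProcessor" and strategyRef="SimpleAggregationStrategy", both classes have to be registered as beans in the Spring context. A minimal sketch; the com.example package is an assumption, adjust it to wherever the classes actually live:

<!-- assumed package name, replace com.example with the real one -->
<bean id="UIDProcessor" class="com.example.UserIDProcessor"/>
<bean id="SimpleAggregationStrategy" class="com.example.SimpleAggregationStrategy"/>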

With a very simple route (you can add a completion timeout and adjust the completion size in the aggregate part as needed):

<route id="fileeater">
    <description>
        this route will eat log file and try to put guid through lot of log entry by some identifier
    </description>
    <from uri="file://data/in?charset=utf-8&amp;delete=false&amp;readLock=idempotent-changed&amp;readLockCheckInterval=5000"/>
    <split streaming="true">
        <tokenize token="\n"/>
        <process ref="UIDProcessor"/>
        <aggregate strategyRef="SimpleAggregationStrategy" completionSize="4">
            <correlationExpression>
              <simple>header.USERID</simple>
            </correlationExpression>
            <to uri="log:gotlogline"/>
            <to uri="file://data/out?fileExist=append"/>
        </aggregate>
    </split>
</route>

Also, to speed up parsing, you can add parallelProcessing="true" to the split and get results much faster.
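
Putting both tips together, here is a sketch of the split/aggregate part with parallel processing on the split and a completion timeout on the aggregate. The 5000 ms value is an arbitrary choice, but without some timeout a USERID with fewer than four entries would never be flushed by completionSize="4" alone:

<split streaming="true" parallelProcessing="true">
    <tokenize token="\n"/>
    <process ref="UIDProcessor"/>
    <aggregate strategyRef="SimpleAggregationStrategy" completionSize="4" completionTimeout="5000">
        <correlationExpression>
            <simple>header.USERID</simple>
        </correlationExpression>
        <to uri="log:gotlogline"/>
        <to uri="file://data/out?fileExist=append"/>
    </aggregate>
</split>

Note that with parallel processing the split parts are handled concurrently, so the relative order of lines within a group is no longer guaranteed.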