I need to extract all FIX messages from a large log file, which may contain more than 10,000 to 20,000 FIX protocol messages. Each FIX message I want to capture starts with 8=FIX and ends with |10= followed by a checksum value, which can be anything, and then the delimiter '|'.
For example:
8=FIXT.1.1|9=449|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|10=111|
Currently I am using this regex pattern:
Pattern pattern = Pattern.compile("8=FIX(.*?)10=(.*?)|");
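With the pattern above, however, I can only extract the message up to 10=, without the checksum value. In addition, some FIX messages may contain custom tags such as 8410=TEST|, as in the following message: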
8=FIXT.1.1|9=73|35=0|34=560|49=RTNSFIXUAT|8410=TEST|52=20140403-01:50:21|56=TR_UAT_VELOCITY|1128=8|10=206|
For the message above I get the value
"8=FIXT.1.1|9=73|35=0|34=560|49=RTNSFIXUAT|84" (wrong: I want the complete message, up to and including the tag 10 checksum value 206).
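The truncation can be reproduced in isolation. The following is a minimal sketch, assuming java.util.regex and the custom-tag message above; the class name TruncationDemo and the method firstMatch are made up for illustration. It suggests two causes: the trailing '|' in the pattern is unescaped, so it acts as an alternation operator rather than a literal pipe, and the lazy (.*?)10= stops at the first "10=" it meets, which here sits inside the custom tag 8410=.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TruncationDemo {
    // The pattern from the question, unchanged. The final "|" is NOT a literal
    // pipe: it means "the expression on the left OR the empty string".
    static final Pattern ORIGINAL = Pattern.compile("8=FIX(.*?)10=(.*?)|");

    // Returns the first match of the original pattern, or null if there is none.
    static String firstMatch(String text) {
        Matcher m = ORIGINAL.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String msg = "8=FIXT.1.1|9=73|35=0|34=560|49=RTNSFIXUAT|8410=TEST|"
                + "52=20140403-01:50:21|56=TR_UAT_VELOCITY|1128=8|10=206|";
        // The lazy group stops at the "10=" inside "8410=", and group 2 stays
        // empty because nothing forces it to consume the checksum.
        // prints 8=FIXT.1.1|9=73|35=0|34=560|49=RTNSFIXUAT|8410=
        System.out.println(firstMatch(msg));
    }
}
```

Because of the empty alternative, the same pattern also produces empty matches at positions where 8=FIX does not start, which is another reason to escape the pipe as "\\|".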
Please find a log file excerpt below:
>02-04-14 11:38:33.559|QFJ Message Processor|input/REPOFIXInput1|INFO|quickfix.outgoing: 8=FIXT.1.1|9=71|35=0|34=1731|49=TR_UAT_VENDOR|52=20140402-11:38:33.557|56=REPOFIXUAT|10=147|
02-04-14 11:38:34.713|SocketConnectorIoProcessor-1.0|input/REPOFIXInput1|INFO|quickfix.incoming: 8=FIXT.1.1|9=449|35=AE|34=1734|1128=8|49=REPOFIXUAT|56=TR_UAT_VENDOR|52=20140402-11:38:34|552=1|54=2|37=20140402-12:36:48|11=NOREF|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|453=4|448=ZERO|452=3|447=D|448=MBY2|452=1|447=D|448=LMEB|452=16|447=D|448=DOR|452=11|447=D|571=7852455|1003=2 USD|150=F|32=50000000.00|15=GBP|1056=88330000.00|31=1.6666|194=1.6654|195=0.0012|64=20140415|63=B|60=20140402-11:07:33|75=20140402|1057=N|460=4|167=FOR|65=OR|55=GBP/USD|10=111|
02-04-14 11:38:35.004|QFJ Message Processor|input/REPOFIXInput1|INFO|Received FIX application message: 8=FIXT.1.1|9=449|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|10=111|
Answer 0 (score: 1)
If I understand correctly, you want to keep:
the leading 8=FIX,
everything after it up to |10=...,
and the trailing |10=value|.
Here is an example:
String input = ">02-04-14 11:38:33.559|QFJ Message Processor|input/REPOFIXInput1|INFO|quickfix.outgoing: 8=FIXT.1.1|9=71|35=0|34=1731|49=TR_UAT_VENDOR|52=20140402-11:38:33.557|56=REPOFIXUAT|10=147|\r\n"
+ "02-04-14 11:38:34.713|SocketConnectorIoProcessor-1.0|input/REPOFIXInput1|INFO|quickfix.incoming: 8=FIXT.1.1|9=449|35=AE|34=1734|1128=8|49=REPOFIXUAT|56=TR_UAT_VENDOR|52=20140402-11:38:34|552=1|54=2|37=20140402-12:36:48|11=NOREF|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|453=4|448=ZERO|452=3|447=D|448=MBY2|452=1|447=D|448=LMEB|452=16|447=D|448=DOR|452=11|447=D|571=7852455|1003=2 USD|150=F|32=50000000.00|15=GBP|1056=88330000.00|31=1.6666|194=1.6654|195=0.0012|64=20140415|63=B|60=20140402-11:07:33|75=20140402|1057=N|460=4|167=FOR|65=OR|55=GBP/USD|10=111|\r\n"
+ "02-04-14 11:38:35.004|QFJ Message Processor|input/REPOFIXInput1|INFO|Received FIX application message: 8=FIXT.1.1|9=449|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|10=111|";
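// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// (.+) is greedy but cannot cross a line break, so each match stays on one log line.
// (?<=\\|)10= only accepts a "10=" directly preceded by a pipe, which skips tags such as 8410=.
// (?=\\|) requires the closing pipe after the checksum without including it in the match.
Pattern p = Pattern.compile("8=FIX(.+)(?<=\\|)10=(.+?)(?=\\|)", Pattern.MULTILINE);
Matcher m = p.matcher(input);
while (m.find()) {
    System.out.println(m.group());          // whole message, without the final pipe
    System.out.println("\t" + m.group(1));  // everything between 8=FIX and 10=
    System.out.println("\t" + m.group(2));  // checksum value
}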
Output
8=FIXT.1.1|9=71|35=0|34=1731|49=TR_UAT_VENDOR|52=20140402-11:38:33.557|56=REPOFIXUAT|10=147
T.1.1|9=71|35=0|34=1731|49=TR_UAT_VENDOR|52=20140402-11:38:33.557|56=REPOFIXUAT|
147
8=FIXT.1.1|9=449|35=AE|34=1734|1128=8|49=REPOFIXUAT|56=TR_UAT_VENDOR|52=20140402-11:38:34|552=1|54=2|37=20140402-12:36:48|11=NOREF|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|453=4|448=ZERO|452=3|447=D|448=MBY2|452=1|447=D|448=LMEB|452=16|447=D|448=DOR|452=11|447=D|571=7852455|1003=2 USD|150=F|32=50000000.00|15=GBP|1056=88330000.00|31=1.6666|194=1.6654|195=0.0012|64=20140415|63=B|60=20140402-11:07:33|75=20140402|1057=N|460=4|167=FOR|65=OR|55=GBP/USD|10=111
T.1.1|9=449|35=AE|34=1734|1128=8|49=REPOFIXUAT|56=TR_UAT_VENDOR|52=20140402-11:38:34|552=1|54=2|37=20140402-12:36:48|11=NOREF|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|453=4|448=ZERO|452=3|447=D|448=MBY2|452=1|447=D|448=LMEB|452=16|447=D|448=DOR|452=11|447=D|571=7852455|1003=2 USD|150=F|32=50000000.00|15=GBP|1056=88330000.00|31=1.6666|194=1.6654|195=0.0012|64=20140415|63=B|60=20140402-11:07:33|75=20140402|1057=N|460=4|167=FOR|65=OR|55=GBP/USD|
111
8=FIXT.1.1|9=449|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|10=111
T.1.1|9=449|35=AE|34=1734|49=REPOFIXUAT|52=20140402-11:38:34|56=TR_UAT_VENDOR|1128=8|15=GBP|31=1.7666|32=50000000.00|55=GBP/USD|60=20140402-11:07:33|63=B|64=20140415|65=OR|75=20140402|150=F|167=FOR|194=1.7654|195=0.0012|460=4|571=7852455|1003=2 USD|1056=88330000.00|1057=N|552=1|54=2|37=20140402-12:36:48|11=NOREF|453=4|448=ZERO|447=D|452=3|448=MBY2|447=D|452=1|448=LMEB|447=D|452=16|448=DOR|447=D|452=11|826=0|78=1|79=default|80=50000000.00|5967=88330000.00|
111
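For a log with tens of thousands of messages, the same idea can be applied line by line instead of to one large string. The following is a minimal sketch under stated assumptions: one log entry per line, at most one FIX message per line, and '|' as the field delimiter, as in the excerpt. The names FixLogScanner, firstMessage and extractAll are hypothetical, and the pattern is a variation on the one above (it includes the trailing pipe in the match), not the exact regex from this answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixLogScanner {
    // Compiled once and reused for every line.
    // "\\|10=" requires a pipe directly before tag 10, so "8410=" is skipped;
    // "[^|]*\\|" then takes the checksum and its closing pipe.
    static final Pattern FIX = Pattern.compile("8=FIX.*?\\|10=[^|]*\\|");

    // Returns the FIX message contained in one log line, or null if there is none.
    static String firstMessage(String line) {
        Matcher m = FIX.matcher(line);
        // "if"-style single lookup: at most one message per line is assumed.
        return m.find() ? m.group() : null;
    }

    // Reads the log line by line so the whole file never has to fit in memory.
    static List<String> extractAll(BufferedReader reader) throws IOException {
        List<String> messages = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String msg = firstMessage(line);
            if (msg != null) {
                messages.add(msg);
            }
        }
        return messages;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java FixLogScanner <logfile>");
            return;
        }
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(args[0]))) {
            for (String msg : extractAll(reader)) {
                System.out.println(msg);
            }
        }
    }
}
```

Lines without a FIX message, such as the XML debug lines that QuickFIX/J may write, simply produce no match and are skipped.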
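Notes
You can also iterate over the file line by line, using a constant Pattern and reusing the Matcher; in that case you can call find in an if statement instead of a while statement, assuming there is one log entry per line.
Answer 1 (score: 0)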
So it would be a good idea to save all FIX messages to a database once they have been "processed", and avoid all of this.
Answer 2 (score: 0)
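I understand my problem now: I was using "|" (pipe) as the delimiter in the FIX messages, but it is also present in the log annotations, for example:
02-04-14 14:30:45.139|QFJ Timer|input/FIXInput1|INFO|Sending FIX session message: 8=FIXT.1.1|9=71|35=0|34=2072|49=UAT|52=20140402-14:30:45.139|56=FIXUAT|10=140|
02-04-14 14:30:45.141|QFJ Timer|input/FIXInput1|DEBUG|FIX message as XML: <?xml version="1.0" encoding="ISO-8859-1"?><message>
Those annotations were not handled by the pattern up to 10=checksum. I have now set the delimiter between FIX message tags to ~ in the pattern above, i.e. '8=FIX(.*?)(?<=)~10=(.*?)~', and it works for me. It is similar to the regex you provided in the solution above.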
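Switching the delimiter can be sketched with the delimiter passed in as a parameter, so the same code serves both '|' and '~'. This is only an illustration under stated assumptions: the class DelimitedFixPattern and its methods are hypothetical names, and the sample line in main is the custom-tag message from the question with '|' replaced by '~' and an invented log prefix in front. Pattern.quote is used so that a delimiter such as '|' is treated literally rather than as a regex operator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimitedFixPattern {
    // Builds a pattern for FIX messages whose fields are separated by the given delimiter.
    // group(1) is the full message including the final delimiter, group(2) the checksum.
    static Pattern patternFor(String delimiter) {
        String d = Pattern.quote(delimiter); // literal delimiter, safe for "|" as well
        return Pattern.compile("(8=FIX.*?" + d + "10=(.*?)" + d + ")");
    }

    // Returns {message, checksum} for the first FIX message in the line, or null if none.
    static String[] extract(String line, String delimiter) {
        Matcher m = patternFor(delimiter).matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Invented log prefix that still uses "|", followed by a "~"-delimited message.
        String line = "INFO|Sending FIX session message: "
                + "8=FIXT.1.1~9=73~35=0~34=560~49=RTNSFIXUAT~8410=TEST~10=206~";
        String[] result = extract(line, "~");
        System.out.println(result[0]); // the message from 8=FIX up to and including 10=206~
        System.out.println(result[1]); // the checksum, 206
    }
}
```

Requiring the delimiter directly before 10= is what keeps the custom tag 8410= from being mistaken for the checksum field, whichever delimiter is in use.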