So I'm trying to parse some lines here to get the message text out of a log file. I'll explain as I go. Here is the code:
// Print to interactions
try
{
    // assigns the input file to a filereader object
    BufferedReader infile = new BufferedReader(new FileReader(log));
    sc = new Scanner(log);
    while(sc.hasNext())
    {
        String line=sc.nextLine();
        if(line.contains("LANTALK")){
            Document doc = Jsoup.parse(line);
            Element idto = doc.select("MBXTO").first();
            Element msg = doc.select("MSGTEXT").first();
            System.out.println(" to " + idto.text() + " " + msg.text());
            System.out.println();
        } // End of if
    } // End of while
    try
    {
        // Print to output file
        sc = new Scanner (log);
        while(sc.hasNext())
        {
            String line=sc.nextLine();
            if(line.contains("LANTALK")){
                Document doc = Jsoup.parse(line);
                Element idto = doc.select("MBXTO").first();
                Element msg = doc.select("MSGTEXT").first();
                outFile.println(" to " + idto.text() + " " + msg.text());
                outFile.println();
                outFile.println();
            } // End of if
        } // End of while
    } // end of try
I get the input from a log file. Here is a sample, including the lines I filter out:
08:25:20.740 [D] [T:000FF0] [F:LANTALK2C] <CMD>LANMSG</CMD>
<MBXID>1124</MBXID><MBXTO>5760</MBXTO><SUBTEXT>LanTalk</SUBTEXT><MOBILEADDR>
</MOBILEADDR><LAP>0</LAP><SMS>0</SMS><MSGTEXT>and I talked to him and he
gave me a credit card number</MSGTEXT>
08:25:20.751 [+] [T:000FF0] [S:1:1:1124:5607:5] LANMSG [15/2 | 0]
08:25:20.945 [+] [T:000FF4] [S:1:1:1124:5607:5] LANMSGTYPESTOPPED [0/2 | 0]
08:25:21.327 [+] [T:000FE8] [S:1:1:1124:5607:5] LANMSGTYPESTARTED [0/2 | 0]
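So far I've been able to filter the lines that contain a message (LANMSG). From there I can get the recipient's ID number (MBXTO). But the next line contains the sender's ID, which I need to pull out and display ([S:1:1:1124:SENDERID:5]). How do I do that? Here is a copy of the output I'm getting: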
to 5760 and I talked to him and he gave me a credit card number
And this is what I need to get:
SENDERID to 5760 and I talked to him and he gave me a credit card number
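Any help you can give me would be great. I'm just not sure how to get the information I need.
Answer 0 (score: 0)
Your question isn't entirely clear, but since you don't seem to be using regular expressions in this code... Remember to state what you have tried before asking. Anyway, the regex you are looking for is: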
(\d{2}:\d{2}:\d{2}\.\d{3})\s\[D\].+<MBXID>(\d+)<\/MBXID><MBXTO>(\d+)<\/MBXTO>.+<MSGTEXT>(.+)<\/MSGTEXT>
Working example in Regex101
It should capture:
$1: 08:25:20.740
$2: 1124
$3: 5760
$4: and I talked to him and he
gave me a credit card number
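(Note that it also captures the \n, i.e. the line break, inside the message.)
(Also, in Java code you read a group with matcher.group(number) rather than $number; $number is only understood inside replacement strings such as the argument to replaceAll.)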
You can then use these replacement terms (group references) to produce formatted output.
For example: $1 [$2] to [$3] $4
should return:
08:25:20.740 [1124] to [5760] and I talked to him and he
gave me a credit card number
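As a rough sketch of that replacement step in Java: Matcher.replaceAll accepts the same $1 ... $4 references in its replacement string. The class name ReplacementSketch and the method reformat are made up for illustration, the backslashes are doubled because the pattern lives in a Java string literal, and I have swapped in reluctant quantifiers (.+?) so that each match stays inside one message when the input holds several.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
class ReplacementSketch {
    // Same groups as the regex above: $1 time, $2 MBXID, $3 MBXTO, $4 message text.
    // DOTALL lets '.' match a line break inside a wrapped message.
    static final Pattern LANMSG = Pattern.compile(
            "(\\d{2}:\\d{2}:\\d{2}\\.\\d{3})\\s\\[D\\].+?<MBXID>(\\d+)</MBXID>"
                    + "<MBXTO>(\\d+)</MBXTO>.+?<MSGTEXT>(.+?)</MSGTEXT>",
            Pattern.DOTALL);

    // Rewrites every matched message; text that does not match is left untouched.
    static String reformat(String input) {
        return LANMSG.matcher(input).replaceAll("$1 [$2] to [$3] $4");
    }

    public static void main(String[] args) {
        String line = "08:25:20.740 [D] [T:000FF0] [F:LANTALK2C] <CMD>LANMSG</CMD>"
                + "<MBXID>1124</MBXID><MBXTO>5760</MBXTO><SUBTEXT>LanTalk</SUBTEXT>"
                + "<MSGTEXT>and I talked to him</MSGTEXT>";
        System.out.println(reformat(line));
    }
}
```

This is handy when you only want to rewrite the text; if you need the individual values, reading the groups one by one is the more flexible route.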
Remember that when you put the regex into Java code you have to escape every backslash (\), so the regex gets longer:
// DOTALL makes '.' also match the newline inside a wrapped message. MULTILINE is not needed:
// it only changes how ^ and $ behave, and this pattern uses neither. The reluctant
// quantifiers (.+?) keep each match inside a single LANMSG entry when the input holds several.
Pattern pattern = Pattern.compile("(\\d{2}:\\d{2}:\\d{2}\\.\\d{3})\\s\\[D\\].+?<MBXID>(\\d+)<\\/MBXID><MBXTO>(\\d+)<\\/MBXTO>.+?<MSGTEXT>(.+?)<\\/MSGTEXT>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(input);
while (matcher.find()){
String output = matcher.group(1) + " [" + matcher.group(2) + "] to [" + matcher.group(3) + "] " + matcher.group(4);
System.out.println(output);
}
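None of the groups above is the sender ID from the question; that sits on the following line, in the [S:...] block. Here is a minimal sketch of one way to pick it up, assuming each log entry is a single physical line (the sample looks wrapped for display) and that the sender ID is always the fourth field of [S:1:1:1124:SENDERID:5], as described in the question. SenderIdSketch and format are names I made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
class SenderIdSketch {
    // Recipient and message text from the line that carries the message.
    private static final Pattern MESSAGE = Pattern.compile(
            "<MBXTO>(\\d+)</MBXTO>.*?<MSGTEXT>(.*?)</MSGTEXT>");
    // Assumed layout: [S:a:b:mailbox:SENDERID:c]; the fourth field is captured.
    private static final Pattern SESSION = Pattern.compile(
            "\\[S:\\d+:\\d+:\\d+:(\\d+):\\d+\\]");

    // Returns one "SENDER to RECIPIENT text" string per LANTALK message line.
    static List<String> format(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (!line.contains("LANTALK")) {
                continue;
            }
            Matcher message = MESSAGE.matcher(line);
            if (!message.find()) {
                continue;
            }
            String sender = "?"; // used when no [S:...] line follows
            if (i + 1 < lines.size()) {
                Matcher session = SESSION.matcher(lines.get(i + 1));
                if (session.find()) {
                    sender = session.group(1);
                }
            }
            out.add(sender + " to " + message.group(1) + " " + message.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "08:25:20.740 [D] [T:000FF0] [F:LANTALK2C] <CMD>LANMSG</CMD>"
                        + "<MBXID>1124</MBXID><MBXTO>5760</MBXTO>"
                        + "<MSGTEXT>and I talked to him</MSGTEXT>",
                "08:25:20.751 [+] [T:000FF0] [S:1:1:1124:5607:5] LANMSG [15/2 | 0]");
        for (String formatted : format(lines)) {
            System.out.println(formatted);
        }
    }
}
```

With the two sample lines this should print 5607 to 5760 followed by the message text. If other entries can appear between the message and its [S:...] line in your real log, you would have to scan forward instead of only checking the next line.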
As for your second question... oh, you've edited it out. I'll answer it anyway:
You can parse $2 and $3 so that you get them as integers:
int id1 = Integer.parseInt(matcher.group(2));
int id2 = Integer.parseInt(matcher.group(3));
That way you can write a method that returns the name for each of these IDs, for example UserUtil.getName(int id).
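UserUtil does not exist anywhere yet; it is just a name I picked. A minimal sketch of what it could look like, assuming the names live in a simple in-memory map (the two entries are placeholders, and in practice you would load them from wherever your users are stored):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup class; the IDs come from the sample log, the names are placeholders.
class UserUtil {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        NAMES.put(1124, "Alice"); // placeholder name
        NAMES.put(5760, "Bob");   // placeholder name
    }

    // Returns the name for an ID, or a marker string when the ID is not known.
    static String getName(int id) {
        return NAMES.getOrDefault(id, "unknown (" + id + ")");
    }

    public static void main(String[] args) {
        System.out.println(getName(1124) + " to " + getName(5760));
    }
}
```

Falling back to a marker string instead of throwing keeps the log formatter running when it meets an ID you have not registered.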