Splitting log text with a regular expression, ignoring the first match

Time: 2016-11-10 03:11:21

Tags: java regex split

I have the following log text, which I want to split using # User@Host as the regular expression. I am using the Java regex library functions.

# Time: 160204  1:56:31
# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000142  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1454579791;
SELECT DATABASE();
# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.001254  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
use test;
SET timestamp=1454579791;
# administrator command: Init DB;
# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000441  Lock_time: 0.000077 Rows_sent: 4  Rows_examined: 4
SET timestamp=1454579791;
show databases;
# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000207  Lock_time: 0.000074 Rows_sent: 1  Rows_examined: 1
SET timestamp=1454579791;
show tables;
# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000537  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0
SET timestamp=1454579791;
;

If I do that, I get the following 6 strings.

String 1:

# Time: 160204  1:56:31

String 2:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000142  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1454579791;
SELECT DATABASE();

String 3:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.001254  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
use test;
SET timestamp=1454579791;
# administrator command: Init DB;

String 4:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000441  Lock_time: 0.000077 Rows_sent: 4  Rows_examined: 4
SET timestamp=1454579791;
show databases;

String 5:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000207  Lock_time: 0.000074 Rows_sent: 1  Rows_examined: 1
SET timestamp=1454579791;
show tables;

String 6:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000537  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0
SET timestamp=1454579791;
;

So splitting on # User@Host returns 6 strings. I am actually only interested in five strings, with the first two merged into one. The result should look like this:

String 1:

# Time: 160204  1:56:31
# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000142  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=1454579791;
SELECT DATABASE();

String 2:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.001254  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
use test;
SET timestamp=1454579791;
# administrator command: Init DB;

String 3:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000441  Lock_time: 0.000077 Rows_sent: 4  Rows_examined: 4
SET timestamp=1454579791;
show databases;

String 4:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000207  Lock_time: 0.000074 Rows_sent: 1  Rows_examined: 1
SET timestamp=1454579791;
show tables;

String 5:

# User@Host: root[root] @ localhost []  Id:     3
# Query_time: 0.000537  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0
SET timestamp=1454579791;
;

How can I achieve this?

1 answer:

Answer 0: (score: 0)

You can concatenate the first and second elements after splitting:

String string1 = splitArray[0] + splitArray[1];

Of course, this only works if you know the format is always as you listed. To make sure that is the case, you could add a check like the following:

// Merge only when the first chunk really is the "# Time:" header
if (splitArray[0].startsWith("# Time:")) {
    String string1 = splitArray[0] + splitArray[1];
}

I am sure there are more elegant ways to achieve this, but this will work ;)
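For reference, here is a minimal end-to-end sketch of the split-then-merge idea. It rests on some assumptions that are not in the question: the split is done with a zero-width lookahead, `(?m)(?=^# User@Host)`, so that each chunk keeps its `# User@Host` line (which matches the strings shown in the question), and it runs on Java 8 or later, where a zero-width match at the very start of the input does not produce an empty leading element. The class name `LogSplitter` and the method `splitEntries` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LogSplitter {

    // Splits the log before every line that starts with "# User@Host" and,
    // if the first chunk is a "# Time:" header, merges it into the next chunk.
    static List<String> splitEntries(String log) {
        // (?m) lets ^ match at line starts; the lookahead consumes nothing,
        // so the "# User@Host" marker stays at the start of each chunk.
        String[] parts = log.split("(?m)(?=^# User@Host)");
        List<String> entries = new ArrayList<>(Arrays.asList(parts));

        if (entries.size() > 1 && entries.get(0).startsWith("# Time:")) {
            String header = entries.remove(0);        // take out the "# Time:" chunk
            entries.set(0, header + entries.get(0));  // prepend it to the first entry
        }
        return entries;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the log in the question.
        String log = "# Time: 160204  1:56:31\n"
                + "# User@Host: root[root] @ localhost []  Id:     3\n"
                + "SET timestamp=1454579791;\n"
                + "SELECT DATABASE();\n"
                + "# User@Host: root[root] @ localhost []  Id:     3\n"
                + "SET timestamp=1454579791;\n"
                + "show databases;\n";

        List<String> entries = splitEntries(log);
        for (int i = 0; i < entries.size(); i++) {
            System.out.println("String " + (i + 1) + ":");
            System.out.println(entries.get(i));
        }
    }
}
```

The size check guards against a log that contains only a header, and the `startsWith` check leaves the list untouched when the log does not begin with a `# Time:` line.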