I am reading some raw log data and preprocessing it before parsing. I read in the log, then add a column, draw$rule, to indicate which parsing rule should be applied.
> raw<-readLines("20130205000046 firewall_log.txt")
> draw<-as.data.frame(raw)
> draw$rule <- 0
> head(draw)
raw
1 2013 Feb 4 06:15:59 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=42415 DPT=80
2 2013 Feb 4 06:16:22 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=63520 DPT=80
3 2013 Feb 4 06:16:46 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=55379 DPT=80
4 2013 Feb 4 06:17:10 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=49425 DPT=80
5 2013 Feb 4 06:17:34 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=34270 DPT=80
6 2013 Feb 4 06:17:39 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=35331 DPT=80
rule
1 0
2 0
3 0
4 0
5 0
6 0
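Now I want to flag rows based on patterns matched in the raw log data. The statement below does the trick: it selects the matching rows and flags them by setting draw$rule to 1.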
> draw[grep("^.*ACCEPT.*PROTO=(TCP|UDP).*$",draw$raw),2] <- 1
> head(draw)
raw
1 2013 Feb 4 06:15:59 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=42415 DPT=80
2 2013 Feb 4 06:16:22 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=63520 DPT=80
3 2013 Feb 4 06:16:46 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=55379 DPT=80
4 2013 Feb 4 06:17:10 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=49425 DPT=80
5 2013 Feb 4 06:17:34 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=34270 DPT=80
6 2013 Feb 4 06:17:39 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=35331 DPT=80
rule
1 1
2 1
3 1
4 1
5 1
6 1
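The difficulty I'm having is extending the selection so that it only applies to rows that match the regex and also have draw$rule set to zero; in other words, only rows that have not previously been matched should be considered. I want to update the value of draw$rule, not extract a subset at this stage.
How can I modify the statement to achieve this? The end result would be a series of statements like: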
draw[<some selection condition 1> && draw$rule==0,2] <-1
draw[<some selection condition 2> && draw$rule==0,2] <-2
draw[<some selection condition 3> && draw$rule==0,2] <-3
etc
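To make the intent concrete, here is a minimal sketch of the kind of result I am after. It assumes that grepl(), which returns one TRUE/FALSE per row, can be combined element-wise with draw$rule == 0 using the vectorised `&` rather than `&&`. The three sample lines and the second and third patterns are made up for illustration and are not from my real log.

```r
# Hypothetical sample lines standing in for the real log file
raw <- c(
  "2013 Feb 4 06:15:59 [UTM9S] [kernel] WAN2LAN[ACCEPT] IN=WAN OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=42415 DPT=80",
  "2013 Feb 4 06:16:22 [UTM9S] [kernel] WAN2LAN[DROP] IN=WAN OUT=LAN SRC=10.0.0.1 DST=192.168.1.38 PROTO=TCP SPT=63520 DPT=80",
  "2013 Feb 4 06:16:46 [UTM9S] [system] some other message"
)
draw <- data.frame(raw = raw, stringsAsFactors = FALSE)
draw$rule <- 0

# Each statement only touches rows that match the pattern AND are still unflagged.
# grepl() yields a logical vector, so `&` combines it row by row with the rule test.
draw$rule[grepl("ACCEPT.*PROTO=(TCP|UDP)", draw$raw) & draw$rule == 0] <- 1
draw$rule[grepl("PROTO=(TCP|UDP)", draw$raw) & draw$rule == 0] <- 2  # illustrative pattern
draw$rule[grepl("kernel|system", draw$raw) & draw$rule == 0] <- 3    # illustrative pattern

print(draw$rule)
```

In this sketch the first line also matches the second and third patterns, but it keeps rule 1 because it is no longer zero by the time those statements run.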
Thanks in advance!