Selecting on multiple columns and modifying a data frame in R

Time: 2017-03-31 11:11:38

Tags: r

I am reading some raw log data and pre-processing it before parsing. I read the log, then add a column, draw$rule, to indicate which parsing rule to apply.

> raw<-readLines("20130205000046 firewall_log.txt")
> draw<-as.data.frame(raw)
> draw$rule <- 0
> head(draw)
                                                                                                                                    raw
1 2013 Feb  4 06:15:59 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=42415 DPT=80 
2 2013 Feb  4 06:16:22 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=63520 DPT=80 
3 2013 Feb  4 06:16:46 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=55379 DPT=80 
4 2013 Feb  4 06:17:10 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=49425 DPT=80 
5 2013 Feb  4 06:17:34 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=34270 DPT=80 
6 2013 Feb  4 06:17:39 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=35331 DPT=80 
  rule
1    0
2    0
3    0
4    0
5    0
6    0

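Now I want to flag rows according to patterns matched in the raw log data. The statement below does the job: it selects the matching rows and flags them by setting draw$rule to 1.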

> draw[grep("^.*ACCEPT.*PROTO=(TCP|UDP).*$",draw$raw),2] <- 1
> head(draw)
                                                                                                                                    raw
1 2013 Feb  4 06:15:59 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=42415 DPT=80 
2 2013 Feb  4 06:16:22 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=63520 DPT=80 
3 2013 Feb  4 06:16:46 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=55379 DPT=80 
4 2013 Feb  4 06:17:10 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=49425 DPT=80 
5 2013 Feb  4 06:17:34 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=34270 DPT=80 
6 2013 Feb  4 06:17:39 [UTM9S] [kernel] WAN2LAN[ACCEPT]  IN=WAN  OUT=LAN SRC=66.249.75.193 DST=192.168.1.38 PROTO=TCP SPT=35331 DPT=80 
  rule
1    1
2    1
3    1
4    1
5    1
6    1
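
For comparison, the same flagging step can also be written with grepl(), which returns one TRUE/FALSE per row instead of the row positions that grep() returns. This is only a minimal sketch: the demo data frame and its three log lines are made up for illustration and are not taken from the real firewall log.

```r
# Hypothetical sample lines standing in for the real firewall log
demo <- data.frame(
  raw = c(
    "WAN2LAN[ACCEPT] IN=WAN OUT=LAN PROTO=TCP SPT=42415 DPT=80",
    "WAN2LAN[DROP] IN=WAN OUT=LAN PROTO=UDP SPT=5353 DPT=53",
    "WAN2LAN[ACCEPT] IN=WAN OUT=LAN PROTO=ICMP"
  ),
  stringsAsFactors = FALSE
)
demo$rule <- 0

# grep() returns the integer positions of the matching rows
idx <- grep("ACCEPT.*PROTO=(TCP|UDP)", demo$raw)

# grepl() returns a logical vector with one element per row;
# it can be used as a row index in the same way
hit <- grepl("ACCEPT.*PROTO=(TCP|UDP)", demo$raw)

# Flag the matching rows, addressing the column by name rather than position
demo[hit, "rule"] <- 1
```

A logical vector has the advantage that it can be combined with other per-row conditions, which an integer index from grep() cannot do directly.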

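Where I am having difficulty is extending the selection so that it applies only to rows that match the regular expression and still have draw$rule set to zero; in other words, only rows that have not been matched before should be considered. I want to update the value of draw$rule, not extract a subset at this stage.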

How can I modify the statement to achieve this? The end result would be a series of statements such as:

draw[<some selection condition 1> && draw$rule==0,2] <-1
draw[<some selection condition 2> && draw$rule==0,2] <-2
draw[<some selection condition 3> && draw$rule==0,2] <-3
  etc
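
One possible shape for such a series is sketched here, under some assumptions. grepl() gives a logical vector with one element per row, and the element-wise operator & (rather than &&, which does not work row by row) combines it with draw$rule == 0. The sample rows and the three patterns are invented for illustration.

```r
# Hypothetical sample lines; the real data would come from readLines()
draw <- data.frame(
  raw = c(
    "WAN2LAN[ACCEPT] PROTO=TCP SPT=42415 DPT=80",
    "WAN2LAN[ACCEPT] PROTO=ICMP",
    "WAN2LAN[DROP] PROTO=UDP SPT=5353 DPT=53",
    "LAN2WAN[REJECT] PROTO=TCP"
  ),
  stringsAsFactors = FALSE
)
draw$rule <- 0

# Each statement tags only rows that match its pattern AND are still 0.
# `&` compares element by element, giving one TRUE/FALSE per row.
draw[grepl("ACCEPT.*PROTO=(TCP|UDP)", draw$raw) & draw$rule == 0, "rule"] <- 1

# This pattern also matches row 1, but that row is no longer 0, so it keeps 1
draw[grepl("ACCEPT", draw$raw) & draw$rule == 0, "rule"] <- 2

draw[grepl("DROP", draw$raw) & draw$rule == 0, "rule"] <- 3
```

Because each statement evaluates draw$rule == 0 at the moment it runs, the order of the statements decides which rule wins for a row that matches several patterns. Rows that match nothing keep the value 0.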

Thanks in advance!

0 answers:

No answers