I'm new to R. I need to read in data that includes timestamps, IP addresses, and so on:
18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111: udp 107
18:00:04.957456 129.63.80.240.161 > 129.63.152.10.39518: udp 151
18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161: udp 136 (DF)
18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564: udp 48 (DF)
18:00:05.000976 129.63.50.235.1028 > 218.232.110.133.53: udp 34
18:00:05.207888 129.63.50.235.1028 > 203.50.0.24.53: udp 32
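I started with
read.table(file='sample.txt',head=F,'%H:%M:%S',sep='')
But I'm stuck at that point because there are several kinds of separators (space, '>' and ':'), and the last field may or may not contain (DF).
Can anyone give me an idea for handling this kind of data? Thanks a lot.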
Answer 0: (score: 0)
Here is a brute-force approach.
# fill=TRUE pads short rows, so lines without a trailing "(DF)" still parse.
tt <- read.table(header=FALSE, fill=TRUE, stringsAsFactors=FALSE,
text="18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111: udp 107
18:00:04.957456 129.63.80.240.161 > 129.63.152.10.39518: udp 151
18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161: udp 136 (DF)
18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564: udp 48 (DF)
18:00:05.000976 129.63.50.235.1028 > 218.232.110.133.53: udp 34
18:00:05.207888 129.63.50.235.1028 > 203.50.0.24.53: udp 32")
# Columns 5 onward hold "udp", the length, and the optional "(DF)".
# Paste them column-wise (apply() would pad the numbers to a common width)
# and trim the trailing space left by the empty "(DF)" field.
last <- trimws(do.call(paste, tt[-(1:4)]))
tt[,5] <- last
# Drop the trailing ':' from the destination address.
tt[,4] <- sub(':', '', tt[,4])
# Keep time, source, destination, and the combined tail; this drops the '>' column.
tt <- tt[c(1,2,4,5)]
tt
##                V1                  V2                  V4           V5
## 1 18:00:04.940864    129.63.50.235.53   129.63.71.70.1111      udp 107
## 2 18:00:04.957456   129.63.80.240.161 129.63.152.10.39518      udp 151
## 3 18:00:04.958432 129.63.152.10.39518   129.63.80.240.161 udp 136 (DF)
## 4 18:00:04.963312    217.79.96.182.53     129.63.1.1.1564  udp 48 (DF)
## 5 18:00:05.000976  129.63.50.235.1028  218.232.110.133.53       udp 34
## 6 18:00:05.207888  129.63.50.235.1028      203.50.0.24.53       udp 32
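If you also need the port separated from the IP address, a numeric length, and a usable time value, one possible alternative is to match each line against a single regular expression. The following is only a sketch under some assumptions: the function name parse_dump and the column names are made up for illustration, every line is assumed to follow exactly the layout shown in the question (lowercase protocol name, IPv4 addresses with the port as the last dot-separated part), and the time is converted to seconds since midnight rather than to a date-time class.

```r
# Sample lines copied from the question; for a file, use readLines("sample.txt").
lines <- c(
  "18:00:04.940864 129.63.50.235.53 > 129.63.71.70.1111: udp 107",
  "18:00:04.958432 129.63.152.10.39518 > 129.63.80.240.161: udp 136 (DF)",
  "18:00:04.963312 217.79.96.182.53 > 129.63.1.1.1564: udp 48 (DF)"
)

# parse_dump() is a hypothetical helper, not part of any package.
parse_dump <- function(lines) {
  # Groups: time, src IP, src port, dst IP, dst port, protocol, length,
  # and whatever follows the length ("" or " (DF)").
  pattern <- paste0(
    "^([^ ]+) ",
    "([0-9.]+)\\.([0-9]+) > ",
    "([0-9.]+)\\.([0-9]+): ",
    "([a-z]+) ([0-9]+)(.*)$"
  )
  m <- regmatches(lines, regexec(pattern, lines))
  ok <- lengths(m) == 9L  # full match plus 8 capture groups
  if (!any(ok)) stop("no line matched the expected layout")
  if (!all(ok)) warning(sum(!ok), " line(s) skipped: unexpected layout")
  mat <- do.call(rbind, m[ok])

  # Convert "HH:MM:SS.ffffff" to seconds since midnight.
  hms <- do.call(rbind, strsplit(mat[, 2], ":", fixed = TRUE))
  secs <- as.numeric(hms[, 1]) * 3600 +
    as.numeric(hms[, 2]) * 60 +
    as.numeric(hms[, 3])

  data.frame(
    time     = mat[, 2],
    secs     = secs,
    src_ip   = mat[, 3],
    src_port = as.integer(mat[, 4]),
    dst_ip   = mat[, 5],
    dst_port = as.integer(mat[, 6]),
    proto    = mat[, 7],
    length   = as.integer(mat[, 8]),
    df       = grepl("(DF)", mat[, 9], fixed = TRUE),
    stringsAsFactors = FALSE
  )
}

d <- parse_dump(lines)
str(d)
```

Lines that do not match the pattern are skipped with a warning instead of being padded, so a change in the dump format shows up right away rather than producing shifted columns.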