R: splitting a string column in a data set while keeping the information in the other columns

Time: 2010-07-08 00:34:14

Tags: r split

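I am using R. Take the data set created below as an example. I would like to split ip on "." while keeping the original row information in color and status. I know this will create a longer data set in which the color and status entries are repeated.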

a <- data.frame(cbind(color=c("yellow","red","blue","red"),
       status=c("no","yes","yes","no"),
       ip=c("162.131.58.26","2.131.58.16","2.2.58.10","162.131.58.17")))

3 answers:

Answer 0 (score: 3)

It is not clear whether the OP wants new rows or new columns, so here are both:

Columns:

library(reshape)
a <- data.frame(a, colsplit(a$ip, split = "\\.", names = c("foo", "bar", "baz", "phi")))

Or rows (after adding the columns above):

a.m <- melt(a, id.vars = c("color", "status", "ip"))

Answer 1 (score: 0)

a <- cbind(a[, 1:2], matrix(as.numeric(unlist(strsplit(as.character(a[, 3]), "\\."))), ncol = 4, byrow = TRUE))

I'm not sure whether this is what you want, and even if it is, I'm sure there is a better way to do it.
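For comparison, here is a minimal base R sketch of the column layout that does not depend on how many rows the data frame has. It assumes every address splits into exactly four parts; the names wide, parts, octets and octet1 to octet4 are hypothetical and do not come from the original posts.

```r
# Example data from the question (kept as character, not factor)
a <- data.frame(color  = c("yellow", "red", "blue", "red"),
                status = c("no", "yes", "yes", "no"),
                ip     = c("162.131.58.26", "2.131.58.16", "2.2.58.10", "162.131.58.17"),
                stringsAsFactors = FALSE)

# Split each address on the literal dot: a list with one character vector per row
parts <- strsplit(a$ip, ".", fixed = TRUE)

# Convert each vector to numeric and stack them row-wise into a matrix
# (assumption: every address has exactly four parts)
octets <- do.call(rbind, lapply(parts, as.numeric))
colnames(octets) <- paste0("octet", 1:4)  # hypothetical column names

# Keep color and status, append one column per part
wide <- cbind(a[, c("color", "status")], octets)
```

Using fixed = TRUE treats the dot as a literal character, so no regular expression escaping is needed.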

Answer 2 (score: 0)

# give a an id to match cases
a$id <- 1:nrow(a)

# split the ip address and store in datab
datab <- unlist(strsplit(as.character(a$ip),"\\."))

# put the parts of the ip address against the correct ids in a new dataframe
datac <- data.frame(id = rep(a$id, each = 4), ip = datab)

# merge the data together, remove unwanted variables, correct column name
final <- merge(datac,a,by="id")
final <- final[c("ip.x","color","status")]
colnames(final)[1] <- "ip"

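This gives you each part of the IP address on its own row, with the color and status variables repeated. I hope that is what you are after. Otherwise, the previous answers look fine for getting the ip data into columns rather than rows.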
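The same row layout can probably also be produced without an id column or a merge, by repeating each row's color and status once per piece of its address. This is only a sketch in base R; the names parts, n and long are hypothetical, and lengths() assumes R 3.2.0 or later.

```r
# Example data from the question (kept as character, not factor)
a <- data.frame(color  = c("yellow", "red", "blue", "red"),
                status = c("no", "yes", "yes", "no"),
                ip     = c("162.131.58.26", "2.131.58.16", "2.2.58.10", "162.131.58.17"),
                stringsAsFactors = FALSE)

# Split each address on the literal dot: a list with one character vector per row
parts <- strsplit(a$ip, ".", fixed = TRUE)

# Number of pieces each original row contributes
n <- lengths(parts)

# One output row per piece; color and status are repeated to match
long <- data.frame(ip     = unlist(parts),
                   color  = rep(a$color, times = n),
                   status = rep(a$status, times = n),
                   stringsAsFactors = FALSE)
```

Because the repeat counts come from the split itself, this version does not rely on every address having the same number of parts.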