I'm using R and I have a data frame like this:
df<-data.frame(
tree_id=c("t1","t1","t1","t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2","t2","t2","t2"),
branch_id=c("b1","b1","b1","b1","b1","b1","b1","b3","b3","b1","b1","b1","b1","b2","b2","b2","b2","b2"),
bud_id_rank=c("1","2","4","7","9","12","15","1","3","1","2","5","9","1","5","7","8","12")
)
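I want to add a new column called "new_rank" that counts up from 1 in steps of 1 within each combination of tree_id and branch_id. The result should look like this: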
df<-data.frame(
tree_id=c("t1","t1","t1","t1","t1","t1","t1","t1","t1","t2","t2","t2","t2","t2","t2","t2","t2","t2"),
branch_id=c("b1","b1","b1","b1","b1","b1","b1","b3","b3","b1","b1","b1","b1","b2","b2","b2","b2","b2"),
bud_id_rank=c("1","2","4","7","9","12","15","1","3","1","2","5","9","1","5","7","8","12"),
new_rank=c("1","2","3","4","5","6","7","1","2","1","2","3","4","1","2","3","4","5")
)
Is there a function that can do this quickly, for example from the plyr package?
Thanks in advance.
Answer 0 (score: 2)
If you already know which package to use, I wonder why you couldn't work this out yourself.
library(plyr)
# Split df by tree_id and branch_id, then number the rows within each group
ddply(df,.(tree_id,branch_id), transform, new_rank = seq_along(branch_id))
If your dataset is large, data.table will be faster:
library(data.table)
DT <- as.data.table(df)
DT[, new_rank:=seq_along(bud_id_rank), by=list(tree_id, branch_id)]
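If you would rather not load a package, the same grouping idea can probably be done in base R with ave(). This is only a sketch and not part of the answer above: it rebuilds the example df from the question and assumes the rows are already in the order you want them numbered.

```r
# Example data copied from the question
df <- data.frame(
  tree_id = c("t1","t1","t1","t1","t1","t1","t1","t1","t1",
              "t2","t2","t2","t2","t2","t2","t2","t2","t2"),
  branch_id = c("b1","b1","b1","b1","b1","b1","b1","b3","b3",
                "b1","b1","b1","b1","b2","b2","b2","b2","b2"),
  bud_id_rank = c("1","2","4","7","9","12","15","1","3",
                  "1","2","5","9","1","5","7","8","12")
)

# ave() applies FUN to the first argument within each tree_id/branch_id
# group and returns the results in the original row order.
df$new_rank <- ave(seq_len(nrow(df)), df$tree_id, df$branch_id, FUN = seq_along)

print(df)
```

Note that new_rank comes out as an integer column here, whereas the desired output in the question stores it as character; wrap it in as.character() if you really need strings. The numbering follows the existing row order, and since bud_id_rank is stored as character, you would need as.numeric() on it before sorting by it.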