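In one column of a data.table I have repeated elements. I want to rename these sequentially within each group so that I can use dcast. Let me explain with the following example: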
library(data.table)

DT <- data.table(v1 = letters[c(1,rep(4,3),10,1,rep(7,3),1,10,12)],
v2=month.abb[1:12],
v3=c("AV",rep("SINGLE",3),"DAT","AV",rep("SINGLE",3),"AV","DAT","R"),
v4 = c(3L, 2L, 2L, 5L, 4L, 3L,6L,11L,1L,7L,12L,20L))
DT
# I want to create a new variable, "newv", in which the value "SINGLE" in variable v3
# is renamed sequentially within each v1 group: the letter 's' followed by an integer
## unsuccessful attempt
#DT[v3 == "SINGLE", newv := for (i in 1:.N) {paste0('s',i)}, by="v1"]
# What I want is the following data table
DT1 <- data.table(v1 = letters[c(1,rep(4,3),10,1,rep(7,3),1,10,12)],
v2=month.abb[1:12],
v3=c("AV",rep("SINGLE",3),"DAT","AV",rep("SINGLE",3),"AV","DAT","R"),
v4 = c(3L, 2L, 2L, 5L, 4L, 3L,6L,11L,1L,7L,12L,20L),
newv=c("AV","s1","s2","s3","DAT","AV","s1","s2","s3","AV","DAT","R"))
DT1
# Now proceeding with dcast
dt2<-DT1[,`:=`('v2'=NULL,'v3'=NULL)] # to get the right shape in the final result
dcast.data.table(dt2,v1~newv,value.var="v4")
#Aggregate function missing, defaulting to 'length'
# Instead of the length, I would like to have the values from variable "v4"
# How do I get that?
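Basically, I have not managed to achieve my goals: (1) rename the repeated factor level "SINGLE" with sequential strings, and (2) reshape the data table so that the renamed strings s1 to s3 appear as column headers holding the "v4" values.
I would appreciate any help, using either data.table or dplyr/tidyr.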
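
For reference, here is a minimal sketch of one way the two goals might be approached with data.table. It is not necessarily the best solution. It assumes the "length" fallback occurs because some (v1, newv) pairs are repeated (for example, "a"/"AV" occurs three times), so it adds a helper column to tell the repeats apart. The names `id` and `wide` are hypothetical and chosen only for illustration.

```r
library(data.table)

DT <- data.table(v1 = letters[c(1,rep(4,3),10,1,rep(7,3),1,10,12)],
                 v2 = month.abb[1:12],
                 v3 = c("AV",rep("SINGLE",3),"DAT","AV",rep("SINGLE",3),"AV","DAT","R"),
                 v4 = c(3L, 2L, 2L, 5L, 4L, 3L, 6L, 11L, 1L, 7L, 12L, 20L))

# (1) Start from a copy of v3, then overwrite only the "SINGLE" rows
#     with s1, s2, ... numbered within each v1 group.
DT[, newv := v3]
DT[v3 == "SINGLE", newv := paste0("s", seq_len(.N)), by = v1]

# (2) Assumption: repeated (v1, newv) pairs are what trigger the aggregate
#     fallback. A hypothetical helper column `id` numbers the repeats so
#     every cell of the wide table receives at most one v4 value.
DT[, id := seq_len(.N), by = .(v1, newv)]
wide <- dcast(DT, v1 + id ~ newv, value.var = "v4")
wide
```

If a single row per `v1` is wanted instead, an explicit `fun.aggregate` (such as `sum`) could be passed to `dcast` rather than adding `id`; which one is appropriate depends on how the repeated values should be combined.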