Rename repeated elements of a character vector in a data.table, then reshape the data

Date: 2017-01-10 20:52:23

Tags: r data.table dplyr tidyr

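One column of my data.table contains repeated elements. I want to rename them sequentially within each group so that I can use the dcast function. Let me explain with the following example: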

library(data.table)

DT <- data.table(v1 = letters[c(1, rep(4, 3), 10, 1, rep(7, 3), 1, 10, 12)],
                 v2 = month.abb[1:12],
                 v3 = c("AV", rep("SINGLE", 3), "DAT", "AV", rep("SINGLE", 3), "AV", "DAT", "R"),
                 v4 = c(3L, 2L, 2L, 5L, 4L, 3L, 6L, 11L, 1L, 7L, 12L, 20L))
DT
# I want to create a new variable, "newv", in which the value "SINGLE" in
# variable v3 is renamed sequentially within each v1 group: the letter "s"
# followed by an integer

## Unsuccessful attempt
# DT[v3 == "SINGLE", newv := for (i in 1:.N) {paste0('s', i)}, by = "v1"]

# What I want is the following data table
DT1 <- data.table(v1 = letters[c(1, rep(4, 3), 10, 1, rep(7, 3), 1, 10, 12)],
                  v2 = month.abb[1:12],
                  v3 = c("AV", rep("SINGLE", 3), "DAT", "AV", rep("SINGLE", 3), "AV", "DAT", "R"),
                  v4 = c(3L, 2L, 2L, 5L, 4L, 3L, 6L, 11L, 1L, 7L, 12L, 20L),
                  newv = c("AV", "s1", "s2", "s3", "DAT", "AV", "s1", "s2", "s3", "AV", "DAT", "R"))

DT1
# Now proceeding with dcast
dt2 <- DT1[, `:=`('v2' = NULL, 'v3' = NULL)] # drop v2 and v3 to get the right shape in the final result
dcast.data.table(dt2, v1 ~ newv, value.var = "v4")
# Aggregate function missing, defaulting to 'length'
# Instead of the length, I would like to have the values from variable "v4"
# How do I get that?

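In short, I have not managed to achieve my two goals: (1) rename the repeated value "SINGLE" with sequential strings, and (2) reshape the data table so that the renamed strings s1 to s3 appear as column headers holding the "v4" values.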
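The two goals above can be sketched with data.table roughly as follows. This is one possible approach, not a confirmed answer. The message about defaulting to 'length' appears because some (v1, newv) pairs occur more than once (for example "a"/"AV" occurs three times), so dcast has to aggregate. The helper column `occ` is a hypothetical name introduced here to keep those repeats on separate rows; whether that is the desired shape, rather than summing or otherwise aggregating v4, is an assumption.

```r
library(data.table)

DT <- data.table(
  v1 = letters[c(1, rep(4, 3), 10, 1, rep(7, 3), 1, 10, 12)],
  v2 = month.abb[1:12],
  v3 = c("AV", rep("SINGLE", 3), "DAT", "AV", rep("SINGLE", 3), "AV", "DAT", "R"),
  v4 = c(3L, 2L, 2L, 5L, 4L, 3L, 6L, 11L, 1L, 7L, 12L, 20L)
)

# Goal 1: start from a copy of v3, then overwrite only the "SINGLE" rows
# with s1, s2, ... numbered within each v1 group.
DT[, newv := v3]
DT[v3 == "SINGLE", newv := paste0("s", seq_len(.N)), by = v1]

# Goal 2: number repeated (v1, newv) pairs so that every cell of the wide
# table is filled by exactly one row and no aggregation is needed.
# "occ" is a hypothetical helper column, not part of the original data.
DT[, occ := seq_len(.N), by = .(v1, newv)]
wide <- dcast(DT, v1 + occ ~ newv, value.var = "v4")
wide
```

If one row per v1 is required instead, a `fun.aggregate` (such as `sum`) would have to be passed to dcast, at the cost of collapsing the repeated "AV" and "DAT" values.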

I would appreciate any help, using either data.table or dplyr/tidyr.
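For the dplyr/tidyr route, a minimal sketch might look like the following. It assumes a tidyr version that provides `pivot_wider()` (1.0.0 or later, which postdates this question; `spread()` was the older equivalent). The column `occ` is again a hypothetical helper that separates repeated (v1, newv) pairs, and dropping v2 before reshaping is an assumption taken from the question's own dcast step.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  v1 = letters[c(1, rep(4, 3), 10, 1, rep(7, 3), 1, 10, 12)],
  v2 = month.abb[1:12],
  v3 = c("AV", rep("SINGLE", 3), "DAT", "AV", rep("SINGLE", 3), "AV", "DAT", "R"),
  v4 = c(3L, 2L, 2L, 5L, 4L, 3L, 6L, 11L, 1L, 7L, 12L, 20L),
  stringsAsFactors = FALSE
)

wide <- df %>%
  # Number rows within each (v1, v3) group; only "SINGLE" rows use the number.
  group_by(v1, v3) %>%
  mutate(newv = if_else(v3 == "SINGLE", paste0("s", row_number()), v3)) %>%
  # Hypothetical helper: index repeated (v1, newv) pairs so cells stay unique.
  group_by(v1, newv) %>%
  mutate(occ = row_number()) %>%
  ungroup() %>%
  # Keep only the identifier, name, and value columns before widening.
  select(v1, occ, newv, v4) %>%
  pivot_wider(names_from = newv, values_from = v4)

wide
```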

0 answers:

No answers yet