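I have a CSV file with multiple rows per unique ID, and I need to reshape it so that each ID occupies a single row of a data frame. After reading the file, I end up with this initial data frame: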
id week v1 v2
01 week1 3 2
01 week2 5 2
01 week3 2 3
02 week1 1 2
02 week2 5 5
03 week1 4 1
03 week2 4 3
03 week3 4 2
[etc...]
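I want to pull out every v1 value for a given id, so I first collect the unique IDs:
uniqid <- unique(data$id)
Then I loop over i in 1:length(uniqid) and subset the rows for each ID:
temp <- subset(data, data$id == uniqid[i])
and pull each week's value into a temporary variable:
week1 <- temp$v1[temp$week == "week1"]
so that I can rebuild the data frame with rbind:
output <- rbind(output, data.frame(ID = uniqid[i], week1, week2, week3))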
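My problem is that id = 02, for example, has no week3, so rbind breaks. It seems the week3 variable never gets created; it does not show up as NA. How can I test whether the variable was created and set it to NA (or 0) so that rbind does not fail? Or is there a completely different, more efficient way to do this?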
Answer 0: (score: 1)
You can use the recast function from the reshape2 package.
DF
## id week v1 v2
## 1 1 week1 3 2
## 2 1 week2 5 2
## 3 1 week3 2 3
## 4 2 week1 1 2
## 5 2 week2 5 5
## 6 3 week1 4 1
## 7 3 week2 4 3
## 8 3 week3 4 2
library(reshape2)
temp <- recast(DF, id ~ week, measure.var = "v1")
result <- temp$data
row.names(result) <- temp$labels[[1]]$id
colnames(result) <- temp$labels[[2]]$week
result
## week1 week2 week3
## 1 3 5 2
## 2 1 5 NA
## 3 4 4 4
Or, as @AnandaMahto suggested, simply use dcast:
dcast(DF, id ~ week, value.var = "v1")
## id week1 week2 week3
## 1 1 3 5 2
## 2 2 1 5 NA
## 3 3 4 4 4
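To try the calls above, DF can be rebuilt from the question's sample data. The sketch below does that with read.table and then cross-checks the expected wide layout using base R's tapply, so it runs without reshape2 installed. The name wide is only illustrative, and reading the ids as integers (so "01" becomes 1) is an assumption that matches the printed DF.

```r
# Sample data copied from the question (ids are read as integers here)
DF <- read.table(header = TRUE, text = "
id week v1 v2
1 week1 3 2
1 week2 5 2
1 week3 2 3
2 week1 1 2
2 week2 5 5
3 week1 4 1
3 week2 4 3
3 week3 4 2
")

# Cross-tabulate v1 by id (rows) and week (columns).
# Each (id, week) pair occurs at most once, so sum() simply returns that value;
# combinations that never occur are left as NA.
wide <- with(DF, tapply(v1, list(id = id, week = week), sum))
wide
```

The result is a matrix rather than a data frame; as.data.frame(wide) converts it if needed.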
Answer 1: (score: 1)
In base R, you can use reshape:
> reshape(mydf, direction = "wide", idvar="id", timevar="week")
id v1.week1 v2.week1 v1.week2 v2.week2 v1.week3 v2.week3
1 1 3 2 5 2 2 3
4 2 1 2 5 5 NA NA
6 3 4 1 4 3 4 2
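If you want to drop the "v2" columns from the output, you can either remove them before reshaping the data or drop them within the function call: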
> reshape(mydf, direction = "wide", idvar="id", timevar="week", drop="v2")
id v1.week1 v1.week2 v1.week3
1 1 3 5 2
4 2 1 5 NA
6 3 4 4 4
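As for the literal question of why the loop fails: for an id with no week3 row, temp$v1[temp$week == "week3"] returns a zero-length vector, not NA, so data.frame() refuses to combine it with the length-one values. A minimal sketch of one way to keep the loop-style approach is to index with match() against a fixed list of weeks, which yields NA for any week that is absent. The names weeks and rows are illustrative, and the list of weeks is assumed to be known in advance.

```r
# Sample data copied from the question (ids are read as integers here)
mydf <- read.table(header = TRUE, text = "
id week v1 v2
1 week1 3 2
1 week2 5 2
1 week3 2 3
2 week1 1 2
2 week2 5 5
3 week1 4 1
3 week2 4 3
3 week3 4 2
")

# Assumption: the full set of weeks is known up front
weeks <- c("week1", "week2", "week3")

rows <- lapply(unique(mydf$id), function(i) {
  temp <- mydf[mydf$id == i, ]
  # match() returns NA for a week missing from temp, and indexing by NA gives NA
  vals <- temp$v1[match(weeks, temp$week)]
  # t() turns the named vector into a one-row matrix, i.e. one column per week
  data.frame(ID = i, t(setNames(vals, weeks)))
})

# Bind all one-row data frames at once instead of growing output in a loop
output <- do.call(rbind, rows)
output
```

Collecting the rows in a list and binding them once also avoids repeatedly copying output inside the loop, though the reshape call above remains the shorter route.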