I hope I can explain my problem clearly. I have a data frame like this:
sample = data.frame("Room" = c("A1", "B2","A1","A3","A2"), "Name"=c("Peter","Tom","Peter","Anna","Peter"), "Class"=c("E","E","F","D","E"), "FY"=c(1,2,3,4,6))
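Now I want to create a new data frame that keeps "Room" and "Name" as columns and uses the unique values of the "Class" column as the names of the remaining columns. The cell values should be the sum of "FY".
Without the "Room" column, I would do it like this: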
test=as.data.frame(unclass(with(sample, tapply(FY, list(Name, Class), FUN=sum))))
But how can I do this with two identifier columns?
This is my desired output:
output = data.frame(c("A1", "B2","A3", "A2"), c("Peter","Tom","Anna","Peter"), c(1,2,NA,6),c(3,NA,NA,NA),c(NA,NA,4,NA))
colnames(output) = c("Room", "Name","E","F","D")
Answer 0: (score: 2)
A simple dcast from reshape2 should do the trick:
library(reshape2)
dcast(sample, Room + Name ~ Class, value.var = "FY")
# Room Name D E F
#1 A1 Peter NA 1 3
#2 A2 Peter NA 6 NA
#3 A3 Anna 4 NA NA
#4 B2 Tom NA 2 NA
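The call above works because every Room/Name/Class combination occurs only once in the sample data. If a combination were repeated, dcast would, as far as I know, fall back to counting rows (with a message) instead of summing. The sketch below passes fun.aggregate = sum explicitly. The extra row in `dup` is hypothetical and only there to show the aggregation. fill = NA_real_ is set on the assumption that dcast would otherwise fill empty cells with the sum of an empty vector, i.e. 0.

```r
library(reshape2)

sample <- data.frame(
  Room  = c("A1", "B2", "A1", "A3", "A2"),
  Name  = c("Peter", "Tom", "Peter", "Anna", "Peter"),
  Class = c("E", "E", "F", "D", "E"),
  FY    = c(1, 2, 3, 4, 6)
)

# Hypothetical extra row that repeats the A1 / Peter / E combination
dup <- rbind(
  sample,
  data.frame(Room = "A1", Name = "Peter", Class = "E", FY = 10)
)

# Sum FY within each Room/Name/Class cell; keep missing cells as NA
wide <- dcast(
  dup, Room + Name ~ Class,
  value.var = "FY",
  fun.aggregate = sum,
  fill = NA_real_
)
```

Here the two A1 / Peter / E rows are added together (1 + 10), while combinations that never occur stay NA.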
Answer 1: (score: 1)
In this specific example, I believe you can get away with the following:
library(tidyverse)
sample %>%
spread(Class, FY)
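However, based on your description, I think the following covers the more general case, where a Room/Name/Class combination can occur more than once: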
sample %>%
group_by(Room, Name, Class) %>%
summarise(FY = sum(FY)) %>%
spread(Class, FY)
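In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the same reshaping with that function follows, assuming a tidyr version in which values_fn accepts a single function. The summing step then happens inside the pivot rather than in a separate group_by/summarise.

```r
library(tidyr)

sample <- data.frame(
  Room  = c("A1", "B2", "A1", "A3", "A2"),
  Name  = c("Peter", "Tom", "Peter", "Anna", "Peter"),
  Class = c("E", "E", "F", "D", "E"),
  FY    = c(1, 2, 3, 4, 6)
)

# One row per Room/Name pair, one column per Class value;
# values_fn = sum adds up FY if a combination is repeated
wide <- pivot_wider(
  sample,
  id_cols     = c(Room, Name),
  names_from  = Class,
  values_from = FY,
  values_fn   = sum
)
```

Combinations that do not occur in the data are left as NA, which matches the desired output in the question.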
Answer 2: (score: 0)
Using reshape from base R:
reshape(sample, idvar = c("Name", "Room"), timevar = "Class", direction = "wide")
Output:
Room Name FY.E FY.F FY.D
1 A1 Peter 1 3 NA
2 B2 Tom 2 NA NA
4 A3 Anna NA NA 4
5 A2 Peter 6 NA NA
The column names can be changed afterwards if needed.
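A base-R sketch of that renaming step, under two assumptions: the "FY." prefix should simply be stripped, and FY should be summed per Room/Name/Class first, since reshape on its own does not aggregate repeated combinations. The names `agg` and `wide` are illustrative only.

```r
sample <- data.frame(
  Room  = c("A1", "B2", "A1", "A3", "A2"),
  Name  = c("Peter", "Tom", "Peter", "Anna", "Peter"),
  Class = c("E", "E", "F", "D", "E"),
  FY    = c(1, 2, 3, 4, 6)
)

# Sum FY within each Room/Name/Class combination
agg <- aggregate(FY ~ Room + Name + Class, data = sample, FUN = sum)

# Spread the Class values into columns (named FY.D, FY.E, FY.F)
wide <- reshape(
  agg,
  idvar     = c("Room", "Name"),
  timevar   = "Class",
  direction = "wide"
)

# Strip the "FY." prefix and reset the row names
names(wide) <- sub("^FY\\.", "", names(wide))
rownames(wide) <- NULL
```

The aggregate step is redundant for the sample data, where no combination repeats, but it keeps the result a sum if one does.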