I have a data frame like this:
data.frame(home=c("A","B","C","A","C"),weight=c(0.1,0.25,0.36,0.14,0.2),region=c("north","south","east","North","south"))
Home Weight Region
A 0.1 North
B 0.25 South
C 0.36 East
A 0.14 North
C 0.2 South
I want to aggregate my data.frame by two factor variables and sum the third. The result would be:
data.frame(home=c("A","B","C"),north=c(0.24,0,0),south=c(0,0.25,0.2),east=c(0,0,0.36))
Home North South East
A 0.24 0 0
B 0 0.25 0
C 0 0.2 0.36
I have been trying to use a quick, simple function such as aggregate, but perhaps the only solution is to build the data.frame I want by hand.
Answer 0: (score: 1)
Data
df <- data.frame(
home = c("A", "B", "C", "A", "C"),
weight = c(0.1, 0.25, 0.36, 0.14, 0.2),
region = c("north", "south", "east", "North", "south")
)
# tidyr (spread() is superseded by pivot_wider() in tidyr >= 1.0)
library(tidyr)
spread(df, region, weight, fill = 0)
# reshape2
library(reshape2)
dcast(df, home ~ region, value.var = "weight", fill = 0)
# xtabs
xtabs(weight ~ home + region, data = df)
# reshape
df_wide <- reshape(df, idvar = "home", timevar = "region", direction = "wide")
df_wide[is.na(df_wide)] <- 0
Output
home east north North south
1 A 0.00 0.1 0.14 0.00
2 B 0.00 0.0 0.00 0.25
3 C 0.36 0.0 0.00 0.20
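Note that the output above keeps "north" and "North" as separate columns, because the region labels differ in case. If those are meant to be the same region, as the expected result in the question suggests, one option is to normalize the case first and let xtabs do the summing. This is only a minimal base R sketch under that assumption; the names `tab` and `wide` are mine, not from the question.

```r
df <- data.frame(
  home = c("A", "B", "C", "A", "C"),
  weight = c(0.1, 0.25, 0.36, 0.14, 0.2),
  region = c("north", "south", "east", "North", "south")
)

# Assumption: "north" and "North" denote the same region
df$region <- tolower(as.character(df$region))

# xtabs sums weight within each home/region cell; empty cells are 0
tab <- xtabs(weight ~ home + region, data = df)

# Convert the two-way table to a plain data frame with a home column
wide <- as.data.frame.matrix(tab)
wide$home <- rownames(wide)
rownames(wide) <- NULL
wide
```

The region columns come out in alphabetical order (east, north, south), so reorder them if the order matters.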
Answer 1: (score: 0)
There are basically two steps: (1) sum; (2) reshape the result into a two-way table.
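library(dplyr)
df <- data.frame(home = c("A", "B", "C", "A", "C"), weight = c(0.1, 0.25, 0.36, 0.14, 0.2), region = c("north", "south", "east", "North", "south"))
# Normalize case so "north" and "North" fall into the same group
df$region <- Hmisc::capitalize(as.character(df$region))
# Step 1: sum weight within each home/region pair
df_sum <- df %>% group_by(home, region) %>% summarize(weight_sum = sum(weight, na.rm = TRUE))
# Step 2: reshape into a two-way table; empty cells get sum() of nothing, i.e. 0
reshape2::dcast(df_sum, home ~ region, function(V) sum(V, na.rm = TRUE), value.var = "weight_sum")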
The second sum is redundant and not strictly necessary; I only include it to avoid the extra step of converting NA to 0.
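For comparison, here is a rough sketch of the same two steps in base R only, which may help if dplyr and reshape2 are not available. It uses aggregate for the sum and tapply for the two-way table. Here the empty cells do come back as NA, so the explicit replacement step is needed. The names `df_sum` and `wide` are just illustrative.

```r
df <- data.frame(
  home = c("A", "B", "C", "A", "C"),
  weight = c(0.1, 0.25, 0.36, 0.14, 0.2),
  region = c("north", "south", "east", "North", "south")
)

# Assumption: region labels that differ only in case are the same region
df$region <- tolower(as.character(df$region))

# Step 1: sum weight within each home/region pair
df_sum <- aggregate(weight ~ home + region, data = df, FUN = sum)

# Step 2: cross-tabulate; combinations with no data come back as NA
wide <- tapply(df_sum$weight, list(home = df_sum$home, region = df_sum$region), sum)

# The extra step: turn the NA cells into 0
wide[is.na(wide)] <- 0
wide
```

The result is a matrix with home as row names; wrap it in as.data.frame() if a data frame is needed.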
Answer 2: (score: 0)
I would do it like this; h01 is the result you want.
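library(plyr)  # provides ddply() and summarize()
x00 <- data.frame(home = c("A", "B", "C", "A", "C"), weight = c(0.1, 0.25, 0.36, 0.14, 0.2),
                  region = c("north", "South", "East", "North", "South"), stringsAsFactors = FALSE)
# Normalize case so "north" and "North" are treated as the same region
x00$region <- tolower(x00$region)
# Sum weight for each region/home pair
x01 <- ddply(x00, .(region, home), summarize, result = sum(weight))
# Build an all-zero result frame, then fill in the cells that have data
h01 <- data.frame(north = c(0, 0, 0), south = c(0, 0, 0), east = c(0, 0, 0), row.names = c("A", "B", "C"))
for (x in seq_len(nrow(x01))) {
  h01[x01$home[x], x01$region[x]] <- x01$result[x]
}
# Move the row names into a Home column
h01$Home <- row.names(h01)
row.names(h01) <- NULL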
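If the loop feels heavy, the fill step can probably be done in one assignment by indexing a matrix with a two-column character matrix of (row name, column name) pairs. This is only a sketch of that idea in base R: it swaps ddply for aggregate, and the name `h` is mine.

```r
x00 <- data.frame(
  home = c("A", "B", "C", "A", "C"),
  weight = c(0.1, 0.25, 0.36, 0.14, 0.2),
  region = c("north", "South", "East", "North", "South"),
  stringsAsFactors = FALSE
)
x00$region <- tolower(x00$region)

# Sum weight for each home/region pair (base R stand-in for ddply)
x01 <- aggregate(weight ~ home + region, data = x00, FUN = sum)

# All-zero matrix with the wanted row and column labels
h <- matrix(0, nrow = 3, ncol = 3,
            dimnames = list(c("A", "B", "C"), c("north", "south", "east")))

# Each row of the character index matrix names one cell to fill
h[cbind(as.character(x01$home), as.character(x01$region))] <- x01$weight
h
```

The as.character() calls matter: if home or region were factors, cbind would produce integer codes instead of labels and the wrong cells would be filled.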