Say I am considering the proportions of different population groups:
Gender: M = .5, F = .5
Age: Aged = .2, NotAged = .8
Education: "Above High School" = .4, "Below High School" = .6
Now I have a long-format data frame:
a <- data.frame(Variable = c("aged", "NotAged", "Above HS", "Below HS"),
                Male = c(.2, .8, .4, .6),
                Female = c(.2, .8, .4, .6))
Now I want to fill in the % column of the following data frame, for example
Gender | Aged    | Education | %
Male   | NotAged | Below HS  | .24
for all of the combinations in
b <- expand.grid(Gender = c("Male", "Female"),
                 Aged = c("Aged", "NotAged"),
                 Education = c("Above HS", "Below HS"))
I would like to avoid loops if at all possible, since I may have more than 3 grouping criteria.
Thanks
Answer 0: (score: 0)
Something along these lines might be a start...
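library(reshape)
# Stack the Male/Female columns into rows (Variable, variable, value)
a2 <- melt(a)
names(a2)[2] <- "Gender"
# Copy Variable into Aged and blank out the rows that are not age levels
a2$Aged <- a2$Variable
a2$Aged[!a2$Aged %in% c("aged", "NotAged")] <- NA
# Same for Education
a2$Education <- a2$Variable
a2$Education[!a2$Education %in% c("Above HS", "Below HS")] <- NA
# Drop the original column and put the rest in the target order
a2$Variable <- NULL
a2 <- a2[, c("Gender", "Aged", "Education", "value")]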
Result
> a2
  Gender    Aged Education value
1   Male    aged      <NA>   0.2
2   Male NotAged      <NA>   0.8
3   Male    <NA>  Above HS   0.4
4   Male    <NA>  Below HS   0.6
5 Female    aged      <NA>   0.2
6 Female NotAged      <NA>   0.8
7 Female    <NA>  Above HS   0.4
8 Female    <NA>  Below HS   0.6
But for the rest, I'm not sure which way you want to go.
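One possible way to carry on from here, as a rough sketch in base R and not necessarily what the asker has in mind: split the long table into one lookup table per criterion, then let merge() pair them up within each gender. The names a_long, age_tbl, edu_tbl and res are made up for illustration, and "Aged" is capitalised here (an assumption) so that it matches the expand.grid call in the question.

```r
# Marginal proportions from the question, one column per gender
a <- data.frame(Variable = c("Aged", "NotAged", "Above HS", "Below HS"),
                Male   = c(.2, .8, .4, .6),
                Female = c(.2, .8, .4, .6),
                stringsAsFactors = FALSE)

# Wide to long with base R only: one row per (Variable, Gender) pair
a_long <- data.frame(Variable = rep(a$Variable, times = 2),
                     Gender   = rep(c("Male", "Female"), each = nrow(a)),
                     value    = c(a$Male, a$Female),
                     stringsAsFactors = FALSE)

# One lookup table per grouping criterion
is_age  <- a_long$Variable %in% c("Aged", "NotAged")
age_tbl <- setNames(a_long[is_age, ],  c("Aged", "Gender", "Aged_Perc"))
edu_tbl <- setNames(a_long[!is_age, ], c("Education", "Gender", "Educ_Perc"))

# Merging on Gender pairs every age row with every education row
# of the same gender, so no explicit loop is needed
res <- merge(age_tbl, edu_tbl, by = "Gender")
res$Perc <- res$Aged_Perc * res$Educ_Perc
```

Note that Perc here is the share within each gender (it sums to 1 per gender). To get the .24 from the question's example row, it would still have to be multiplied by the gender share of .5.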
Answer 1: (score: 0)
The most concise solution I can come up with so far uses dplyr::left_join (or base::merge):
library(reshape2)
library(dplyr)

a <- data.frame(Variable = c("Aged", "NotAged", "Above HS", "Below HS"),
                Male = c(.2, .8, .4, .6),
                Female = c(.2, .8, .4, .6))

# Create a full list of all combinations
FullList <- expand.grid(Gender = c("Male", "Female"),
                        Aged = c("Aged", "NotAged"),
                        Education = c("Above HS", "Below HS"))

# Reshape a to long format and split it into two lookup tables
a_long <- a %>% melt(id.vars = "Variable", variable.name = "Gender")
tbl_Aged <- a_long %>%
  filter(Variable %in% c("Aged", "NotAged")) %>%
  rename(Aged = Variable)
tbl_Education <- a_long %>%
  filter(Variable %in% c("Above HS", "Below HS")) %>%
  rename(Education = Variable)

Results <- FullList %>%
  left_join(tbl_Aged, by = c("Aged", "Gender")) %>%            # map Aged
  rename(Aged_Perc = value) %>%
  left_join(tbl_Education, by = c("Education", "Gender")) %>%  # map Education
  rename(Educ_Perc = value) %>%
  mutate(Perc = Aged_Perc * Educ_Perc)

# Check: the percentages should sum to 1 within each gender
Results %>% group_by(Gender) %>% summarise(sum(Perc))
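Two things are left open above: Perc is the share within each gender, while the example row in the question (.24 = .5 * .8 * .6) also includes the gender share, and each extra criterion needs another join. A possible base R sketch that handles both, assuming the criteria are independent of each other given gender: keep one gender-by-level matrix per criterion in a list and multiply the looked-up values. The names gender, cond, grid and probs are made up for illustration.

```r
# Gender shares, and for every other criterion a matrix of
# per-gender shares (rows = gender, columns = levels)
gender <- c(Male = .5, Female = .5)
cond <- list(
  Aged      = rbind(Male   = c(Aged = .2, NotAged = .8),
                    Female = c(Aged = .2, NotAged = .8)),
  Education = rbind(Male   = c("Above HS" = .4, "Below HS" = .6),
                    Female = c("Above HS" = .4, "Below HS" = .6))
)

# All combinations of gender and the levels of every criterion
grid <- expand.grid(c(list(Gender = names(gender)), lapply(cond, colnames)),
                    stringsAsFactors = FALSE)

# For each criterion, look up the share for every row of the grid by
# indexing the matrix with a two-column (row name, column name) matrix
probs <- lapply(names(cond), function(v) cond[[v]][cbind(grid$Gender, grid[[v]])])

# Joint share = gender share times the product over all criteria
grid$Perc <- unname(gender[grid$Gender]) * Reduce(`*`, probs)
```

Adding a fourth or fifth criterion then only means adding another matrix to cond; the rest stays the same. If the criteria are not independent within a gender, a simple product like this will not give the right joint shares.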