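I am trying to create tables from survey data, but the solution I have come up with is unmanageable for the number of tables I need to create.
I ran a survey covering different populations, political parties, and their opinions on certain issues. Below are sample data and my (almost) working, cumbersome solution. The result I am looking for is in the "ideal.table" data.frame (shown below).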
pop <- c("elite", "elite", "public", "public", "public", "public")
party <- c("D", "R", "R", "D", "D", "R")
opinion <- c("pro", "con", "pro", "con", "pro", "pro")
df <- data.frame(pop, party, opinion)
party.table <- prop.table(table(df[df$pop=="public",][["party"]], df[df$pop=="public",][["opinion"]]),2)
elite.table <- prop.table(table(df[df$pop=="elite",][["opinion"]]))
public.table <- prop.table(table(df[df$pop=="public",][["opinion"]]))
group <- c("R", "D", "elite", "public")
percent.pro <- c(0.3, 0.6, 0.5, 0.75)
percent.con <- c(0.7, 0.4, 0.5, 0.25)
ideal.table <- data.frame(group, percent.pro, percent.con)
library(dplyr)
library(tidyr)
# create data frames from tables
x = data.frame(elite.table)
names(x) = c("elite","value")
y = data.frame(party.table) %>% spread(Var2,Freq)
names(y)[1] = "group"
z = data.frame(public.table)
names(z)[1] = "group"
# join data frames
x %>% inner_join(y, by="group") %>% inner_join(z, by="group")
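I have not gotten this to work yet, and even if I find a solution for this particular dataset, I will sometimes need to combine multiple two-dimensional tables rather than only the groups shown here. Is there a better way to get cross-tab proportions for different subsets of the data?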
   group percent.pro percent.con
1      R        0.30        0.70
2      D        0.60        0.40
3  elite        0.50        0.50
4 public        0.75        0.25
Thanks for your help!
Answer 0 (score: 1)
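library(dplyr)
library(tidyr)
# Stack pop and party into one key/value pair so every group gets its own row
df %>%
  gather(variable, group, -opinion) %>%
  group_by(variable, group) %>%
  # Share of "pro" answers within each group
  summarize(percent.pro = sum(opinion == "pro") / n()) %>%
  # With only two opinions, the "con" share is the complement
  mutate(percent.con = 1 - percent.pro)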
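The same reshaping idea can probably be extended to questions with more than two answer categories. The following is only a sketch, assuming tidyr 1.1.0 or later (where `pivot_longer` and `pivot_wider` supersede `gather` and `spread`); the names `variable`, `share`, and `result` are illustrative and not part of the original question. Like the answer above, it computes the party shares over all respondents; to match the question's `party.table`, which uses only the public, one would filter on `pop == "public"` before summarizing the party rows.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(
  pop     = c("elite", "elite", "public", "public", "public", "public"),
  party   = c("D", "R", "R", "D", "D", "R"),
  opinion = c("pro", "con", "pro", "con", "pro", "pro"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # One row per (grouping variable, group, opinion) observation
  pivot_longer(cols = c(pop, party), names_to = "variable", values_to = "group") %>%
  # Count answers per group and opinion
  count(variable, group, opinion) %>%
  # Convert counts to within-group shares
  group_by(variable, group) %>%
  mutate(share = n / sum(n)) %>%
  ungroup() %>%
  select(-n) %>%
  # One column per opinion level; groups with no such answers get 0
  pivot_wider(names_from = opinion, values_from = share,
              names_prefix = "percent.", values_fill = 0)

print(result)
```

Because the opinion levels are spread into columns automatically, adding a third answer such as "neutral" should yield a `percent.neutral` column without further changes to the code.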