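I am trying to analyze multiple-response questions from a weighted survey dataset. I like the srvyr package because it lets me use dplyr pipes, but I cannot find any reference on how to handle multiple-response questions.
I have a simple dataset that looks at different sources of income. Here is an example of what the data look like: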
ID <- c(1,2,3,4,5,6,7,8,9,10)
rent_income <- c("Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No")
salary_income <- c( "No", "Yes", "No", "Yes", "No", "Yes", "Yes", "No", "Yes", "No")
other_income <- c( "No", "Yes", "No", "No", "No", "No", "Yes", "No", "No", "No")
survey_weights <- c(0.6, 1.2 , 1.1 , 0.7 , 2.4 , 1.1 , 0.3 , 0.6 , 1.0 , 0.8)
df<-data.frame(ID, rent_income, salary_income, other_income, survey_weights)
Please note that the data are entirely made up. With srvyr, I first have to create a survey object:
library(srvyr)
weighted_dataset <- df %>% as_survey_design(ids = ID, weights = survey_weights)
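Now I want to calculate the weighted percentage of the sample that has each type of income. Any ideas on how to do this? Stata has a function called mr_tab, but I cannot find anything similar in R.
Answer 0 (score: 0)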
See the "proportions by group" section of https://cran.r-project.org/web/packages/srvyr/vignettes/srvyr-vs-survey.html
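As a rough sketch of the grouped-proportion pattern, applied to the data from the question: group by one income variable and call survey_mean() with no arguments. The names by_rent and proportion are only illustrative, and this assumes the dplyr and srvyr packages are installed.

```r
library(dplyr)
library(srvyr)

# Rebuild the made-up data from the question so the example stands alone
df <- data.frame(
  ID = 1:10,
  rent_income = c("Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"),
  survey_weights = c(0.6, 1.2, 1.1, 0.7, 2.4, 1.1, 0.3, 0.6, 1.0, 0.8)
)

# Weighted share of respondents in each level of rent_income
by_rent <- df %>%
  as_survey_design(ids = ID, weights = survey_weights) %>%
  group_by(rent_income) %>%
  summarize(proportion = survey_mean())

by_rent
```

This handles one variable at a time, so it would have to be repeated for salary_income and other_income.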
Answer 1 (score: 0)
You can use group_by() together with the convenient variable selection syntax provided by the dplyr and srvyr R packages.
weighted_dataset %>%
# Organize the data into groups defined by each combination of the income variables
group_by_at(vars(ends_with("_income"))) %>%
# With no arguments, survey_mean() estimates the weighted proportion of each level of the
# last grouping variable within the groups formed by the preceding ones
summarize(Percent = survey_mean())
> # A tibble: 6 x 5
> rent_income salary_income other_income Percent Percent_se
> <fct> <fct> <fct> <dbl> <dbl>
> 1 No No No 1 0
> 2 No Yes No 0.769 0.265
> 3 No Yes Yes 0.231 0.265
> 4 Yes No No 1 0
> 5 Yes Yes No 0.6 0.312
> 6 Yes Yes Yes 0.40 0.312
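The table above gives proportions for combinations of answers. If what is wanted is instead one weighted share per income type, closer to a multiple-response table, one possible approach is to turn each Yes/No column into a 0/1 indicator and take its survey mean. This is only a sketch: the names rent_yes, salary_yes, other_yes and income_shares are made up for illustration, and it is not claimed to reproduce Stata's mr_tab.

```r
library(dplyr)
library(srvyr)

# Rebuild the made-up data from the question so the example stands alone
df <- data.frame(
  ID = 1:10,
  rent_income = c("Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"),
  salary_income = c("No", "Yes", "No", "Yes", "No", "Yes", "Yes", "No", "Yes", "No"),
  other_income = c("No", "Yes", "No", "No", "No", "No", "Yes", "No", "No", "No"),
  survey_weights = c(0.6, 1.2, 1.1, 0.7, 2.4, 1.1, 0.3, 0.6, 1.0, 0.8)
)

income_shares <- df %>%
  # 0/1 indicators: the weighted mean of an indicator is the weighted share of "Yes"
  mutate(
    rent_yes = as.numeric(rent_income == "Yes"),
    salary_yes = as.numeric(salary_income == "Yes"),
    other_yes = as.numeric(other_income == "Yes")
  ) %>%
  as_survey_design(ids = ID, weights = survey_weights) %>%
  summarize(
    rent = survey_mean(rent_yes),
    salary = survey_mean(salary_yes),
    other = survey_mean(other_yes)
  )

income_shares
```

Each estimate comes with a matching _se column. Because respondents can report several income types, the three shares are not expected to sum to 1.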