How can I efficiently compute weighted proportions and confidence intervals?

Time: 2020-03-03 03:50:37

Tags: r

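Is there an efficient way to compute weighted proportions, together with 95% CIs, for several variables at once, grouped by other variables? Here is my sample data: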

      sex          hpvvac  wtmec4yr ohpv06 ohpv11 ohpv16 ohpv18 ohpv26
1    Male            <NA> 67814.750      0      0      0      0      0
2    Male No HPV vaccined 12641.213      0      0      0      0      0
3  Female No HPV vaccined 51039.316      0      0      0      0      0
4    Male    HPV vaccined 19676.654      0      0      0      0      0
5  Female No HPV vaccined 11778.582      0      0      0      0      0
6    Male No HPV vaccined  9124.663      0      0      0      0      0
7    Male No HPV vaccined 10034.331      0      0      1      1      0
8    Male No HPV vaccined 17836.982      0      0      1      0      0
9    Male No HPV vaccined 48500.992      0      0      0      0      0
10 Female No HPV vaccined 19340.266      0      0      0      0      0
mydt <- structure(list(sex = structure(c(1L, 1L, 2L, 1L, 2L, 1L, 1L, 
1L, 1L, 2L), .Label = c("Male", "Female"), class = "factor"), 
    hpvvac = structure(c(NA, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 
    1L), .Label = c("No HPV vaccined", "HPV vaccined"), class = "factor"), 
    wtmec4yr = c(67814.75, 12641.212890625, 51039.31640625, 19676.654296875, 
    11778.58203125, 9124.6630859375, 10034.3310546875, 17836.982421875, 
    48500.9921875, 19340.265625), ohpv06 = c(0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0), ohpv11 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), ohpv16 = c(0, 
    0, 0, 0, 0, 0, 1, 1, 0, 0), ohpv18 = c(0, 0, 0, 0, 0, 0, 
    1, 0, 0, 0), ohpv26 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 
10L), class = "data.frame")

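wtmec4yr is the sampling weight of each observation. The ohpv* columns are binary variables. For each ohpv* variable, I want to compute the proportion of observations with value 1 within each hpvvac by sex group, along with the 95% CI. I tried doing this by hand, but I suspect my approach is incorrect, and it is not efficient. I also tried the survey package, but I could only compute one variable at a time: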

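library(survey)
# Design with no clustering (ids = ~1) and wtmec4yr as the sampling weight
d.s <- svydesign(ids = ~1, data = mydt, weights = ~wtmec4yr)
# Weighted proportion of ohpv06 == 1 within each hpvvac x sex group
a <- svyby(~ohpv06, ~hpvvac + sex, d.s, svymean, na.rm = FALSE)
ftable(a)
confint(a)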

Thanks!

1 answer:

Answer 0: (score: 1)

You can put all of the variables into a single formula:

library(survey)
# Every ohpv* variable named in the first formula is estimated in the same call
svyby(~ohpv06 + ohpv11 + ohpv16 + ohpv18 + ohpv26, ~hpvvac + sex, d.s, svymean, na.rm = FALSE)

Or, if all of the columns you need start with ohpv:

# Build "~ ohpv06 + ohpv11 + ..." from the column names that start with "ohpv"
FORM <- paste("~", paste(grep("^ohpv", colnames(mydt), value = TRUE), collapse = " + "))
svyby(as.formula(FORM), ~hpvvac + sex, d.s, svymean, na.rm = FALSE)
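
As a side note, the same formula can probably be built without pasting strings, using base R's `reformulate()`. The sketch below is only an illustration: the `cols` vector is a hypothetical stand-in for `colnames(mydt)`, and `vars` and `form` are names introduced here, not part of the original answer.

```r
# Hypothetical stand-in for colnames(mydt), mirroring the question's columns
cols <- c("sex", "hpvvac", "wtmec4yr",
          "ohpv06", "ohpv11", "ohpv16", "ohpv18", "ohpv26")

# Keep only the names that begin with "ohpv" (^ anchors the match at the start)
vars <- grep("^ohpv", cols, value = TRUE)

# reformulate() turns a character vector of term labels into a one-sided formula
form <- reformulate(vars)
print(form)
```

The resulting `form` can be passed to `svyby()` in place of `as.formula(FORM)`.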

Both calls give you the following:

                                hpvvac    sex ohpv06 ohpv11
No HPV vaccined.Male   No HPV vaccined   Male      0      0
HPV vaccined.Male         HPV vaccined   Male      0      0
No HPV vaccined.Female No HPV vaccined Female      0      0
                          ohpv16   ohpv18 ohpv26 se.ohpv06 se.ohpv11
No HPV vaccined.Male   0.2840007 0.102247      0         0         0
HPV vaccined.Male      0.0000000 0.000000      0         0         0
No HPV vaccined.Female 0.0000000 0.000000      0         0         0
                       se.ohpv16 se.ohpv18 se.ohpv26
No HPV vaccined.Male   0.2211842  0.113473         0
HPV vaccined.Male      0.0000000  0.000000         0
No HPV vaccined.Female 0.0000000  0.000000         0
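
The output above contains estimates and standard errors but not the 95% CIs the question asks for. One way to get them, sketched here under the assumption that Wald-type intervals are acceptable, is to call `confint()` on the `svyby` result, as the question already does for a single variable. The data frame below copies the question's rows but keeps only two of the ohpv* columns for brevity; the names `est`, `ci`, `cell`, and `p_hand` are introduced here for illustration.

```r
library(survey)

# Data copied from the question (only ohpv16 and ohpv18 kept, for brevity)
mydt <- data.frame(
  sex = factor(c("Male", "Male", "Female", "Male", "Female",
                 "Male", "Male", "Male", "Male", "Female"),
               levels = c("Male", "Female")),
  hpvvac = factor(c(NA, "No HPV vaccined", "No HPV vaccined", "HPV vaccined",
                    "No HPV vaccined", "No HPV vaccined", "No HPV vaccined",
                    "No HPV vaccined", "No HPV vaccined", "No HPV vaccined"),
                  levels = c("No HPV vaccined", "HPV vaccined")),
  wtmec4yr = c(67814.750, 12641.213, 51039.316, 19676.654, 11778.582,
               9124.663, 10034.331, 17836.982, 48500.992, 19340.266),
  ohpv16 = c(0, 0, 0, 0, 0, 0, 1, 1, 0, 0),
  ohpv18 = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0)
)

# Same design as in the question: no clustering, wtmec4yr as the weight
d.s <- svydesign(ids = ~1, data = mydt, weights = ~wtmec4yr)

# Weighted proportions and standard errors for both variables in each group
est <- svyby(~ohpv16 + ohpv18, ~hpvvac + sex, d.s, svymean, na.rm = FALSE)

# Wald-type 95% intervals (estimate +/- 1.96 * SE), one row per group/variable pair
ci <- confint(est, level = 0.95)
print(ci)

# Hand check of one cell: the weighted proportion is sum(w * x) / sum(w)
cell <- subset(mydt, hpvvac == "No HPV vaccined" & sex == "Male")
p_hand <- weighted.mean(cell$ohpv16, cell$wtmec4yr)
```

Two caveats apply to this sketch. Groups whose standard error is 0 get a degenerate interval, and with this few rows a Wald interval can extend outside [0, 1]: for ohpv16 among unvaccinated males, 0.284 - 1.96 × 0.221 is below zero. The survey package also provides `svyciprop()`, which offers interval methods designed for proportions and may be worth checking for real data.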