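I am working with survey data and need to produce some tables and regression analyses. After attaching the data, this is the code I use for a four-variable table: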
ftable(var1,var2,var3,var4)
This is the regression code I use on the data:
logit.1 <- glm(var4 ~ var3 + var2 + var1, family = binomial(link = "logit"))
summary(logit.1)
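So far the unweighted analyses work well. But how can I run the same analyses on weighted data? Here is some additional information: the dataset contains four variables that reflect the sampling structure. These are
strat: stratum (urban or (sub-county) rural).
clust: batch of interviews that were part of the same random walk
vill_neigh_code: village or neighbourhood code
sweight: weight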
Answer 0 (score: 1)
library(survey)
data(api)
# example data set
head( apiclus2 )
# instead of var1 - var4, use these four variables:
ftable( apiclus2[ , c( 'sch.wide' , 'comp.imp' , 'both' , 'awards' ) ] )
# move it over to x for faster typing
x <- apiclus2
# also give x a column of all ones
x$one <- 1
# run the glm() function specified.
logit.1 <-
    glm(
        comp.imp ~ target + cnum + growth ,
        data = x ,
        family = binomial( link = 'logit' )
    )
summary( logit.1 )
# now create the survey object you've described
dclus <-
    svydesign(
        id = ~dnum + snum , # cluster variable(s)
        strata = ~stype , # stratum variable
        weights = ~pw , # weight variable
        data = x ,
        nest = TRUE
    )
# weighted counts
svyby(
    ~one ,
    ~ sch.wide + comp.imp + both + awards ,
    dclus ,
    svytotal
)
# weighted counts formatted differently
ftable(
    svyby(
        ~one ,
        ~ sch.wide + comp.imp + both + awards ,
        dclus ,
        svytotal ,
        keep.var = FALSE
    )
)
# run the svyglm() function specified.
logit.2 <-
    svyglm(
        comp.imp ~ target + cnum + growth ,
        design = dclus ,
        family = binomial( link = 'logit' )
    )
summary( logit.2 )
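
To connect the example above back to the variables in the question, here is a minimal sketch of how the same steps might look with strat, clust, vill_neigh_code and sweight. Several things here are assumptions rather than facts from the question: the data frame name `mydata` and its simulated contents are hypothetical stand-ins, and the sketch assumes villages are the primary sampling units with random-walk clusters sampled inside them. If clust is actually the first-stage unit, the id formula would need to change accordingly. It also uses svytable() for the weighted table and quasibinomial() in place of binomial(), which the survey package documentation suggests to avoid a warning about non-integer numbers of successes.

```r
library(survey)

# Simulated stand-in for the real data; `mydata` and every value in it are hypothetical.
set.seed(1)
n <- 400
mydata <- data.frame(
    strat           = rep(c("urban", "rural"), each = n / 2),
    vill_neigh_code = rep(1:20, each = n / 20),  # 10 villages per stratum
    clust           = rep(1:40, each = n / 40),  # 2 random walks per village
    sweight         = runif(n, 0.5, 3),
    var1            = rbinom(n, 1, 0.5),
    var2            = rbinom(n, 1, 0.4),
    var3            = rbinom(n, 1, 0.6),
    var4            = rbinom(n, 1, 0.3)
)

# Assumption: villages are the PSUs, random-walk clusters are nested within them.
des <-
    svydesign(
        id      = ~vill_neigh_code + clust ,
        strata  = ~strat ,
        weights = ~sweight ,
        data    = mydata ,
        nest    = TRUE
    )

# weighted four-way table, the survey counterpart of ftable(var1, var2, var3, var4)
wtab <- svytable( ~var1 + var2 + var3 + var4 , design = des )
ftable( wtab )

# weighted logistic regression, the survey counterpart of the glm() call
logit.w <-
    svyglm(
        var4 ~ var3 + var2 + var1 ,
        design = des ,
        family = quasibinomial( link = 'logit' )
    )
summary( logit.w )
```

With real data, the data.frame() block would be dropped and `mydata` replaced by the actual data frame. Passing data to svydesign() explicitly also removes the need for attach().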