I have a data frame that looks like this:
'data.frame': 1090 obs. of 8 variables:
$ id : chr "INC000000209241" "INC000000218488" "INC000000218982" "INC000000225646" ...
$ service.type : chr "Incident" "Incident" "Incident" "Incident" ...
$ priority : chr "Critical" "Critical" "Critical" "Critical" ...
I sort the data as follows:
data <- data[order(data$priority),]
I have tried converting priority to a factor and similar changes, but whatever I try, when I run the following:
s = strata(data,c("priority"),size=c(0,0,1,5))
I always get the following error:
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 0, 1
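I tried debugging the function to see whether I could tell why this error occurs, but I could not make sense of the code. The error is raised at this point in the execution of strata():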
debug: r = cbind(r, i)
Any help would be much appreciated!
Answer 0 (score: 5)
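The problem is that you are trying to set the sample size of some groups to zero. Instead, subset the original data before sampling, so that only the groups you actually want to sample from remain.
Here, we reproduce your problem.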
library(sampling)
data(swissmunicipalities)
length(table(swissmunicipalities$REG)) # We have seven strata
# [1] 7
# Let's take two from each group
strata(swissmunicipalities,
stratanames = c("REG"),
size = rep(2, 7),
method="srswor")
# REG ID_unit Prob Stratum
# 93 4 93 0.011695906 1
# 145 4 145 0.011695906 1
# 2574 1 2574 0.003395586 2
# 2631 1 2631 0.003395586 2
# 826 3 826 0.006230530 3
# 1614 3 1614 0.006230530 3
# 583 2 583 0.002190581 4
# 1017 2 1017 0.002190581 4
# 1297 5 1297 0.004246285 5
# 2535 5 2535 0.004246285 5
# 342 6 342 0.010752688 6
# 347 6 347 0.010752688 6
# 651 7 651 0.008163265 7
# 2471 7 2471 0.008163265 7
# Let's try to drop the first two groups. Oops...
strata(swissmunicipalities,
stratanames = c("REG"),
size = c(0, 0, 2, 2, 2, 2, 2),
method="srswor")
# Error in data.frame(..., check.names = FALSE) :
# arguments imply differing number of rows: 0, 1
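The `cbind(r, i)` line from the debugger output in the question hints at a likely mechanism. The sketch below is base R only and does not call strata(); it assumes that, for a stratum of size zero, `r` ends up as a data frame with no rows, and the column name `ID_unit` is used purely for illustration. Binding a single stratum number onto such an object raises the same message.

```r
# Hypothetical stand-in for the rows drawn from a stratum of size 0:
# a data frame with a column but no rows (an assumption, not strata() itself).
r <- data.frame(ID_unit = integer(0))

# cbind() on a data frame goes through data.frame(..., check.names = FALSE),
# which cannot recycle a length-one value across zero rows.
msg <- tryCatch(
  {
    cbind(r, 1)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If that assumption holds, no choice of factor levels or ordering can help, because the failure comes from the zero itself rather than from how `priority` is coded.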
Let's try again with a subset.
swiss2 <- swissmunicipalities[!swissmunicipalities$REG %in% c(1, 2), ]
table(swiss2$REG)
strata(swiss2,
stratanames = c("REG"),
size = c(2, 2, 2, 2, 2),
method="srswor")
# REG ID_unit Prob Stratum
# 58 4 58 0.011695906 1
# 115 4 115 0.011695906 1
# 432 3 432 0.006230530 2
# 986 3 986 0.006230530 2
# 1007 5 1007 0.004246285 3
# 1150 5 1150 0.004246285 3
# 190 6 190 0.010752688 4
# 497 6 497 0.010752688 4
# 1049 7 1049 0.008163265 5
# 1327 7 1327 0.008163265 5
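Applied to the data in the question, the same idea might look like the sketch below. The `tickets` data frame and the priority levels other than "Critical" are made up for illustration, since the question does not list them. Note that in the output above Stratum 1 is REG 4, the first region to appear in the data, so the sketch assumes `size` is matched to strata in their order of appearance and builds the vector that way.

```r
library(sampling)

set.seed(1)

# Hypothetical stand-in for the asker's data; only "Critical" is known
# from the question, the other priority levels are assumptions.
tickets <- data.frame(
  id = sprintf("INC%05d", 1:40),
  priority = rep(c("Critical", "High", "Low", "Medium"), each = 10),
  stringsAsFactors = FALSE
)
tickets <- tickets[order(tickets$priority), ]

# Desired sample size per priority, including the zeros.
wanted <- c(Critical = 0, High = 0, Low = 1, Medium = 5)

# Drop every stratum whose requested size is zero before sampling.
keep <- names(wanted)[wanted > 0]
sub <- tickets[tickets$priority %in% keep, ]

# Sizes in the order in which the remaining strata appear in the data.
sizes <- unname(wanted[unique(sub$priority)])

s <- strata(sub, stratanames = "priority", size = sizes, method = "srswor")

# ID_unit holds row positions within `sub`, so it can index the sampled rows.
sampled <- sub[s$ID_unit, ]
table(s$priority)
```

Keeping the requested sizes in a named vector means the zeros stay visible in the code while never reaching strata().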