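I have a relatively large dataset imported from SPSS, and I want to loop over it to extract the variable labels and create a frequency table for each column. After importing the dataset with the haven
package, the variable labels are stored in attributes(dat$col)$label.
For example, if I run: attributes(dat$Geschlecht)$label
I get the variable label: "5.1 Geschlecht"
To do this in a for loop, I tried the following: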
col <- names(dat)
for(i in col) {
attributes(dat$i)$label
table(dat$i, useNA = "always")
}
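But it produces empty output, apparently because this is not the right way to refer to a column when I want to extract an element of the structure. I would appreciate any suggestions on how to fix it. Thanks in advance!
Data sample: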
dat <- structure(list(Geschlecht = structure(c(2, 1, 1, 1, 2, 1, 2,
1, 1, 2, 1), label = "5.1 Geschlecht", format.spss = "F11.0", labels = c(Männlich = 1,
Weiblich = 2, Divers = 3), class = c("haven_labelled", "vctrs_vctr",
"double")), Alter = structure(c(38, 50, 58, 22, 22, 68, 63, 53,
60, 30, 19), label = "5.2 Alter", format.spss = "F11.0", labels = c(`<18` = 17,
`18` = 18, `19` = 19, `20` = 20, `21` = 21, `22` = 22, `23` = 23,
`24` = 24, `25` = 25, `26` = 26, `27` = 27, `28` = 28, `29` = 29,
`30` = 30, `31` = 31, `32` = 32, `33` = 33, `34` = 34, `35` = 35,
`36` = 36, `37` = 37, `38` = 38, `39` = 39, `40` = 40, `41` = 41,
`42` = 42, `43` = 43, `44` = 44, `45` = 45, `46` = 46, `47` = 47,
`48` = 48, `49` = 49, `50` = 50, `51` = 51, `52` = 52, `53` = 53,
`54` = 54, `55` = 55, `56` = 56, `57` = 57, `58` = 58, `59` = 59,
`60` = 60, `61` = 61, `62` = 62, `63` = 63, `64` = 64, `65` = 65,
`66` = 66, `67` = 67, `68` = 68, `69` = 69, `70` = 70, `71` = 71,
`72` = 72, `73` = 73, `74` = 74, `75` = 75, `76` = 76, `77` = 77,
`78` = 78, `79` = 79, `80` = 80, `81` = 81, `82` = 82, `83` = 83,
`84` = 84, `85` = 85, `86` = 86, `87` = 87, `88` = 88, `89` = 89,
`90` = 90, `91` = 91, `92` = 92, `93` = 93, `94` = 94, `95` = 95,
`96` = 96, `97` = 97, `98` = 98, `99` = 99, `100` = 100, 101), class = c("haven_labelled",
"vctrs_vctr", "double"))), row.names = c(NA, -11L), class = c("tbl_df",
"tbl", "data.frame"))
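Answer 0 (score: 0)
I found a solution using the labelled
package. Two things were wrong in the original loop: dat$i looks for a column literally named "i" instead of using the value of i, and nothing is auto-printed inside a for loop, so the results have to be passed to print().
library(labelled)  # provides var_label()
col <- names(dat)
for(i in col) {
# dat[i] selects the column whose name is stored in i
print(var_label(dat[i]))
print(table(dat[i], useNA = "always"))
}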
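
If you would rather not depend on an extra package, the same idea can be sketched in base R. This is only a sketch under some assumptions: the small `dat` below is a stand-in built with plain numeric columns plus a "label" attribute (not a real haven import), and the names `var_labels` and `freq_tables` are made up for illustration. It uses dat[[i]] to pull out each column by name and attr() to read the label, and it stores the results in lists so they can be reused after the loop.

```r
# Stand-in for the imported data: plain numeric columns carrying a "label"
# attribute, which is where the question says haven keeps the variable label.
dat <- data.frame(
  Geschlecht = c(2, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1),
  Alter = c(38, 50, 58, 22, 22, 68, 63, 53, 60, 30, 19)
)
attr(dat$Geschlecht, "label") <- "5.1 Geschlecht"
attr(dat$Alter, "label") <- "5.2 Alter"

# Hypothetical containers for the results, keyed by column name.
var_labels <- list()
freq_tables <- list()

for (i in names(dat)) {
  # dat[[i]] uses the string stored in i; dat$i would look for a column
  # literally called "i".
  var_labels[[i]] <- attr(dat[[i]], "label", exact = TRUE)
  freq_tables[[i]] <- table(dat[[i]], useNA = "always")

  # Nothing is auto-printed inside a loop, so print explicitly.
  print(var_labels[[i]])
  print(freq_tables[[i]])
}
```

Keeping the results in named lists means that something like freq_tables$Alter is still available afterwards, which may be handy with a large dataset where printing everything to the console is hard to read.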