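I am working with survey data that contains value labels. The haven package allows data to be imported with value-label attributes. Sometimes these value labels need to be edited in routine ways.
The example I give here is very simple, but I am looking for a solution that can be applied to similar problems across large data.frames.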
d <- dput(structure(list(var1 = structure(c(1, 2, NA, NA, 3, NA, 1, 1), labels = structure(c(1,
2, 3, 8, 9), .Names = c("Protection of environment should be given priority",
"Economic growth should be given priority", "[DON'T READ] Both equally",
"[DON'T READ] Don't Know", "[DON'T READ] Refused")), class = "labelled")), .Names = "var1", row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame")))
d$var1
<Labelled double>
[1] 1 2 NA NA 3 NA 1 1
Labels:
value label
1 Protection of environment should be given priority
2 Economic growth should be given priority
3 [DON'T READ] Both equally
8 [DON'T READ] Don't Know
9 [DON'T READ] Refused
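If a value label begins with "[DON'T READ]", I want to remove "[DON'T READ]" from the beginning of the label and add "(VOL)" at the end. So "[DON'T READ] Both equally" would now read "Both equally (VOL)".
Of course, editing this single variable is straightforward with functions from the labelled package, which is associated with haven. But I want to apply the solution across all variables in a data.frame.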
library(labelled)
val_labels(d$var1) <- c("Protection of environment should be given priority" = 1,
"Economic growth should be given priority" = 2,
"Both equally (VOL)" = 3,
"Don't Know (VOL)" = 8,
"Refused (VOL)" = 9)
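How can I achieve the result of the function above in a way that can be applied to every variable in the data.frame?
The solution must work regardless of the specific values. (In this case, values 3, 8 and 9 need to change, but that will not necessarily be the case.)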
Answer 0: (score: 1)
There are several ways to do this. You can use lapply(), or (if you want a one-liner) any of the scoped variants of mutate():
lapply()
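This approach loops over all columns with lapply() and uses gsub() on the label names to remove the unwanted prefix and append " (VOL)" to the end of the string. Of course, you can also apply it to a subset of columns!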
d[] <- lapply(d, function(x) {
labels <- attributes(x)$labels
names(labels) <- gsub("\\[DON'T READ\\]\\s*(.*)", "\\1 (VOL)", names(labels))
attributes(x)$labels <- labels
x
})
d$var1
[1] 1 2 NA NA 3 NA 1 1
attr(,"labels")
Protection of environment should be given priority Economic growth should be given priority
1 2
Both equally (VOL) Don't Know (VOL)
3 8
Refused (VOL)
9
attr(,"class")
[1] "labelled"
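To see what the regular expression does on its own, here is a minimal sketch on a plain character vector. The vector `labs` is made up for illustration; the pattern and replacement are the ones used in the lapply() call above. Labels that do not contain "[DON'T READ]" come back unchanged, which is why the specific values (3, 8, 9) never have to be named.

```r
# Illustrative label names (not taken from a real data set).
labs <- c("Protection of environment should be given priority",
          "[DON'T READ] Both equally",
          "[DON'T READ] Don't Know")

# "\\[DON'T READ\\]" matches the literal prefix, "\\s*" eats the following
# whitespace, and "(.*)" captures the rest so that "\\1" can put it back
# in front of " (VOL)".
cleaned <- gsub("\\[DON'T READ\\]\\s*(.*)", "\\1 (VOL)", labs)
print(cleaned)
```

Note that this pattern is not anchored, so it would also rewrite a label that has "[DON'T READ]" somewhere in the middle. If only labels that start with the marker should change, prefixing the pattern with "^" is one way to restrict it.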
mutate_all()
Using the same logic (and producing the same result), you can rename the labels in a more tidyverse-style way:
library(dplyr) # mutate_all() and %>%
library(purrr) # map()
d %>%
mutate_all(~{names(attributes(.)$labels) <- gsub("\\[DON'T READ\\]\\s*(.*)", "\\1 (VOL)", names(attributes(.)$labels));.}) %>%
map(attributes) # just to check on the result
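Real survey files usually mix labelled and unlabelled columns, and sometimes only some columns should be touched. The following is a minimal base R sketch of the same idea wrapped in a reusable function. The helper name `relabel_vol`, the toy data frame `survey`, and its column names are all hypothetical and not part of the answer above; the helper also anchors the pattern with "^" on the assumption that only labels that begin with "[DON'T READ]" should change.

```r
# Hypothetical helper: move a leading "[DON'T READ]" to a trailing " (VOL)".
relabel_vol <- function(x) {
  labels <- attr(x, "labels", exact = TRUE)
  if (is.null(labels)) return(x)  # column has no value labels: leave as is
  names(labels) <- sub("^\\[DON'T READ\\]\\s*(.*)$", "\\1 (VOL)", names(labels))
  attr(x, "labels") <- labels
  x
}

# Toy data built with base R only, so the sketch runs without haven.
survey <- data.frame(id = 1:3, q1 = c(1, 3, 8), q2 = c(2, 9, 1))
attr(survey$q1, "labels") <- c("Economic growth" = 1,
                               "[DON'T READ] Both equally" = 3,
                               "[DON'T READ] Don't Know" = 8)
attr(survey$q2, "labels") <- c("Yes" = 1, "No" = 2,
                               "[DON'T READ] Refused" = 9)

# Apply to a subset of columns only ...
cols <- c("q1")
survey[cols] <- lapply(survey[cols], relabel_vol)

# ... or to every column; `id` has no labels and passes through unchanged.
survey[] <- lapply(survey, relabel_vol)

names(attr(survey$q1, "labels"))
names(attr(survey$q2, "labels"))
```

Because the function only rewrites the names of the labels attribute, the underlying values and any other attributes on the column are left alone.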