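My current setup uses R 3.4.2 and tidyverse 1.1.1.
My goal is to transform data in the manner of this answer, but in a scalable way, so that I can easily change the set of variables I apply it to.
To make this concrete, take the following data: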
library(tidyverse)
df = tibble(
  id = seq(1, 8),
  hair.colour = c("red", "blonde", "brown", "black", "red", "blonde", "brown", "black"),
  eye.colour = c("blue", "brown", "blue", "brown", "blue", "brown", "blue", "brown"),
  gender = c("male", "male", "male", "male", "female", "female", "female", "female"))
Code like this works as desired:
df2 = df %>%
  mutate(value = 1,
         hair.colour = paste("hair.colour", hair.colour, sep = ".")) %>%
  spread(hair.colour, value, fill = 0)
Naively trying to wrap it in a loop, e.g.
factors = c("hair.colour", "eye.colour", "gender")
for (factor in factors) {
  df = df %>%
    mutate(value = 1, factor = paste(toString(factor), factor, sep = ".")) %>%
    spread(factor, value, fill = 0)
}
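does not work. I assume there is a clever way to use quo(), !!, etc., but I'm new to R and my searches have not turned up anything I could use.
Does anyone have any suggestions, either within the tidyverse (especially if there is a way to reuse the same code as in the second block) or outside of it?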
Answer 0 (score: 0)
You could do this:
factors = c("hair.colour", "eye.colour", "gender")
for (factor in factors) {
  df = df %>%
    mutate(value = 1, x = paste(factor, .[[factor]], sep = ".")) %>%
    select_(paste0("-", factor)) %>%
    spread(x, "value", fill = 0)
}
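The dot . is shorthand for the left-hand side when using the pipe, so typing .[[factor]] is the same as writing df[[factor]]. That lets me paste your factor string together with the values of the corresponding column.
select_ is a variant of select that uses standard evaluation (basically, you feed it strings); dplyr and tidyr functions usually have one. More: ?select_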
Result:
# # A tibble: 8 x 9
# id hair.colour.black hair.colour.blonde hair.colour.brown hair.colour.red eye.colour.blue eye.colour.brown gender.female gender.male
# * <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 0 0 0 1 1 0 0 1
# 2 2 0 1 0 0 0 1 0 1
# 3 3 0 0 1 0 1 0 0 1
# 4 4 1 0 0 0 0 1 0 1
# 5 5 0 0 0 1 1 0 1 0
# 6 6 0 1 0 0 0 1 1 0
# 7 7 0 0 1 0 1 0 1 0
# 8 8 1 0 0 0 0 1 1 0
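Since the question asks about quo() and !!, here is a minimal sketch of the same loop written with tidy evaluation instead of .[[factor]] and select_. It assumes a dplyr/tidyr version that supports unquoting with !! and := (dplyr 0.7 or later), which may be newer than what is installed. The names f, col and res are only illustrative.

```r
library(dplyr)
library(tidyr)
library(rlang)

df <- tibble(
  id = 1:8,
  hair.colour = c("red", "blonde", "brown", "black", "red", "blonde", "brown", "black"),
  eye.colour = c("blue", "brown", "blue", "brown", "blue", "brown", "blue", "brown"),
  gender = c("male", "male", "male", "male", "female", "female", "female", "female")
)

factors <- c("hair.colour", "eye.colour", "gender")

res <- df
for (f in factors) {
  col <- sym(f)  # turn the column name (a string) into a symbol
  res <- res %>%
    # overwrite the column in place with "<name>.<value>" labels
    mutate(value = 1, !!col := paste(f, !!col, sep = ".")) %>%
    # spread the labels into indicator columns, filling gaps with 0
    spread(!!col, value, fill = 0)
}
```

Because the column is overwritten and then consumed by spread, no separate select step is needed, and the loop body stays close to the single-variable code from the question.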
Answer 1 (score: 0)
As @aosmith pointed out, select_ is deprecated, and you may want a more flexible solution. You could try
df %>%
  # make data long
  gather(key = key, value = value, -id) %>%
  # unite columns
  unite(col = new_key, key, value, sep = ".") %>%
  # add column with 1 for indication when back to wide
  mutate(new_value = 1,
         # this is only needed if you want to keep the order of the variables:
         new_key = factor(new_key, levels = unique(new_key))) %>%
  # transform back to wide, fill NAs with 0
  spread(key = new_key, value = new_value, fill = 0)
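For readers on a newer tidyr, the same long-then-wide idea can be sketched with pivot_longer() and pivot_wider(), which superseded gather() and spread(). This assumes tidyr 1.0.0 or later and a dplyr recent enough to provide all_of(), so it will not run on the versions mentioned in the question. The names factors and wide are only illustrative.

```r
library(dplyr)
library(tidyr)

df <- tibble(
  id = 1:8,
  hair.colour = c("red", "blonde", "brown", "black", "red", "blonde", "brown", "black"),
  eye.colour = c("blue", "brown", "blue", "brown", "blue", "brown", "blue", "brown"),
  gender = c("male", "male", "male", "male", "female", "female", "female", "female")
)

# the set of variables to expand; change this vector to change the output
factors <- c("hair.colour", "eye.colour", "gender")

wide <- df %>%
  # make data long, but only for the chosen columns
  pivot_longer(all_of(factors), names_to = "key", values_to = "value") %>%
  # build "<name>.<value>" labels
  unite("new_key", key, value, sep = ".") %>%
  # indicator that becomes the cell value in the wide table
  mutate(new_value = 1) %>%
  # back to wide, filling missing combinations with 0
  pivot_wider(names_from = new_key, values_from = new_value,
              values_fill = list(new_value = 0))
```

Selecting columns through the factors vector rather than -id means the set of variables can be changed in one place. By default pivot_wider() orders the new columns by first appearance, so the column order can differ from the spread() result.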