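I have a data frame with one row per person. The columns are an outcome variable followed by a set of potential predictors of that outcome. As a first step in my data analysis, I want to use ggplot to visualize each predictor and its association with the outcome. I want histograms for the continuous variables and bar charts for the categorical ones.
My attempt is: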
numeric <- c(0,1.1,2.4,3.1,4.0,5.9,4.2,3.3,2.2,1.1)
categorical <- as.factor(c("yes","no","no","yes","yes","no","no","yes","no","no"))
outcome <- as.factor(c("alive","dead","alive","dead","alive","dead","alive","dead","alive","dead"))
df <- data.frame(num = numeric, cat = categorical, outcome = outcome)
predictors <- c("num", "cat")
predictors %>%
walk(print(ggplot(df, aes(x=., fill=outcome)) +
{ifelse(class(.) == "factor", geom_bar(position="fill"), geom_histogram(position="fill", bins=10))}))
But I get this error:
Error in rep(no, length.out = length(ans)): attempt to replicate an object of type 'environment'
Traceback:
1. predictors %>% walk(print(ggplot(df, aes(x = ., fill = outcome)) +
. {
. ifelse(class(.) == "factor", geom_bar(position = "fill"),
. geom_histogram(position = "fill", bins = 10))
. }))
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. walk(., print(ggplot(df, aes(x = ., fill = outcome)) + {
. ifelse(class(.) == "factor", geom_bar(position = "fill"),
. geom_histogram(position = "fill", bins = 10))
. }))
10. map(.x, .f, ...)
11. as_mapper(.f, ...)
12. print(ggplot(df, aes(x = ., fill = outcome)) + {
. ifelse(class(.) == "factor", geom_bar(position = "fill"),
. geom_histogram(position = "fill", bins = 10))
. })
13. ifelse(class(.) == "factor", geom_bar(position = "fill"), geom_histogram(position = "fill",
. bins = 10)) # at line 9 of file <text>
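My actual dataset has more than 20 predictors, so I would like a clean way to generate 20+ ggplots, ideally staying in a pipeline like this one so that I can add further steps once the plotting works.
Answer 0 (score: 2)
Here is one way to pass the predictors to map and create a list of plots based on the class of each column.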
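library(tidyverse)
library(rlang)
# Build one plot per predictor, choosing the geom from the column's type
p1 <- map(predictors, function(p) {
  if (is.factor(df[[p]])) {
    # categorical predictor: bars showing outcome proportions
    ggplot(df, aes(x = !!sym(p), fill = outcome)) + geom_bar(position = "fill")
  } else {
    # continuous predictor: histogram showing outcome proportions per bin
    ggplot(df, aes(x = !!sym(p), fill = outcome)) +
      geom_histogram(position = "fill", bins = 10)
  }
})
p1[[1]]  # plot for num
p1[[2]]  # plot for cat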
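
Since the question asks to stay in a pipeline, here is a minimal sketch of the same idea written that way. The error in the question most likely comes from ifelse(), which is meant for vectors and tries to rep() the ggplot layer object; a plain if/else avoids that. The helper name plot_predictor is hypothetical (not from any package), and .data[[p]] is used here in place of !!sym(p) as an alternative way to refer to a column by its name.

```r
library(tidyverse)

# Toy data mirroring the example in the question
df <- data.frame(
  num = c(0, 1.1, 2.4, 3.1, 4.0, 5.9, 4.2, 3.3, 2.2, 1.1),
  cat = factor(c("yes", "no", "no", "yes", "yes", "no", "no", "yes", "no", "no")),
  outcome = factor(rep(c("alive", "dead"), 5))
)
predictors <- c("num", "cat")

# plot_predictor is a hypothetical helper name used only for this sketch
plot_predictor <- function(data, p) {
  # Plain if/else returns a whole layer object; ifelse() works on vectors
  layer <- if (is.factor(data[[p]])) {
    geom_bar(position = "fill")
  } else {
    geom_histogram(position = "fill", bins = 10)
  }
  # .data[[p]] looks the column up by its name given as a string
  ggplot(data, aes(x = .data[[p]], fill = outcome)) + layer + labs(x = p)
}

# Name the list by predictor so later pipeline steps can refer to each plot
plots <- predictors %>%
  set_names() %>%
  map(~ plot_predictor(df, .x))

# Print every plot for its side effect
walk(plots, print)
```

Keeping the plots in a named list means further steps, for example saving each one with iwalk() and ggsave(), can be appended to the pipeline afterwards.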