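I am trying to use the tidyverse for some exploratory data analysis. I have a large, complex dataset, but the important parts boil down to something like this: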
library(tidyverse)

my_df <- data.frame(Expt = rep(c("Expt1", "Expt2", "Expt3", "Expt4"), each = 96),
                    ExpType = rep(c("A", "B"), each = 192),
                    Treatment = c(rep("T1", 192), rep("T2", 144), rep("T1", 48)),
                    Subject = c(rep(c("S01", "S02", "S03", "S04", "S05", "S06", "S07", "S08"), 24), rep("S01", 96), rep("S06", 96)),
                    xvar = as.factor(rep(rep(c(10, 5, 2.5, 1.25, 0.6, 0.3, 0.16, 0.08, 0.04, 0.02, 0, "NA"), each = 8), 4)),
                    yvar = runif(384))
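(Expt is a unique but uninformative identifier for each experiment. Each Expt always has exactly one ExpType, but it may contain one or more levels of Treatment and Subject.)
I want to group the data by ExpType, Treatment, Subject, and Expt and then make plots. That means I'm producing a lot of plots, and my life would be much easier if each one had an informative title.
I can group the data and make all the plots like this: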
my_df2 <- my_df %>%
group_by(ExpType, Treatment, Expt) %>%
nest() %>%
mutate(plots1 = map(
.x = data,
~ggplot(data = .x, aes(x=as.factor(xvar), y = yvar)) + #
theme_classic() + theme(legend.key.width = unit(2, "lines"), legend.justification = c(1, 1), legend.position = c(1, 1)) +
geom_smooth(method = "loess", se = FALSE, aes(group=Subject, color=Subject, linetype = Subject))+
geom_point(aes(fill = Subject), size = 2.5)
))
walk(.x = my_df2$plots1, ~print(.x))
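What I can't figure out is how to add a title to each plot that tells me what it is. I tried building a unique identifier that contains all the relevant information: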
my_df3 <- my_df %>%
mutate(FullID = paste0(my_df$ExpType, "_", my_df$Treatment, "_", my_df$Expt)) %>%
group_by(ExpType, Treatment, Expt) %>%
nest() %>%
arrange(ExpType, Treatment)
I can then get the FullID values back out:
# Either of these will successfully extract a list of FullIDs
map(my_df3$data, "FullID")
my_df3$data %>%
map("FullID")
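What I can't figure out is how to reach one level down into the nesting inside the map(~ggplot call so that FullID is used as the plot title, for example: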
my_df3 <- my_df3 %>%
mutate(plots2 = map2(
.x = data,
.y = map_chr(data$FullID),
~ggplot(.x, aes(x=xvar, y = yvar)) + #
theme_classic() + theme(legend.key.width = unit(2, "lines"), legend.justification = c(1, 1), legend.position = c(1, 1)) +
geom_smooth(method = "loess", se = FALSE, aes(group=Subject, color=Subject, linetype = Subject))+
geom_point(aes(fill=Subject, shape = Subject), size = 2.5) +
labs(title = unique(.y))
))
I know there must be a way to do this and I'm just not understanding the syntax. Any suggestions?
Answer 0 (score: 1)
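FullID can also be created with unite (note that .$ is not needed inside dplyr functions). After the nest/arrange step, the OP's code calls map2 with one of its inputs given as map_chr(data$FullID). For map to work it needs a function (.f) to apply, and none is supplied there. Also, because the information we want comes from one of the columns inside the list column 'data', we don't need map2 at all: a single map is enough, and the column value can then be extracted inside labs: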
my_df2 <- my_df %>%
unite(FullID, ExpType, Treatment, Expt, sep="_", remove = FALSE) %>%
group_by(ExpType, Treatment, Expt) %>%
nest %>%
arrange(ExpType, Treatment) %>%
mutate(plots = map(data, ~
ggplot(.x, aes(x=xvar, y = yvar)) +
theme_classic() +
theme(legend.key.width = unit(2, "lines"),
legend.justification = c(1, 1), legend.position = c(1, 1)) +
geom_smooth(method = "loess", se = FALSE,
aes(group=Subject, color=Subject, linetype = Subject))+
geom_point(aes(fill=Subject, shape = Subject), size = 2.5) +
labs(title = first(.x$FullID))))
Checking the first plot:
my_df2$plots[[1]]
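
For comparison, map2 can also be made to work if the second input is a plain character vector with one title per row of the nested tibble. The following is only a minimal sketch under stated assumptions: demo_df is a small hypothetical dataset rather than the OP's data, the title column is an illustrative name, and the title is built from the grouping columns that remain beside the data column after nest().

```r
library(tidyverse)

# Small hypothetical dataset (not the OP's): two experiments, two subjects each
set.seed(1)
demo_df <- tibble(
  Expt      = rep(c("Expt1", "Expt2"), each = 20),
  ExpType   = rep(c("A", "B"), each = 20),
  Treatment = rep(c("T1", "T2"), each = 20),
  Subject   = rep(rep(c("S01", "S02"), each = 10), 2),
  xvar      = rep(1:10, 4),
  yvar      = runif(40)
)

demo_plots <- demo_df %>%
  group_by(ExpType, Treatment, Expt) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    # One title per row, pasted together from the grouping columns
    title = paste(ExpType, Treatment, Expt, sep = "_"),
    # .x is one nested data frame, .y is the matching title string
    plots = map2(data, title, ~
      ggplot(.x, aes(x = xvar, y = yvar, color = Subject)) +
        geom_point() +
        theme_classic() +
        labs(title = .y))
  )

# Pull the stored title back out of each plot object to check it
plot_titles <- map_chr(demo_plots$plots, ~ .x$labels$title)
```

Comparing plot_titles with demo_plots$title should confirm that each plot carries the title of its own group. Whether this or the single map with first(.x$FullID) is preferable is mostly a matter of taste; the map2 form avoids storing a repeated FullID column inside every nested data frame.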