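I am new to the purrr package and am trying to loop over each group in a data frame in order to:
1. cap the values of a variable (Sepal.Length) at a value xmax (e.g. 5), where xmax is derived from a quantile of that group's data
2. set the x-axis labels to, e.g., 0, 1, 2, 3, 4, >=5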
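To make the two steps concrete, here is a minimal base R sketch for a single group (setosa). It is only an illustration of the goal, not the purrr solution; the rounding expression is a stand-in for plyr::round_any(), and the names x and x_capped are made up for this example.

```r
# Base R sketch of the two steps for one group (setosa).
binwidth <- 1
x <- iris$Sepal.Length[iris$Species == "setosa"]

# Round the 97.5% quantile to the nearest multiple of binwidth.
# (Stand-in for plyr::round_any(); unname() drops the "97.5%" label.)
xmax <- unname(round(quantile(x, 0.975) / binwidth) * binwidth)

# Step 1: cap every value at xmax.
x_capped <- pmin(x, xmax)

# Step 2: breaks run from 0 to xmax; the last label becomes ">=xmax".
xbreaks <- seq(0, xmax, binwidth)
xlabels <- c(seq(0, xmax - binwidth, binwidth), paste0(">=", xmax))
```

The breaks and labels have the same length, so they can be passed together to scale_x_continuous(breaks = xbreaks, labels = xlabels).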
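I have a working approach, but I cannot get the version below to work (note that it has been edited following @Jimbou's comment). The columns xmax, xbreaks and xlabels are created, but Sepal.Length ends up as a new column, whereas I want to update data$Sepal.Length.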
binwidth <- 1
graphs <- as_tibble(iris) %>%
  nest(-Species) %>%
  mutate(xmax = map(data, ~ plyr::round_any(quantile(.$Sepal.Length, 0.975), binwidth)),
         xbreaks = map(xmax, ~ seq(0, ., binwidth)),
         xlabels = map(xmax, ~ c(seq(0, (. - binwidth), binwidth), paste0(">=", .))),
         Sepal.Length = map2(data, xmax, ~ ifelse(.x$Sepal.Length >= .y, .y, .x$Sepal.Length)),
         # this creates a new column, want it instead to update column in data
         # a work-around would be to create a dataframe from the new column
         # but I would like to work out how to update columns ...
         graphs = map2(data, Species, ~ ggplot(., aes(Sepal.Length))) +
           geom_histogram() +
           scales_x_continuous(breaks=xbreaks, labels = xlabels) +
           ggtitle(.y)
  )
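Thanks for your help.
Answer 0 (score: 0)
This approach works, but it does not answer the OP's question of how to update a column inside a nested data frame.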
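library(tidyverse)  # as_tibble, nest, mutate, map2, ggplot2

binwidth <- 1
graphs <- as_tibble(iris) %>%
  nest(-Species) %>%
  mutate(graphs = map2(
    data,
    Species,
    function(.x, .y) {
      # Cap point: 97.5% quantile rounded to the nearest multiple of binwidth
      xmax <- plyr::round_any(quantile(.x$Sepal.Length, 0.975), binwidth)
      xbreaks <- seq(0, xmax, binwidth)
      xlabels <- c(seq(0, (xmax - binwidth), binwidth), paste0(">=", xmax))
      # This changes only the function's local copy of the group's data
      .x$Sepal.Length <- ifelse(.x$Sepal.Length >= xmax, xmax, .x$Sepal.Length)
      # Species is a factor in iris, so convert it before using it as a title
      ggplot(.x, aes(Sepal.Length)) +
        geom_histogram(binwidth = binwidth) +
        scale_x_continuous(breaks = xbreaks, labels = xlabels) +
        ggtitle(as.character(.y))
    }
  )
  )
# Print each plot; invisible() suppresses the list that lapply() returns
invisible(lapply(graphs$graphs, print))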
Thanks to @Jimbou for the tip about using invisible().
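On the original question of updating a column inside the nested data frames: one possible sketch is to overwrite the data list-column itself, using an inner mutate() on each nested tibble. This is an assumption about what the OP wants rather than a tested answer from the thread. The name nested is made up for the example, pmin() replaces the ifelse() call, and the rounding expression stands in for plyr::round_any() so that the block needs only dplyr, tidyr and purrr.

```r
library(dplyr)
library(tidyr)
library(purrr)

binwidth <- 1

nested <- as_tibble(iris) %>%
  group_by(Species) %>%
  nest() %>%       # one row per Species, remaining columns in `data`
  ungroup() %>%
  mutate(
    # One number per group: 97.5% quantile rounded to a multiple of binwidth
    # (stand-in for plyr::round_any()).
    xmax = map_dbl(data, ~ round(quantile(.x$Sepal.Length, 0.975) / binwidth) * binwidth),
    # Assigning back to `data` replaces the list-column; the inner mutate()
    # overwrites Sepal.Length within each nested tibble.
    data = map2(data, xmax, ~ mutate(.x, Sepal.Length = pmin(Sepal.Length, .y)))
  )
```

With this, no top-level Sepal.Length column is created; the capped values live in nested$data[[i]]$Sepal.Length, and xbreaks, xlabels and the plots can then be built from data and xmax as before.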