Using purrr on a nested data frame to plot data

Time: 2018-07-17 03:19:13

Tags: r ggplot2 purrr

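I am new to the purrr package and am trying to loop over each group in a data frame to:

  1. Cap the values of a variable (Sepal.Length) at a value xmax (e.g. 5), where xmax is based on a quantile of that group's data

  2. Set the x-axis labels to, for example, 0, 1, 2, 3, 4, >=5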
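To make the two steps concrete, here is a minimal base-R sketch for a single group. The vector `x` and the fixed `xmax` of 5 are illustrative assumptions, not values taken from iris; in the real problem `xmax` would come from a quantile of each group.

```r
# Hypothetical values standing in for one group's Sepal.Length
x <- c(4.3, 4.8, 5.0, 5.1, 5.4, 5.8)
binwidth <- 1
xmax <- 5  # fixed here for illustration only

# Step 1: cap every value at xmax
x_capped <- pmin(x, xmax)

# Step 2: breaks at every bin edge; the last label marks the overflow bin
xbreaks <- seq(0, xmax, binwidth)
xlabels <- c(seq(0, xmax - binwidth, binwidth), paste0(">=", xmax))
```

Here `xbreaks` and `xlabels` have the same length, which `scale_x_continuous()` requires when both `breaks` and `labels` are supplied as vectors.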

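I have an approach that works, but I cannot get the following to work (note that this has been edited following @Jimbou's comment). The columns xmax, xbreaks and xlabels are created, but Sepal.Length ends up as a new column, whereas I want to update data$Sepal.Length.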

binwidth <- 1    
graphs <- as_tibble(iris) %>% 
  nest(-Species) %>%
  mutate(xmax = map(data, ~ plyr::round_any(quantile(.$Sepal.Length, 0.975), binwidth)),
         xbreaks =  map(xmax, ~ seq(0, ., binwidth)),
         xlabels =  map(xmax, ~c(seq(0, (. - binwidth), binwidth), paste0(">=", .))),

        Sepal.Length= map2(data, xmax, ~ ifelse(.x$Sepal.Length >= .y, .y, .x$Sepal.Length)),  
        # this creates a new column, want it instead to update column in data
        # a work-around would be to create a dataframe from the new column
        # but I would like to work out how to update columns ... 

        graphs = map2(data, Species, ~ ggplot(., aes(Sepal.Length))) + 
           geom_histogram() + 
           scale_x_continuous(breaks=xbreaks, labels = xlabels) + 
           ggtitle(.y)
  )

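Thanks for your help.

1 answer:

Answer 0: (score: 0)

This approach works, but it does not answer the OP's question about how to update a column inside a nested data frame.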

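library(tidyverse)  # tibble, tidyr, dplyr, purrr, ggplot2

binwidth <- 1
graphs <- as_tibble(iris) %>%
  nest(data = -Species) %>%  # tidyr >= 1.0 syntax; older versions used nest(-Species)
  mutate(graphs = map2(
    data,
    as.character(Species),  # pass the title as plain text rather than a factor
    function(.x, .y) {
      # Upper cap: 97.5th percentile of the group, rounded to the bin width
      xmax <- plyr::round_any(quantile(.x$Sepal.Length, 0.975), binwidth)
      xbreaks <- seq(0, xmax, binwidth)
      # The last label marks the overflow bin, e.g. ">=5"
      xlabels <- c(seq(0, xmax - binwidth, binwidth), paste0(">=", xmax))
      # Cap values at xmax in the function-local copy of the group's data
      .x$Sepal.Length <- ifelse(.x$Sepal.Length >= xmax, xmax, .x$Sepal.Length)
      ggplot(.x, aes(Sepal.Length)) +
        geom_histogram(binwidth = binwidth) +
        scale_x_continuous(breaks = xbreaks, labels = xlabels) +
        ggtitle(.y)
    }
  ))
# Print each plot without echoing the list itself to the console
invisible(lapply(graphs$graphs, print))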

Thanks to @Jimbou for the tip about using invisible().
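For the part of the question left open, updating a column inside each nested data frame, one possible approach is to overwrite the `data` list column itself with `map2()` and an inner `mutate()`. The following is a sketch under assumptions, not a tested drop-in replacement: the helper `cap_column()` and the column name `plot` are hypothetical names introduced here, `nest(data = -Species)` assumes tidyr 1.0 or later, and `round(q / binwidth) * binwidth` is used in place of `plyr::round_any()` with its default rounding function to avoid the extra dependency.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)

binwidth <- 1

# Hypothetical helper: cap Sepal.Length at `cap` within one group's data frame
cap_column <- function(df, cap) {
  mutate(df, Sepal.Length = pmin(Sepal.Length, cap))
}

nested <- as_tibble(iris) %>%
  nest(data = -Species) %>%
  mutate(
    # One cap per group: 97.5th percentile rounded to the bin width
    xmax = map_dbl(data, ~ round(quantile(.x$Sepal.Length, 0.975) / binwidth) * binwidth),
    # Assigning back to `data` replaces the list column instead of adding a new one
    data = map2(data, xmax, cap_column),
    xbreaks = map(xmax, ~ seq(0, .x, binwidth)),
    xlabels = map(xmax, ~ c(seq(0, .x - binwidth, binwidth), paste0(">=", .x))),
    # pmap() walks the four columns in parallel, one row (group) at a time
    plot = pmap(
      list(data, as.character(Species), xbreaks, xlabels),
      function(d, sp, br, lb) {
        ggplot(d, aes(Sepal.Length)) +
          geom_histogram(binwidth = binwidth) +
          scale_x_continuous(breaks = br, labels = lb) +
          ggtitle(sp)
      }
    )
  )
```

The key point is that `mutate(data = map2(data, ...))` reuses the existing column name, so each nested tibble is replaced by its capped version. The plots can then be printed as in the answer, with `invisible(lapply(nested$plot, print))`.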