Joining across nested data frames / nested data frame columns

Date: 2017-07-12 15:44:52

Tags: r tidyverse

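I am extracting information from models for a final plot. The plot I want is the jittered raw data with an overlay of mean +/- standard error and the text groupings. The model output puts the groupings and the estimates in separate data frames inside a list. I use map to extract them, and that works, but I am stuck on the step of joining them together.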

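I want to join two nested list-columns into one table and nest that result as a new column. The best I can do right now is to unnest, join the tables, nest again, and then join the result back to the original nested table.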

library(agricolae)
library(tidyverse)

# ANOVA followed by Tukey HSD
fitHSD2 <- function(d) HSD.test(aov(mpg ~ cyl, data = d), trt = "cyl")

carnestdf <-
    mtcars %>%
    group_by(gear) %>%
    nest() %>%
    mutate(
        mod = map(data, fitHSD2),                     # fit model
        estimates = map(mod, function(df) df$means),  # pull out estimates and StdErr
        # attach rownames as a column so the table survives unnest
        estimates = map(estimates, function(df) rownames_to_column(df, var = "trt")),
        grouping = map(mod, function(df) df$groups),  # pull out groupings
        grouping = map(grouping, function(df) mutate(df,
            trt = as.character(trt),               # convert to character
            trt = gsub("[[:space:]]*$", "", trt),  # remove whitespace at end for join
            M = as.character(M)
        ))
    )

carnestdf

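I can unnest each one and join them, but I can't nest and join them. Actually, I can: the join key just needs to be defined, otherwise the join also tries to match on the nested data frames, and that does not work without the hashing shown below.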

full_join(unnest(carnestdf , estimates), unnest(carnestdf , grouping)) %>%
group_by(gear) %>%
nest(.key = "estgrp") %>%
full_join(carnestdf, ., by = "gear")

I found this: R: Join two tables (tibbles) by *list* columns

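At first it did not seem to work: I got the same error when joining with the hash. It does work, though; .key has to be set in nest so that the new column isn't named "data". I would still rather join without having to unnest... :/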

nestmerge <-
    full_join(unnest(carnestdf , estimates), unnest(carnestdf , grouping)) %>%
    group_by(gear) %>%
    nest(.key = "mergedestgrp") %>%
    mutate_all(funs(hash = map_chr(., digest::digest)))

carnestdf %>%
    mutate_all(funs(hash = map_chr(., digest::digest))) %>%
    full_join(., nestmerge) %>%
    select(-ends_with("hash"))

1 answer:

Answer 0 (score: 0)

The answer, apparently, is map2:

carnestdf <-
    mtcars %>%
    group_by(gear) %>%
    nest() %>%
    mutate(
        mod = map(data, fitHSD2),                     # fit model
        estimates = map(mod, function(df) df$means),  # pull out estimates and StdErr
        # attach rownames as a column so the table survives unnest
        estimates = map(estimates, function(df) rownames_to_column(df, var = "trt")),
        grouping = map(mod, function(df) df$groups),  # pull out groupings
        grouping = map(grouping, function(df) mutate(df,
            trt = as.character(trt),               # convert to character
            trt = gsub("[[:space:]]*$", "", trt),  # remove whitespace at end for join
            M = as.character(M)
        )),
        # join the two list-columns pairwise, one pair of tables per row
        estgrp = map2(estimates, grouping, ~ full_join(.x, .y, by = "trt"))
    )

carnestdf

This performs a full join of the two tables by "trt" within each row and stores the result in a new list-column.
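To see the map2 step in isolation, here is a minimal sketch that does not need agricolae. The tibble toy and every value in it are made up for illustration; only the column names (gear, trt, M, estimates, grouping, estgrp) mirror the code above, and the real HSD.test output may be shaped differently depending on the agricolae version.

```r
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

# Hypothetical stand-ins for the per-gear "estimates" and "grouping" tables
# (made-up numbers, not real model output)
toy <- tibble(
    gear = c(3, 4),
    estimates = list(
        tibble(trt = c("4", "6"), mpg = c(21.5, 19.8)),
        tibble(trt = c("4", "6"), mpg = c(26.9, 19.8))
    ),
    grouping = list(
        tibble(trt = c("4", "6"), M = c("a", "a")),
        tibble(trt = c("4", "6"), M = c("a", "b"))
    )
)

# map2() walks the two list-columns in parallel, so each row's pair of
# tables is joined on "trt" and stored in the new list-column estgrp
toy <- toy %>%
    mutate(estgrp = map2(estimates, grouping, ~ full_join(.x, .y, by = "trt")))

# Unnest only the joined column when a flat table is needed, e.g. for plotting
flat <- toy %>%
    select(gear, estgrp) %>%
    unnest(estgrp)
```

Because the join happens inside each row, the outer table is never unnested and no join key such as gear is needed; flat is only built at the end, where a regular data frame is convenient.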