purrr: joining tibbles nested in different list columns

Time: 2019-02-18 17:08:50

Tags: r purrr

This is essentially a follow-up question to @keqiang-li's earlier one.

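I have a data frame with a list column (nested data frames) that holds the governing parties and their respective numbers of seats. The data frame is split by country (note that I am using the new dplyr 0.8 functions `group_nest` and `group_split`).

What I essentially want is another list column that holds, for each government, a list with one entry per previous government, where each entry is a data frame showing the overlap in parties and seats.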

library(tidyverse)


df <- tibble::tribble(
  ~period, ~party, ~seats,
  1,    "A",      2,
  1,    "B",      3,
  1,    "C",      3,
  2,    "A",      2,
  2,    "C",      3,
  3,    "C",      4,
  3,    "E",      1,
  3,    "F",      3
)

df <- bind_rows(AA=df, BB=df, .id="country")

df <- df %>% 
  group_by(country, period) %>% 
  group_nest() %>% 
  #mutate(gov=map(data, "party") %>% map(.,list)) %>% 
  mutate(prev.govs=map(data, "party") %>% 
           map(., list) %>%
           accumulate(.,union))

df <- df %>% 
  group_split(country) %>% 
  map(., ~mutate(., prev.govs.df=map_depth(prev.govs, 2, enframe, value="party")))
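
To see in isolation what the `map(., list) %>% accumulate(., union)` step builds, here is a minimal sketch on a hypothetical toy input (`parties`, `wrapped` and `prev_govs` are illustrative names, not part of the original code). It assumes base R's `union()` drops duplicated list elements, which would also explain why rows for a second country with identical governments would not grow the list any further.

```r
library(purrr)

# Hypothetical toy input: party vectors for four consecutive rows.
# The fourth repeats the first to show that duplicates are dropped.
parties <- list(
  c("A", "B", "C"),
  c("A", "C"),
  c("C", "E", "F"),
  c("A", "B", "C")
)

# Wrap each vector in a list so that union() treats a whole government
# as one element instead of merging individual party names.
wrapped <- map(parties, list)

# accumulate() keeps every intermediate result of the running union, so
# element i holds the distinct governments seen up to and including row i.
prev_govs <- accumulate(wrapped, base::union)

lengths(prev_govs)
# 1 2 3 3
```

Because the accumulation runs over the rows in order, it does not restart per country unless it is applied within each country separately.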

`df` is my starting point. My failed attempts are below.

##attempts
df %>% 
  map(., ~mutate(., df.overlap=map_depth(prev.govs.df, 3, ~map2(., data, inner_join))))
#> Error in UseMethod("inner_join"): no applicable method for 'inner_join' applied to an object of class "c('integer', 'numeric')"

df %>% 
  map(., ~mutate(., df.overlap=map_depth(prev.govs.df, 2, ~map2(., data, inner_join))))
#> Error: Mapped vectors must have consistent lengths:
#> * `.x` has length 2
#> * `.y` has length 3

df %>% 
  map(., ~mutate(., df.overlap=map2(data, prev.govs.df, ~map2(.x, .y, ~map2(.x, .y, inner_join)))))
#> Error: Mapped vectors must have consistent lengths:
#> * `.x` has length 3
#> * `.y` has length 2

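More concretely, the solution for country AA in period 3 would be a list of 3 tibbles, each containing the rows of `data` that overlap with the rows of the corresponding tibble in `prev.govs.df` on the `party` column (the key).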

df[[1]][["prev.govs.df"]][[3]] 
#> [[1]]
#> # A tibble: 3 x 2
#>    name party
#>   <int> <chr>
#> 1     1 A    
#> 2     2 B    
#> 3     3 C    
#> 
#> [[2]]
#> # A tibble: 2 x 2
#>    name party
#>   <int> <chr>
#> 1     1 A    
#> 2     2 C    
#> 
#> [[3]]
#> # A tibble: 3 x 2
#>    name party
#>   <int> <chr>
#> 1     1 C    
#> 2     2 E    
#> 3     3 F
df[[1]][["data"]][[3]]
#> # A tibble: 3 x 2
#>   party seats
#>   <chr> <dbl>
#> 1 C         4
#> 2 E         1
#> 3 F         3

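The answer to the previous question solved the puzzle of how to intersect two lists. Unfortunately, I cannot figure out the next step: how to split the data frame and join the nested tibbles.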

Any help is much appreciated!

2 answers:

Answer 0 (score: 2)

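One cause of the problem is the difference in `length` between the `list` elements. We can `rep`licate one of the `list` elements so that the lengths match, and then do the `inner_join`: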

out <- df %>%
         map(., ~ .x %>% 
             mutate(df.overlap = map2(prev.govs.df, data, ~ 
                map2(rep(list(.y), length(.x)), .x, inner_join))))

-output

out[[1]]
# A tibble: 3 x 6
#  country period data             prev.govs  prev.govs.df df.overlap
#  <chr>    <dbl> <list>           <list>     <list>       <list>    
#1 AA           1 <tibble [3 × 2]> <list [1]> <list [1]>   <list [1]>
#2 AA           2 <tibble [2 × 2]> <list [2]> <list [2]>   <list [2]>
#3 AA           3 <tibble [3 × 2]> <list [3]> <list [3]>   <list [3]>

# overlap column element
out[[1]]$df.overlap[[3]][[1]]
# A tibble: 1 x 3
#  party seats  name
#  <chr> <dbl> <int>
#1 C         4     3


# input dataset elements used for joining
out[[1]]$data[[3]]
# A tibble: 3 x 2
#  party seats
#  <chr> <dbl>
#1 C         4
#2 E         1
#3 F         3

out[[1]]$prev.govs.df[[3]][[1]]
# A tibble: 3 x 2
#   name party
#  <int> <chr>
#1     1 A    
#2     2 B    
#3     3 C    
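
The recycling idea can be tried on a single row outside of `mutate`. The following is a minimal sketch using hypothetical stand-ins (`current`, `previous`, `current_rep`) for the period 3 values of `data` and `prev.govs.df` shown in the question; the explicit `by = "party"` is an addition that only silences the join message.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical stand-ins for one row of the question's data
# (country AA, period 3).
current <- tibble(party = c("C", "E", "F"), seats = c(4, 1, 3))
previous <- list(
  tibble(name = 1:3, party = c("A", "B", "C")),
  tibble(name = 1:2, party = c("A", "C")),
  tibble(name = 1:3, party = c("C", "E", "F"))
)

# A tibble is itself a list of columns, so map2() would walk over the
# columns of `current`. Wrapping it in list() and repeating it yields a
# list with one copy of the whole tibble per previous government.
current_rep <- rep(list(current), length(previous))

# Both inputs now have the same length, so each copy of `current` is
# joined with exactly one previous government.
overlap <- map2(current_rep, previous, inner_join, by = "party")

map_int(overlap, nrow)
# 1 1 3
```

Since `map2()` recycles inputs of length 1, `list(current)` without `rep()` should behave the same way; the explicit `rep()` just makes the intent visible.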

Answer 1 (score: 1)

OP's third attempt was actually close to working. We only need to modify the last `map`, as follows:

library(tidyverse)

output <- df %>%
  map(~mutate(., df.overlap = map2(data, prev.govs.df, ~map(.y, inner_join, .x))))

Output:

[[1]]
# A tibble: 3 x 6
  country period data             prev.govs  prev.govs.df df.overlap
  <chr>    <dbl> <list>           <list>     <list>       <list>    
1 AA           1 <tibble [3 x 2]> <list [1]> <list [1]>   <list [1]>
2 AA           2 <tibble [2 x 2]> <list [2]> <list [2]>   <list [2]>
3 AA           3 <tibble [3 x 2]> <list [3]> <list [3]>   <list [3]>

[[2]]
# A tibble: 3 x 6
  country period data             prev.govs  prev.govs.df df.overlap
  <chr>    <dbl> <list>           <list>     <list>       <list>    
1 BB           1 <tibble [3 x 2]> <list [3]> <list [3]>   <list [3]>
2 BB           2 <tibble [2 x 2]> <list [3]> <list [3]>   <list [3]>
3 BB           3 <tibble [3 x 2]> <list [3]> <list [3]>   <list [3]>

> output[[1]]$df.overlap[[3]]
[[1]]
# A tibble: 1 x 3
   name party seats
  <int> <chr> <dbl>
1     3 C         4

[[2]]
# A tibble: 1 x 3
   name party seats
  <int> <chr> <dbl>
1     2 C         4

[[3]]
# A tibble: 3 x 3
   name party seats
  <int> <chr> <dbl>
1     1 C         4
2     2 E         1
3     3 F         3
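
A likely reason the nested `map2` calls in the question fail is that a tibble handed to `map2` is iterated column by column. The sketch below illustrates this on hypothetical stand-ins (`current`, `previous`) for one row of the data, and shows how `map(.y, inner_join, .x)` avoids the issue: only the list of previous governments is mapped over, and the current tibble is passed along as a fixed extra argument. The `by = "party"` argument is an addition for clarity.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical stand-ins for one row of the question's data
# (country AA, period 3).
current <- tibble(party = c("C", "E", "F"), seats = c(4, 1, 3))
previous <- list(
  tibble(name = 1:3, party = c("A", "B", "C")),
  tibble(name = 1:2, party = c("A", "C")),
  tibble(name = 1:3, party = c("C", "E", "F"))
)

# Mapping over a tibble visits its columns, not the tibble as a whole.
length(current)          # number of columns, not rows
map_chr(current, class)  # one result per column

# Map over the previous governments only. Extra arguments follow the
# mapped element, so each call is inner_join(previous[[i]], current).
overlap <- map(previous, inner_join, current, by = "party")

names(overlap[[1]])
# "name" "party" "seats"
```

The column order differs from the first answer (`name` comes first) because the previous government is the left-hand table of the join here.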