Pivot a list of data frames and merge them

Time: 2020-01-05 23:33:48

Tags: r list pivot tidyverse

I have a list of 3 data frames that share some rows and columns.

Data

all_data <- list(questionnaireA = structure(list(name = structure(2:1, .Label = c("James", 
          "Shawn"), class = "factor"), banana = c(1, 0), grapes = c(1, 
          1), orange = c("AB", 1)), class = "data.frame", row.names = c(NA, 
          -2L)), questionnaireB = structure(list(name = structure(2:1, .Label = c("Chris", 
          "James"), class = "factor"), orange = c(1, 0), banana = c(1, 
          0)), class = "data.frame", row.names = c(NA, -2L)), questionnaireC = structure(list(
          name = structure(3:1, .Label = c("Donald", "James", "Shawn"
          ), class = "factor"), banana = c(1, 0, 0), raisins = c(1, 
          1, 1), grapes = c(1, 1, 0), cake = c(0, 1, 0)), class = "data.frame", row.names = c(NA, -3L)))
$questionnaireA
   name banana grapes orange
1 Shawn      1      1     AB
2 James      0      1      1

$questionnaireB
   name orange banana
1 James      1      1
2 Chris      0      0

$questionnaireC
    name banana raisins grapes cake
1  Shawn      1       1      1    0
2  James      0       1      1    1
3 Donald      0       1      0    0
library(tidyverse)
map(all_data, ~ .x %>%
    pivot_longer(cols=-name, names_to="fruit"))
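  1. I am not sure how to rename value to the name of each data frame.
  2. I don't know how to join the values, merging on the name-fruit pairs.

Any help would be much appreciated!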

1 Answer:

Answer 0 (score: 2)

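If we follow an approach similar to the OP's attempt, i.e. reshaping each dataset in the list to 'long' format, we can loop over the list with imap, add the name of each list element as a new column (grp), reshape to 'long' format with pivot_longer, optionally create a sequence column by group, and then reshape to 'wide' format with pivot_wider

library(dplyr)
library(tidyr)
library(purrr)
imap_dfr(all_data, ~ .x %>%
        mutate(grp = .y) %>%
        pivot_longer(cols = -c(name, grp), names_to = "fruit", values_to = "Value")) %>%
   #group_by(name, grp, fruit) %>%
   #mutate(rn = row_number()) %>%
   pivot_wider(names_from = grp, values_from = Value)

Or do this more efficiently by binding all the datasets into a single data frame with bind_rows, dropping the missing values during pivot_longer with values_drop_na = TRUE, and then applying pivot_wider as in the solution above

bind_rows(all_data, .id = 'grp') %>%
   pivot_longer(cols = c(-name, -grp), names_to = "fruit", values_to = "Value",
        values_drop_na = TRUE) %>%
   # sequence column creation is not really required for the example
   # as there are no duplicates
   #group_by(name, grp, fruit) %>%
   #mutate(rn = row_number()) %>%
   pivot_wider(names_from = grp, values_from = Value)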
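To see what values_drop_na = TRUE changes, here is a minimal sketch on two tiny made-up tibbles (a and b are illustrative stand-ins, not the question's data). Binding them fills the columns that one of them lacks with NA, and the option removes those rows while pivoting.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in data: each tibble lacks one column the other has
a <- tibble(name = "Shawn", banana = 1, grapes = 1)
b <- tibble(name = "Chris", banana = 0, raisins = 1)

# bind_rows() pads the missing columns with NA and records the list name in grp
bound <- bind_rows(list(a = a, b = b), .id = "grp")

# Without the option, the padded NA cells survive as rows in the long data
long_all <- pivot_longer(bound, cols = -c(name, grp),
                         names_to = "fruit", values_to = "Value")

# With the option, rows whose Value is NA are dropped while pivoting
long_trim <- pivot_longer(bound, cols = -c(name, grp),
                          names_to = "fruit", values_to = "Value",
                          values_drop_na = TRUE)

nrow(long_all)   # 6 rows: 2 people x 3 fruit columns
nrow(long_trim)  # 4 rows: the two padded NA cells are gone
```

Dropping the padded cells early keeps the long data limited to fruits that a questionnaire actually asked about.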

Update

Based on the column types in the new data, which has mixed types, if we need to keep values such as "AB" as they are, the columns need to be converted to character

imap_dfr(all_data, ~ .x %>%
        mutate_at(-1, as.character) %>%
        mutate(grp = .y) %>%
        pivot_longer(cols = -c(name, grp), names_to = "fruit", values_to = "Value")) %>%
   pivot_wider(names_from = grp, values_from = Value)

Or an efficient approach similar to the earlier bind_rows one (although it cannot be applied here because the column types differ)
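One possible workaround, which is not part of the original answer, is to convert the non-name columns to character before binding, so that the column types agree. The sketch below assumes dplyr 1.0 or later (for across) and uses a small made-up list, demo_data, that mimics the mixed-type orange column.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical stand-in for all_data: orange is character in A, numeric in B
demo_data <- list(
  questionnaireA = data.frame(name = c("Shawn", "James"), banana = c(1, 0),
                              orange = c("AB", "1"), stringsAsFactors = FALSE),
  questionnaireB = data.frame(name = c("James", "Chris"), orange = c(1, 0),
                              banana = c(1, 0), stringsAsFactors = FALSE)
)

result <- demo_data %>%
  # make every column except name character so the frames can be stacked
  map(~ mutate(.x, across(-name, as.character))) %>%
  # stack the frames, keeping the list names in a grp column
  bind_rows(.id = "grp") %>%
  pivot_longer(cols = -c(name, grp), names_to = "fruit", values_to = "Value",
               values_drop_na = TRUE) %>%
  pivot_wider(names_from = grp, values_from = Value)

result
```

The cost of this route is that every value becomes a string, so numeric answers would have to be converted back afterwards if arithmetic is needed.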