I have a list of 3 data frames that share some rows and columns.
Data
all_data <- list(questionnaireA = structure(list(name = structure(2:1, .Label = c("James",
"Shawn"), class = "factor"), banana = c(1, 0), grapes = c(1,
1), orange = c("AB", 1)), class = "data.frame", row.names = c(NA,
-2L)), questionnaireB = structure(list(name = structure(2:1, .Label = c("Chris",
"James"), class = "factor"), orange = c(1, 0), banana = c(1,
0)), class = "data.frame", row.names = c(NA, -2L)), questionnaireC = structure(list(
name = structure(3:1, .Label = c("Donald", "James", "Shawn"
), class = "factor"), banana = c(1, 0, 0), raisins = c(1,
1, 1), grapes = c(1, 1, 0), cake = c(0, 1, 0)), class = "data.frame", row.names = c(NA, -3L)))
$questionnaireA
   name banana grapes orange
1 Shawn      1      1     AB
2 James      0      1      1

$questionnaireB
   name orange banana
1 James      1      1
2 Chris      0      0

$questionnaireC
    name banana raisins grapes cake
1  Shawn      1       1      1    0
2  James      0       1      1    1
3 Donald      0       1      0    0
library(tidyverse)
map(all_data, ~ .x %>%
      pivot_longer(cols = -name, names_to = "fruit"))
Any help would be greatly appreciated!
Answer 0 (score: 2)
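One option, along the same lines as the OP's attempt of reshaping each dataset in the list to 'long' format, is to loop over the list with imap, add the list element name as a new column, reshape to 'long' format with pivot_longer, optionally create a sequence column by group, and then reshape back to 'wide' format with pivot_wider:
library(dplyr)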
library(tidyr)
library(purrr)
imap_dfr(all_data, ~
  .x %>%
    mutate(grp = .y) %>%
    pivot_longer(cols = -c(name, grp),
                 names_to = "fruit", values_to = "Value")) %>%
  #group_by(name, grp, fruit) %>%
  #mutate(rn = row_number()) %>%
  pivot_wider(names_from = grp, values_from = Value)
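The commented-out group_by/mutate lines build a sequence column, which only matters when a name/grp/fruit combination occurs more than once. A minimal sketch of that case follows, assuming made-up data (the `long_dup` table is hypothetical and not part of the question). Without `rn`, pivot_wider would be expected to warn and return list-columns for the duplicated cells.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data: James has two "banana" entries in group A
long_dup <- tibble(
  name  = c("James", "James", "Shawn"),
  grp   = c("A", "A", "A"),
  fruit = c("banana", "banana", "banana"),
  Value = c(1, 0, 1)
)

wide <- long_dup %>%
  group_by(name, grp, fruit) %>%
  mutate(rn = row_number()) %>%  # numbers repeats within each combination
  ungroup() %>%
  # rn becomes part of the row identifier, so each cell holds one value
  pivot_wider(names_from = grp, values_from = Value)
```

With the sequence column in place, each duplicate ends up on its own row and column `A` stays an ordinary numeric column.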
Or do this more efficiently by binding all the datasets into a single data frame with bind_rows, dropping the missing values inside pivot_longer with values_drop_na = TRUE, and then widening as in the solution above:
bind_rows(all_data, .id = 'grp') %>%
  pivot_longer(cols = c(-name, -grp), names_to = "fruit",
               values_to = "Value", values_drop_na = TRUE) %>%
  # sequence column creation is not really required for the example
  # as there are no duplicates
  #group_by(name, grp, fruit) %>%
  #mutate(rn = row_number()) %>%
  pivot_wider(names_from = grp, values_from = Value)
The new data has columns of mixed types, so if we need to preserve values such as "AB", the value columns need to be converted to character class first:
imap_dfr(all_data, ~
  .x %>%
    mutate_at(-1, as.character) %>%  # every column except the first (name)
    mutate(grp = .y) %>%
    pivot_longer(cols = -c(name, grp), names_to = "fruit",
                 values_to = "Value")) %>%
  pivot_wider(names_from = grp, values_from = Value)
An efficient approach similar to the earlier one with bind_rows would also be an option, but it cannot be done here directly because the column types differ.
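One possible way around the type mismatch, which is not part of the original answer, is to convert every column to character before binding, so that bind_rows sees compatible types. The sketch below assumes dplyr 1.0.0 or later for `across()`, and rebuilds `all_data` with plain `data.frame()` calls for brevity, so `name` may be character rather than factor.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Compact re-creation of the question's data (assumption: same values)
all_data <- list(
  questionnaireA = data.frame(name = c("Shawn", "James"), banana = c(1, 0),
                              grapes = c(1, 1), orange = c("AB", "1")),
  questionnaireB = data.frame(name = c("James", "Chris"), orange = c(1, 0),
                              banana = c(1, 0)),
  questionnaireC = data.frame(name = c("Shawn", "James", "Donald"),
                              banana = c(1, 0, 0), raisins = c(1, 1, 1),
                              grapes = c(1, 1, 0), cake = c(0, 1, 0))
)

res <- all_data %>%
  # make every column character so the data frames can be stacked
  map(~ mutate(.x, across(everything(), as.character))) %>%
  bind_rows(.id = "grp") %>%
  pivot_longer(cols = -c(name, grp), names_to = "fruit",
               values_to = "Value", values_drop_na = TRUE) %>%
  pivot_wider(names_from = grp, values_from = Value)
```

The result has one row per name/fruit pair and one character column per questionnaire, with "AB" preserved. Numeric values would need converting back afterwards if arithmetic is required.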