List of data frames containing columns with the same name

Time: 2016-06-30 17:37:56

Tags: r dataframe

After converting some JSON data, I have a list of data frames. Some of the data frames contain columns with the same name:

str(json)
List of 2
$ :'data.frame':    1 obs. of  2 variables:
..$ a                    :Factor
.. ..- attr(*, "names")= chr "a"
..$ b                    :Factor
.. ..- attr(*, "names")= chr "b"
$ :'data.frame':    1 obs. of  3 variables:
..$ a                    :Factor
.. ..- attr(*, "names")= chr "a"
..$ b                    :Factor
.. ..- attr(*, "names")= chr "b"
..$ b
.. ..- attr(*, "names")= chr "b"

whose values are:

json[[1]]                    json[[2]]
a    b                       a     b     b
car  boat                    bus   plane  train

I tried to convert the whole list into a single data frame using:

 library(plyr)  # rbind.fill() comes from the plyr package
 data <- rbind.fill(json)

However, only the first of the columns that share a name is taken into account:

data
     a    b
1   car  boat
2   bus plane

I would like to obtain a data frame like this:

data
    a    b
1  car  boat
2  bus  plane,train 

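How can I merge the columns like this?

I cannot reproduce a minimal example in practice, because R does not let me create two columns with the same name (as shown in Shorpy's answer), and my actual list contains hundreds of columns. However, I think the dput output could be reduced to something like this: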

dput(json)
list(structure(list(`a` = structure(1L, .Names = "a", .Label = "car", class = "factor"), 
    `b` = structure(1L, .Names = "b", .Label = "boat", class = "factor")), .Names = c("a","b"), row.names = c(NA, -1L), class = "data.frame"),
 structure(list(`a` = structure(1L, .Names = "a", .Label = "bus", class = "factor"),
    `b` = structure(1L, .Names = "b", .Label = "plane", class = "factor"),
    `b` = structure(1L, .Names = "b", .Label = "train", class = "factor")), .Names = c("a","b","b"), row.names = c(NA, -1L), class = "data.frame"))

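I tried to tell the columns apart by changing their names, as described in another question: change column names with same name in dataframe in R. However, several names may be repeated within the same data frame, and the repeated names differ from one data frame to another, which makes it more complicated.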
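
For the renaming approach, base R's make.unique() appends a numeric suffix to every repeated name, whichever names happen to repeat in a given data frame. The following is only a minimal sketch: the list `json_demo` is hypothetical stand-in data, built with check.names = FALSE so that data.frame() keeps the duplicated name. It disambiguates the columns but does not merge them.

```r
# Hypothetical stand-in for the real list; check.names = FALSE keeps the
# duplicated column name "b" instead of silently renaming it.
json_demo <- list(
  data.frame(a = "car", b = "boat", stringsAsFactors = FALSE),
  data.frame(a = "bus", b = "plane", b = "train",
             check.names = FALSE, stringsAsFactors = FALSE)
)

# make.unique() leaves the first occurrence alone and suffixes the repeats.
renamed <- lapply(json_demo, function(d) setNames(d, make.unique(names(d))))

names(renamed[[1]])
names(renamed[[2]])
```

Because the suffix depends only on the names within each data frame, the same call works when different data frames repeat different names.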

1 Answer:

Answer 0: (score: 2)

I would try something like this:

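library(purrr)
library(dplyr)

# Build the example list. tibble() replaces the deprecated data_frame();
# .name_repair = "minimal" is needed to allow the duplicated name b.
l <- list()
l[[1]] <- tibble(a = "car", b = "boat")
l[[2]] <- tibble(a = "car", b = "plane", b = "train", .name_repair = "minimal")

# Paste every column named "b" into a single comma-separated column.
recode <- function(df){
  copies <- as.list(df)[names(df) == "b"]
  tibble(a = df$a,
         b = reduce(copies, function(x, y) paste(x, y, sep = ",")))
}

# Apply recode() to each data frame and row-bind the results.
map_df(l, recode)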
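
The recode() function handles only a column named b. Since the question mentions hundreds of columns with varying duplicates, a more general version might group the columns by name and paste each group together. This is a sketch in base R, not a tested solution for the real data: `collapse_dups` and `json_demo` are hypothetical names, and the final rbind() assumes every data frame ends up with the same set of columns.

```r
# Hypothetical stand-in for the real list of data frames.
json_demo <- list(
  data.frame(a = "car", b = "boat", stringsAsFactors = FALSE),
  data.frame(a = "bus", b = "plane", b = "train",
             check.names = FALSE, stringsAsFactors = FALSE)
)

# Collapse all columns that share a name into one comma-separated column.
collapse_dups <- function(df) {
  cols <- lapply(as.list(df), as.character)      # factors become character
  # Group the columns by name, keeping the original column order.
  groups <- split(cols, factor(names(cols), levels = unique(names(cols))))
  # Paste the columns of each group element-wise, separated by commas.
  merged <- lapply(groups, function(g) do.call(paste, c(unname(g), sep = ",")))
  data.frame(merged, check.names = FALSE, stringsAsFactors = FALSE)
}

# Works here because both results have exactly the columns a and b.
result <- do.call(rbind, lapply(json_demo, collapse_dups))
print(result)
```

If the data frames have different sets of columns after collapsing, rbind() will fail, and a filling row-bind such as the rbind.fill() call used in the question would be needed in its place.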