How to add the values of same-named columns across four data frames that have different numbers of columns

Time: 2018-08-16 21:40:59

Tags: r dataframe

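I need to add up the values of same-named columns across four different data frames in R. The problem is that the four data frames have different numbers of columns, and only one of them contains all the columns. The other data frames have a subset of the first data frame's column names. The number of rows is the same in all four data frames.

A minimal reproducible example:

Say there are four data frames with the following structure: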

df1 <- setNames(data.frame(matrix(ncol = 10, nrow = 900)), c("Red", "Blue", "Yellow", "Green", "Orange", "Pink", "Brown", "Black", "Grey", "Purple"))
df2 <- setNames(data.frame(matrix(ncol = 9, nrow = 900)), c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Grey", "Purple"))
df3 <- setNames(data.frame(matrix(ncol = 8, nrow = 900)), c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Purple"))
df4 <- setNames(data.frame(matrix(ncol = 6, nrow = 900)), c("Red", "Yellow", "Green", "Orange", "Brown", "Purple"))

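Assume every one of these columns in the four data frames holds integer values in all 900 rows. How can I return a data frame that is essentially the sum of the values of the same-named columns across the four data frames? In other words, df.sum[1:10] <- df1[1:10] + df2[1:9] + df3[1:8] + df4[1:6], except that the addition has to match columns by name.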

1 answer:

Answer 0 (score: 1)

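If there are no NA elements, you can make the dimensions the same (adding each missing column filled with 0) and then apply +: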

lst <- mget(paste0("df", 1:4)) # get the datasets in a list
nm1 <- Reduce(union, lapply(lst, names)) # find all the column names
# assign missing columns in each of the dataset with value 0
# get the `+` of all list elements with Reduce
dfout <- Reduce(`+`, lapply(lst, function(x) {
  x[setdiff(nm1, names(x))] <- 0
  x[nm1]
}))
dim(dfout)
#[1] 900  10
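
To make the padding step easier to follow, here is a small sketch of the same technique on three toy data frames. The names a, b, c3, toy, all_names and toy_sum are hypothetical and exist only for this illustration; they are not part of the original answer.

```r
# Hypothetical toy data frames with overlapping but unequal column sets
a  <- data.frame(Red = 1:3, Blue = 4:6, Green = 7:9)
b  <- data.frame(Red = c(10, 20, 30), Green = c(1, 1, 1))
c3 <- data.frame(Blue = c(100, 100, 100))

toy <- list(a, b, c3)
all_names <- Reduce(union, lapply(toy, names))  # "Red" "Blue" "Green"

# Pad each data frame with zero-filled columns for the names it lacks,
# put the columns in one common order, then add element-wise
toy_sum <- Reduce(`+`, lapply(toy, function(x) {
  x[setdiff(all_names, names(x))] <- 0
  x[all_names]
}))

toy_sum$Red    # 11 22 33    (a + b; c3 contributes 0)
toy_sum$Blue   # 104 105 106 (a + c3; b contributes 0)
toy_sum$Green  # 8 9 10      (a + b; c3 contributes 0)
```

Reordering with x[all_names] matters because + on data frames adds by position, not by column name.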

Data

set.seed(24)
df1 <- setNames(data.frame(matrix(rnorm(900 * 10), ncol = 10, nrow = 900)),
  c("Red", "Blue", "Yellow", "Green", "Orange", "Pink",
    "Brown", "Black", "Grey", "Purple"))
df2 <- setNames(data.frame(matrix(rnorm(900 * 9), ncol = 9, nrow = 900)),
  c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown",
    "Black", "Grey", "Purple"))
df3 <- setNames(data.frame(matrix(rnorm(900 * 8), ncol = 8, nrow = 900)),
  c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Purple"))
df4 <- setNames(data.frame(matrix(rnorm(900 * 6), ncol = 6, nrow = 900)),
  c("Red", "Yellow", "Green", "Orange", "Brown", "Purple"))
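
The approach above assumes there are no NA values, because a single NA makes the sum for that cell NA. If NAs can occur, one possible variant is sketched below. It assumes that an NA should count as 0, which may not suit every data set. The function name sum_by_name_na0 and the toy inputs p and q are hypothetical.

```r
# Sketch: sum same-named columns across a list of data frames,
# treating NA as 0 (an assumption, not part of the original answer)
sum_by_name_na0 <- function(dfs) {
  all_names <- Reduce(union, lapply(dfs, names))
  Reduce(`+`, lapply(dfs, function(x) {
    x[setdiff(all_names, names(x))] <- 0  # add missing columns as 0
    x <- x[all_names]                     # common column order
    x[is.na(x)] <- 0                      # replace NA with 0 before adding
    x
  }))
}

# Hypothetical toy inputs
p <- data.frame(Red = c(1, NA), Blue = c(2, 3))
q <- data.frame(Red = c(10, 20))

res <- sum_by_name_na0(list(p, q))
res$Red   # 11 20
res$Blue  # 2 3
```

If an NA should instead stay NA in the result, drop the x[is.na(x)] <- 0 line.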