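I need to add up the values of identically named columns across four different data frames in R. The problem is that the four data frames have different numbers of columns, and only one of them contains all the columns. The others have a subset of the first data frame's column names. All four data frames have the same number of rows.
A minimal reproducible example:
Say there are four data frames with the following structure: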
df1 <- setNames(data.frame(matrix(ncol = 10, nrow = 900)), c("Red", "Blue", "Yellow", "Green", "Orange", "Pink", "Brown", "Black", "Grey", "Purple"))
df2 <- setNames(data.frame(matrix(ncol = 9, nrow = 900)), c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Grey", "Purple"))
df3 <- setNames(data.frame(matrix(ncol = 8, nrow = 900)), c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Purple"))
df4 <- setNames(data.frame(matrix(ncol = 6, nrow = 900)), c("Red", "Yellow", "Green", "Orange", "Brown", "Purple"))
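Assume every one of these columns holds integer values in all 900 rows of each data frame. How can I return a single data frame that is essentially the sum of the values of the identically named columns across the four data frames?
In other words, df.sum[1:10] <- df1[1:10] + df2[1:9] + df3[1:8] + df4[1:6], but with the matching columns identified by name when adding.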
Answer 0 (score: 1)
If there are no NA elements, you can apply + after giving all the data frames the same dimensions:
lst <- mget(paste0("df", 1:4)) # get the datasets in a list
nm1 <- Reduce(union, lapply(lst, names)) # find all the column names
# add the columns missing from each dataset, filled with 0
# then sum all list elements with Reduce and `+`
dfout <- Reduce(`+`, lapply(lst, function(x) {
  x[setdiff(nm1, names(x))] <- 0
  x[nm1]
}))
dim(dfout)
#[1] 900 10
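The padding step is easier to follow on a tiny input. The following is a minimal sketch, assuming two toy data frames (`a`, `b`) and a helper `pad()` that are hypothetical names and not part of the original answer. It fills the missing columns with a `for` loop instead of the vectorised assignment used above, which should give the same result:

```r
# Toy data frames (hypothetical) that share only some columns
a <- data.frame(Red = 1:3, Blue = 4:6, Green = 7:9)
b <- data.frame(Red = c(10L, 20L, 30L), Green = c(1L, 1L, 1L))

toy <- list(a = a, b = b)
all_cols <- Reduce(union, lapply(toy, names))  # "Red" "Blue" "Green"

# Add every absent column as 0, then put columns in one common order
pad <- function(x) {
  for (nm in setdiff(all_cols, names(x))) x[[nm]] <- 0
  x[all_cols]
}

toy_sum <- Reduce(`+`, lapply(toy, pad))
toy_sum
#   Red Blue Green
# 1  11    4     8
# 2  22    5     9
# 3  33    6    10
```

Because `b` has no `Blue` column, it contributes 0 there, so `Blue` in the result equals `a$Blue`.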
The sample data used above:
set.seed(24)
df1 <- setNames(data.frame(matrix(rnorm(900 * 10), ncol = 10, nrow = 900)),
                c("Red", "Blue", "Yellow", "Green", "Orange", "Pink",
                  "Brown", "Black", "Grey", "Purple"))
df2 <- setNames(data.frame(matrix(rnorm(900 * 9), ncol = 9, nrow = 900)),
                c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown",
                  "Black", "Grey", "Purple"))
df3 <- setNames(data.frame(matrix(rnorm(900 * 8), ncol = 8, nrow = 900)),
                c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Purple"))
df4 <- setNames(data.frame(matrix(rnorm(900 * 6), ncol = 6, nrow = 900)),
                c("Red", "Yellow", "Green", "Orange", "Brown", "Purple"))
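The approach above relies on there being no NA elements, because `+` propagates NA. If the data can contain NA, one possible variant is to replace NA with 0 before summing. This is only a sketch: the toy inputs and the helper name `pad_na0()` are hypothetical, and treating NA as 0 is an assumption that may not suit every dataset.

```r
# Toy inputs (hypothetical) with one NA in a shared column
a <- data.frame(Red = c(1, NA, 3), Blue = c(4, 5, 6))
b <- data.frame(Red = c(10, 20, 30))

dfs <- list(a, b)
all_cols <- Reduce(union, lapply(dfs, names))

pad_na0 <- function(x) {
  # add absent columns as 0 and align the column order
  for (nm in setdiff(all_cols, names(x))) x[[nm]] <- 0
  x <- x[all_cols]
  # assumption: a missing value should count as 0 in the sum
  x[is.na(x)] <- 0
  x
}

na_sum <- Reduce(`+`, lapply(dfs, pad_na0))
na_sum
#   Red Blue
# 1  11    4
# 2  20    5
# 3  33    6
```

Without the NA replacement, the second row of `Red` would be NA rather than 20.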