Say I have a list of data frames in R:
list.data<-list(df1=df1,df2=df2)
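All of the data frames have the same number of rows and the same number of columns. I also have a matrix m made up of TRUE/FALSE values. Suppose df1 is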
[,1] [,2]
[1,] -1.8526984 -1.3359316
[2,] -0.9391172 -1.4453051
[3,] 0.2793443 -1.0223621
[4,] 2.0174213 -1.1734235
[5,] 0.2100461 -0.1261543
and df2 is
[,1] [,2]
[1,] -1.8526984 0.1956987
[2,] 0.1737456 -1.4453051
[3,] 1.7133539 0.4562011
[4,] -0.6132369 -0.3532976
[5,] -0.5008479 1.5729352
My matrix m is
[,1] [,2]
[1,] FALSE TRUE
[2,] TRUE FALSE
[3,] TRUE TRUE
[4,] TRUE TRUE
[5,] TRUE TRUE
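I want to combine the df objects contained in list.data into a single data frame, taking the mean only for the elements in row i and column j where the matrix m is TRUE, while leaving the other elements of the data frame unchanged.
Example: the final data frame should be a 5 x 2 matrix. The (2,1) element should be the mean of df1_(2,1) and df2_(2,1), because m_(2,1) is TRUE. Because m_(1,1) is FALSE, the (1,1) element should be either df1_(1,1) or df2_(1,1).
Thanks.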
Answer 0: (score: 3)
It looks like you have a list of matrices. We can do
#Create a matrix to hold the result
result <- matrix(0, ncol = ncol(m), nrow = nrow(m))
#Find indices to calculate mean
inds <- which(m)
#Indices for which the value is to be taken as is
non_inds <- which(!m)
#Subset the indices from list of matrices and take their mean
result[inds] <- rowMeans(sapply(list.data, `[`, inds))
#Take those values unchanged from the first matrix in the list
result[non_inds] <- list.data[[1]][non_inds]
result
# [,1] [,2]
#[1,] -1.8526984 -0.5701164
#[2,] -0.3826858 -1.4453051
#[3,] 0.9963491 -0.2830805
#[4,] 0.7020922 -0.7633606
#[5,] -0.1454009 0.7233905
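As a sanity check on the approach above, the same rule can be written in one line with base R's ifelse(), which returns a result shaped like its test argument. This is only a sketch: it rebuilds the inputs with matrix() from the values in the question (without the V1/V2 column names), and the names avg and expected are made up for illustration.

```r
# Rebuild the example inputs from the values shown in the question
df1 <- matrix(c(-1.8526984, -0.9391172, 0.2793443, 2.0174213, 0.2100461,
                -1.3359316, -1.4453051, -1.0223621, -1.1734235, -0.1261543),
              ncol = 2)
df2 <- matrix(c(-1.8526984, 0.1737456, 1.7133539, -0.6132369, -0.5008479,
                0.1956987, -1.4453051, 0.4562011, -0.3532976, 1.5729352),
              ncol = 2)
m <- matrix(c(FALSE, TRUE, TRUE, TRUE, TRUE,
              TRUE, FALSE, TRUE, TRUE, TRUE), ncol = 2)

# Element-wise mean of the two matrices
avg <- (df1 + df2) / 2

# Where m is TRUE take the mean, otherwise keep the value from df1
expected <- ifelse(m, avg, df1)
expected
```

Comparing expected with result, for example via all.equal(), should confirm that both approaches agree on this data.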
Data
list.data <- list(df1 = structure(c(-1.8526984, -0.9391172, 0.2793443,
2.0174213,
0.2100461, -1.3359316, -1.4453051, -1.0223621, -1.1734235, -0.1261543
), .Dim = c(5L, 2L), .Dimnames = list(NULL, c("V1", "V2"))),
df2 = structure(c(-1.8526984, 0.1737456, 1.7133539, -0.6132369,
-0.5008479, 0.1956987, -1.4453051, 0.4562011, -0.3532976,
1.5729352), .Dim = c(5L, 2L), .Dimnames = list(NULL, c("V1",
"V2"))))
Answer 1: (score: 2)
Here is an option that does not initialize a matrix
out <- Reduce(`+`, lapply(list.data, function(x) x * NA^!m ))/2
replace(out, is.na(out), list.data[[1]][is.na(out)])
# V1 V2
#[1,] -1.8526984 -0.5701164
#[2,] -0.3826858 -1.4453051
#[3,] 0.9963491 -0.2830805
#[4,] 0.7020922 -0.7633606
#[5,] -0.1454009 0.7233905
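The NA^!m mask relies on a special case in R arithmetic: NA raised to the power 0 is 1, while NA raised to the power 1 stays NA. Since !m is coerced to 0/1, cells where m is TRUE get a multiplier of 1 and cells where m is FALSE become NA. A small sketch of that behaviour, using a toy mask keep and vector x that are made up for illustration (not the m from the question):

```r
# Toy logical mask and values, made up for illustration
keep <- c(TRUE, FALSE, TRUE)
x <- c(10, 20, 30)

# !keep is coerced to 0/1, so the mask is 1 where keep is TRUE and NA elsewhere
mask <- NA^!keep
mask

# Multiplying leaves the kept values unchanged and blanks out the rest
masked <- x * mask
masked
```

Note that this trick uses NA as the marker for "do not average", so it assumes the input matrices contain no missing values of their own.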
Or with coalesce
library(dplyr)
coalesce(Reduce(`+`, lapply(list.data, function(x) x * NA^!m ))/2, list.data[[1]])
Or the same in a pipe
library(tidyverse)
library(magrittr)
map(list.data, ~ .x * NA^ !m ) %>%
reduce(`+`) %>%
divide_by(2) %>%
coalesce(list.data[[1]])
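The snippets in this answer divide by 2 because list.data holds exactly two matrices. A minimal base R sketch for a list of any length might look like the following; combine_masked and its arguments mats and mask are hypothetical names, not part of any package, and the toy matrices a, b and cc are invented for the example.

```r
# Hypothetical helper: average the matrices in `mats` where `mask` is TRUE,
# and keep the values of the first matrix everywhere else
combine_masked <- function(mats, mask) {
  stopifnot(length(mats) >= 1, is.logical(mask))
  # Every matrix must have the same dimensions as the mask
  same_dim <- vapply(mats, function(x) identical(dim(x), dim(mask)), logical(1))
  stopifnot(all(same_dim))

  out <- mats[[1]]
  avg <- Reduce(`+`, mats) / length(mats)  # element-wise mean over the list
  out[mask] <- avg[mask]
  out
}

# Toy input with three 2 x 2 matrices
a  <- matrix(c(1, 2, 3, 4), nrow = 2)
b  <- matrix(c(3, 4, 5, 6), nrow = 2)
cc <- matrix(c(5, 6, 7, 8), nrow = 2)
mask <- matrix(c(TRUE, FALSE, FALSE, TRUE), nrow = 2)

res <- combine_masked(list(a, b, cc), mask)
res
```

With the question's data, combine_masked(list.data, m) would be the corresponding call. Starting from the first matrix and overwriting only the masked cells avoids the NA marker, so inputs that already contain missing values are not silently replaced.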
Data
list.data <- list(
  df1 = structure(c(-1.8526984, -0.9391172, 0.2793443, 2.0174213, 0.2100461,
                    -1.3359316, -1.4453051, -1.0223621, -1.1734235, -0.1261543),
                  .Dim = c(5L, 2L), .Dimnames = list(NULL, c("V1", "V2"))),
  df2 = structure(c(-1.8526984, 0.1737456, 1.7133539, -0.6132369, -0.5008479,
                    0.1956987, -1.4453051, 0.4562011, -0.3532976, 1.5729352),
                  .Dim = c(5L, 2L), .Dimnames = list(NULL, c("V1", "V2")))
)
m <- structure(c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
               .Dim = c(5L, 2L))