I have a list of data frames named my_list. Here is an example of one data frame in my_list.
> print(df1)
A B Names
1 0.8262825 0.734412 Baseline
2 1.0100000 0.734412 Sample1
3 0.8262825 0.734412 Sample2
4 1.0100000 0.734412 Sample3
5 0.8262825 0.734412 Sample4
6 1.0100000 0.734412 Sample5
7 0.8262825 0.734412 Sample6
8 1.0100000 0.734412 Sample7
9 0.8262825 0.734412 Sample8
10 1.0100000 0.734412 Sample9
11 0.8262825 0.734412 Sample10
12 1.0100000 NA AASHTO
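I want to add a new row to each data frame in my_list containing the means of columns A and B, excluding the rows whose Names column is "Baseline" or "AASHTO" (so the mean is taken only over rows Sample1 to Sample10).
Finally, I want to set the Names column as the row names of each data frame in my_list and remove the Names column from all data frames in the list.
The expected result for each data frame in my_list would be: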
A B
Baseline 0.8262825 0.734412
Sample1 1.0100000 0.734412
Sample2 0.8262825 0.734412
Sample3 1.0100000 0.734412
Sample4 0.8262825 0.734412
Sample5 1.0100000 0.734412
Sample6 0.8262825 0.734412
Sample7 1.0100000 0.734412
Sample8 0.8262825 0.734412
Sample9 1.0100000 0.734412
Sample10 0.8262825 0.734412
Mean 0.8156500 0.734412
AASHTO 1.0100000 NA
Thank you very much for your help.
Answer 0 (score: 2)
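We can loop over the list with lapply, get the colMeans of columns 'A' and 'B' for the rows where 'Names' is neither "Baseline" nor "AASHTO", and then rbind the result to the original dataset: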
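lst2 <- lapply(lst1, function(x) {
  # Column means of A and B, skipping the Baseline and AASHTO rows
  means <- colMeans(x[!x$Names %in% c("Baseline", "AASHTO"),
                      c('A', 'B')], na.rm = TRUE)
  # Append the means as a new row labelled "Mean"
  d1 <- rbind(x, data.frame(Names = "Mean", as.list(means)))
  # Use Names as the row names, then drop the Names column
  row.names(d1) <- d1$Names
  d1[setdiff(names(d1), "Names")]
})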
Or using the tidyverse:
library(dplyr)
library(purrr)
library(tibble)
map(lst1, ~ .x %>%
      # Append a "Mean" row computed from the Sample rows only
      add_row(Names = 'Mean',
              A = mean(.$A[!.$Names %in% c("Baseline", "AASHTO")],
                       na.rm = TRUE),
              B = mean(.$B[!.$Names %in% c("Baseline", "AASHTO")],
                       na.rm = TRUE)) %>%
      # Reset to automatic row names before calling column_to_rownames()
      `row.names<-`(., NULL) %>%
      column_to_rownames('Names'))
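Note that both versions above append the "Mean" row at the end, after "AASHTO", whereas the expected output in the question shows it just before "AASHTO". If that order matters, something along these lines might work. This is only a sketch: the helper name add_mean_before_aashto and the small demo data frame are made up for illustration and are not part of the original answer, and it assumes every data frame has an "AASHTO" row.

```r
# Hypothetical helper (not from the original answer): insert the "Mean" row
# directly before the "AASHTO" row instead of appending it at the end.
add_mean_before_aashto <- function(x, skip = c("Baseline", "AASHTO")) {
  # Means of A and B over the rows that are not skipped
  means <- colMeans(x[!x$Names %in% skip, c("A", "B")], na.rm = TRUE)
  mean_row <- data.frame(A = means[["A"]], B = means[["B"]], Names = "Mean",
                         stringsAsFactors = FALSE)
  # Split off the AASHTO row and put the Mean row in front of it
  is_aashto <- x$Names == "AASHTO"
  out <- rbind(x[!is_aashto, ], mean_row, x[is_aashto, ])
  # Use Names as the row names, then drop the Names column
  row.names(out) <- out$Names
  out[setdiff(names(out), "Names")]
}

# Small made-up data frame, for illustration only
demo <- data.frame(A = c(1, 2, 4, 10), B = c(3, 5, 7, NA),
                   Names = c("Baseline", "Sample1", "Sample2", "AASHTO"),
                   stringsAsFactors = FALSE)
res <- add_mean_before_aashto(demo)
print(res)
```

To apply it to the whole list, lapply(lst1, add_mean_before_aashto) should do.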
Data used in the examples above:
lst1 <- list(structure(list(A = c(0.8262825, 1.01, 0.8262825, 1.01, 0.8262825,
1.01, 0.8262825, 1.01, 0.8262825, 1.01, 0.8262825, 1.01), B = c(0.734412,
0.734412, 0.734412, 0.734412, 0.734412, 0.734412, 0.734412, 0.734412,
0.734412, 0.734412, 0.734412, NA), Names = c("Baseline", "Sample1",
"Sample2", "Sample3", "Sample4", "Sample5", "Sample6", "Sample7",
"Sample8", "Sample9", "Sample10", "AASHTO")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12")), structure(list(
A = c(0.8262825, 1.01, 0.8262825, 1.01, 0.8262825, 1.01,
0.8262825, 1.01, 0.8262825, 1.01, 0.8262825, 1.01), B = c(0.734412,
0.734412, 0.734412, 0.734412, 0.734412, 0.734412, 0.734412,
0.734412, 0.734412, 0.734412, 0.734412, NA), Names = c("Baseline",
"Sample1", "Sample2", "Sample3", "Sample4", "Sample5", "Sample6",
"Sample7", "Sample8", "Sample9", "Sample10", "AASHTO")), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12")))