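In the dataset below, I first want to find which rows share the same values in columns U and T. For all such rows, I want to compute the mean of the Mean column, the minimum of the Min column, and the maximum of the Max column.
If the rows with identical U and T values were already in separate data.frame() objects, I could do this easily, but in this case I first need to extract every such sub-data.frame() from the main data.frame() and then perform the operation.
Could someone suggest a better way, perhaps using an R library?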
Input data
data <- structure(list(A = c(0.1, 0.1, 0.1, 0.1), B = c(NA, NA, NA, NA
), C = structure(c(1L, 1L, 1L, 1L), .Label = "Yes", class = "factor"),
U = c(11L, 11L, 11L, 11L), T = structure(c(1L, 1L, 1L, 1L
), .Label = "A", class = "factor"), P = structure(c(1L, 1L,
1L, 1L), .Label = "INT", class = "factor"), Q = 1:4, R = c(0L,
0L, 0L, 0L), S = c(1L, 1L, 1L, 1L), W = structure(c(1L, 1L,
1L, 1L), .Label = "A", class = "factor"), Mean = c(21.208,
21.22333333, 21.23666667, 21.174), Min = c(21.02, 21.01,
21.09, 21.02), Max = c(21.35, 21.39, 21.47, 21.36)), class = "data.frame", row.names = c(NA,
-4L))
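Answer 0 (score: 1)
We can group by U and T, overwrite the three columns with their group-level aggregates, and keep the first row of each group: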
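library(tidyverse)  # dplyr supplies group_by(), mutate() and slice()
data %>%
  group_by(U, T) %>%  # one group per unique (U, T) pair
  # replace each value with its group-level aggregate
  mutate(Mean = mean(Mean), Min = min(Min), Max = max(Max)) %>%
  slice(1) %>%  # keep the first row of each group
  ungroup()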
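If only the grouping columns and the three aggregates are needed, summarise() may be a simpler alternative to mutate() followed by slice(1). The following is a minimal sketch rather than the answerer's code: toy is a made-up data frame, res is a hypothetical name, and the remaining columns (A, B, C and so on) are assumed not to be needed, since summarise() drops them.

```r
library(dplyr)

# Hypothetical toy data: two rows share (U, T) = (11, "A"), one row is (12, "B")
toy <- data.frame(
  U    = c(11L, 11L, 12L),
  T    = c("A", "A", "B"),
  Mean = c(21.2, 21.4, 30.0),
  Min  = c(21.0, 21.1, 29.5),
  Max  = c(21.3, 21.5, 30.5)
)

# Collapse each (U, T) group to a single row of aggregates;
# .groups = "drop" returns an ungrouped tibble
res <- toy %>%
  group_by(U, T) %>%
  summarise(Mean = mean(Mean), Min = min(Min), Max = max(Max), .groups = "drop")

print(res)
```

The result has one row per unique (U, T) pair. If the other columns must be carried along, the mutate() plus slice(1) form keeps them.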
Answer 1 (score: 1)
# columns copied unchanged from the first row of each group
nm <- names(data)[!names(data) %in% c("Mean", "Min", "Max")]
# split by the (U, T) key, summarise each piece, then stack the results
do.call(rbind, lapply(split(data, paste(data$U, data$T)), function(x) {
  data.frame(x[1, nm], Mean = mean(x$Mean), Min = min(x$Min), Max = max(x$Max))
}))
#        A  B   C  U T   P Q R S W    Mean   Min   Max
# 11 A 0.1 NA Yes 11 A INT 1 0 1 A 21.2105 21.01 21.47
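A variation on the same base R idea, sketched under the assumption that only U, T and the three summary columns are wanted: aggregate() computes one statistic per call, and merge() joins the three results on the keys. The names toy, agg_mean, agg_min, agg_max and res are hypothetical and used only for illustration.

```r
# Hypothetical toy data: two rows share (U, T) = (11, "A"), one row is (12, "B")
toy <- data.frame(
  U    = c(11L, 11L, 12L),
  T    = c("A", "A", "B"),
  Mean = c(21.2, 21.4, 30.0),
  Min  = c(21.0, 21.1, 29.5),
  Max  = c(21.3, 21.5, 30.5)
)

# One aggregate() call per statistic, each grouped by U and T
agg_mean <- aggregate(Mean ~ U + T, data = toy, FUN = mean)
agg_min  <- aggregate(Min  ~ U + T, data = toy, FUN = min)
agg_max  <- aggregate(Max  ~ U + T, data = toy, FUN = max)

# Join the three results on the grouping keys
res <- Reduce(
  function(x, y) merge(x, y, by = c("U", "T")),
  list(agg_mean, agg_min, agg_max)
)

print(res)
```

This avoids building a pasted key string, at the cost of three passes over the data.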