In the data frame below, I want to find, for each group, the date on which the maximum value occurs.
df
Date Var Value
27/9/2019 A 56
28/9/2019 A 50
1/10/2019 B 90
2/10/2019 B 100
df1 Max Date Mean
A 56 27/9/2019 53
B 100 2/10/2019 95
Answer 0 (score: 1)
We can group_by Var, compute the mean of Value, and select the row with the maximum value in each group.
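The code that accompanied this answer did not survive in the text, so what follows is only a minimal sketch of the approach it describes, not the answerer's original code. It assumes dplyr is installed, rebuilds df from the question with Date kept as character, and borrows the column names Max and Mean from the requested output.

```r
library(dplyr)

# Sample data from the question (Date kept as character, as posted)
df <- data.frame(
  Date  = c("27/9/2019", "28/9/2019", "1/10/2019", "2/10/2019"),
  Var   = c("A", "A", "B", "B"),
  Value = c(56, 50, 90, 100),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Var) %>%
  mutate(Mean = mean(Value)) %>%      # group mean, repeated on every row of the group
  filter(Value == max(Value)) %>%     # keep the row(s) holding the group maximum
  ungroup() %>%
  select(Var, Max = Value, Date, Mean)

result
```

Computing the mean with mutate before filtering matters here: once the non-maximum rows are dropped, the group mean can no longer be recovered.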
Answer 1 (score: 0)
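There may be a better way to do this. Since you want a summary, which reduces multiple values to a single value per group, you can build the summary table and *_join it with the output of filter, as shown below: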
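library(dplyr)
# Rows holding the maximum Value within each group
# (keep the Var name so the join below can find it):
df1 <- df %>%
  group_by(Var) %>%
  filter(Value == max(Value)) %>%
  ungroup() %>%
  select(Var, Value, Date)
# Per-group mean and standard deviation of Value:
df2 <- df %>%
  group_by(Var) %>%
  summarise_at(.vars = vars(Value),
               .funs = c(mean = "mean", sd = "sd"))
# Attach the maximum rows to the summary table:
df2 %>%
  left_join(df1, by = "Var") %>%
  select(Var, Value, Date, mean, sd)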
# -------------------------------------------------------------------------
# # A tibble: 2 x 5
#   Var   Value Date       mean    sd
#   <chr> <dbl> <chr>     <dbl> <dbl>
# 1 A        56 27/9/2019    53  4.24
# 2 B       100 2/10/2019    95  7.07
I hope this is what you were looking for.
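One caveat of the filter step: Value == max(Value) keeps every row tied for the maximum, so a group with ties would show up more than once after the join. If exactly one row per group is wanted, a possible sketch uses slice_max(), which is available in dplyr 1.0.0 and later. The extra tied row for group A below is hypothetical and exists only to illustrate the point.

```r
library(dplyr)

# Data from the question plus one hypothetical tied row for group A
df_ties <- data.frame(
  Date  = c("27/9/2019", "28/9/2019", "29/9/2019", "1/10/2019", "2/10/2019"),
  Var   = c("A", "A", "A", "B", "B"),
  Value = c(56, 50, 56, 90, 100),
  stringsAsFactors = FALSE
)

# filter() keeps both tied rows for group A
all_max_rows <- df_ties %>%
  group_by(Var) %>%
  filter(Value == max(Value)) %>%
  ungroup()

# slice_max() with with_ties = FALSE keeps a single row per group
one_per_group <- df_ties %>%
  group_by(Var) %>%
  slice_max(Value, n = 1, with_ties = FALSE) %>%
  ungroup()
```

Whether duplicates are a problem depends on the data; if every date on which the maximum occurred should be reported, the filter version is the one to keep.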
Answer 2 (score: 0)
Base R, split-apply-combine (edited):
# Create df and make sure the Date column has the Date type
# (%Y matches a four-digit year; %y would read only two digits):
df <- data.frame(
  Date = as.Date(c("27/9/2019", "28/9/2019", "1/10/2019", "2/10/2019"), "%d/%m/%Y"),
  Var = c("A", "A", "B", "B"),
  Value = c(56, 50, 90, 100),
  stringsAsFactors = FALSE
)
# Split df by "Var" values and summarise each piece:
split_applied_combined <- lapply(split(df, df$Var), function(x){
  # Date(s) on which the group's maximum Value occurs:
  max_date <- x$Date[which(x$Value == max(x$Value))]
  # Calculate the mean:
  mean_val <- mean(x$Value)
  # Calculate the std_dev:
  sd_val <- sd(x$Value)
  # Combine vects into df and return it:
  data.frame(max_date, mean_val, sd_val)
})
# Combine list back into dataframe:
split_applied_combined <- do.call(rbind,
  # Store df name as vect:
  mapply(cbind,
         "Var" = names(split_applied_combined),
         split_applied_combined,
         SIMPLIFY = FALSE))
dplyr alternative:
library(dplyr)
# Group by Var and summarise; the date is taken from the row that holds
# the group's maximum Value (max(Date) would return the latest date instead):
summarised_df <-
  df %>%
  group_by(Var) %>%
  summarise(max_date_per_group = Date[which.max(Value)],
            mean_val_per_group = mean(Value),
            sd_per_group = sd(Value)) %>%
  ungroup()