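Find the date on which the maximum value occurs

Time: 2019-10-13 03:06:32

Tags: r

In the data frame below, how can I find the date on which the maximum value occurs for each group?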

df
Date        Var    Value
27/9/2019    A       56
28/9/2019    A       50
1/10/2019    B       90
2/10/2019    B       100

df1      Max         Date          Mean
A        56        27/9/2019        53
B        100       2/10/2019        95

3 answers:

Answer 0 (score: 1)

We can group_by Var, compute the mean of Value, and select the row with the maximum value.
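The code for this answer did not survive on this page, so the following is only a minimal sketch of the steps described above, not the answerer's original code. It assumes dplyr is installed, rebuilds the question's sample data with Date kept as plain text, and uses `result` as an illustrative name.

```r
library(dplyr)

# Sample data from the question; Date is kept as plain text here.
df <- data.frame(
  Date  = c("27/9/2019", "28/9/2019", "1/10/2019", "2/10/2019"),
  Var   = c("A", "A", "B", "B"),
  Value = c(56, 50, 90, 100),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Var) %>%
  mutate(Mean = mean(Value)) %>%   # group mean, repeated on every row of the group
  slice(which.max(Value)) %>%      # keep the first row holding the group maximum
  ungroup() %>%
  select(Var, Max = Value, Date, Mean)

print(result)
```

Because `which.max()` returns only the first position of the maximum, this keeps a single row per group even when the maximum is tied.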

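Answer 1 (score: 0)

There may be a better way to do this. Since you want a summary, which reduces multiple values to a single value, you can *_join the output of filter with the summary table, as follows: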


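library(dplyr)

# Rows holding each group's maximum Value (ties are all kept):
df1 <- df %>%
  group_by(Var) %>%
  filter(Value == max(Value)) %>%
  select(Var, Value, Date)

# Per-group mean and standard deviation of Value:
df2 <- df %>%
  group_by(Var) %>%
  summarise_at(.vars = vars(Value),
               .funs = c(mean = "mean", sd = "sd"))

# Attach the maximum and its date to the summary table:
df2 %>%
  left_join(df1, by = "Var") %>%
  select(Var, Value, Date, mean, sd)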

# -------------------------------------------------------------------------

# # A tibble: 2 x 5
#   Var   Value Date       mean    sd
#   <chr> <dbl> <chr>     <dbl> <dbl>
# 1 A        56 27/9/2019    53  4.24
# 2 B       100 2/10/2019    95  7.07

Hope this is what you were looking for.

Answer 2 (score: 0)

Base R, split-apply-combine (edited):

# Create df; parse Date with a four-digit year (%Y):
df <- data.frame(
  Date = as.Date(c("27/9/2019", "28/9/2019", "1/10/2019", "2/10/2019"), "%d/%m/%Y"),
  Var = c("A", "A", "B", "B"),
  Value = c(56, 50, 90, 100),
  stringsAsFactors = FALSE
)

# Split df by "Var" values and summarise each piece:
split_applied_combined <- lapply(split(df, df$Var), function(x){
    # Date(s) on which the maximum Value occurs:
    max_date <- x$Date[which(x$Value == max(x$Value))]
    # Calculate the mean:
    mean_val <- mean(x$Value)
    # Calculate the std_dev:
    sd_val <- sd(x$Value)
    # Combine vects into df and return it:
    data.frame(max_date, mean_val, sd_val)
    }
  )

# Combine list back into dataframe, storing each list name as "Var":
split_applied_combined <- do.call(rbind,
                                  mapply(cbind,
                                         "Var" = names(split_applied_combined),
                                         split_applied_combined,
                                         SIMPLIFY = FALSE))

dplyr alternative:

library(dplyr)

# Group by Var and summarise; Date[which.max(Value)] picks the date of the
# largest Value, whereas max(Date) would return the latest date in the group:
summarised_df <-
  df %>%
  group_by(Var) %>%
  summarise(max_date_per_group = Date[which.max(Value)],
            mean_val_per_group = mean(Value),
            sd_per_group = sd(Value)) %>%
  ungroup()
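When a group's maximum occurs on more than one date, the choice of selection function matters. The following is a minimal sketch, assuming dplyr is installed, that contrasts `which.max()` with `filter(Value == max(Value))`. The `tied` data frame is hypothetical and not from the question, and `first_only` and `all_ties` are illustrative names.

```r
library(dplyr)

# Hypothetical data with a tie in group A (not from the question).
tied <- data.frame(
  Date  = as.Date(c("2019-09-27", "2019-09-28", "2019-10-01"), "%Y-%m-%d"),
  Var   = c("A", "A", "B"),
  Value = c(56, 56, 90),
  stringsAsFactors = FALSE
)

# which.max() returns only the first position of the maximum,
# so each group yields exactly one date.
first_only <- tied %>%
  group_by(Var) %>%
  summarise(max_date = Date[which.max(Value)])

# filter() keeps every row equal to the group maximum,
# so a tied group yields several rows.
all_ties <- tied %>%
  group_by(Var) %>%
  filter(Value == max(Value)) %>%
  ungroup()
```

Which behaviour is appropriate depends on whether one date per group is required or every date on which the maximum occurred should be reported.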