Find, by group, the minimum value among a given row and the rows below it

Time: 2018-10-12 19:24:32

Tags: r dplyr tidyverse

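I have a data frame that needs to be grouped by Mat. For each row, I want the smallest value among that row and all rows below it within the same group. The code below shows test data and my current approach. The column min_value is the desired output. Is there a more tidyverse way to do this?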

df <- structure(list(Mat = c("A", "A", "A", "A", "A", "A", "B", "B", 
"B", "B", "B", "B"), value = c(11L, 22L, 0L, 22L, 33L, 43L, 108L, 
152L, 0L, 486L, 706L, 830L)), .Names = c("Mat", "value"), class = "data.frame", row.names = c(NA, 
-12L))

library(dplyr)

group_by(df, Mat) %>%
  mutate(
    min_value = sapply(row_number(), function(a) min(value[a:length(value)]))
  ) %>%
  ungroup()

# A tibble: 12 x 3
   Mat   value min_value
   <chr> <int>     <int>
 1 A        11         0
 2 A        22         0
 3 A         0         0
 4 A        22        22
 5 A        33        33
 6 A        43        43
 7 B       108         0
 8 B       152         0
 9 B         0         0
10 B       486       486
11 B       706       706
12 B       830       830

1 Answer:

Answer 0: (score: 2)

You could do...

df %>% group_by(Mat) %>% mutate(v = rev(cummin(rev(value))))

# or maybe more 'verse-like
df %>% group_by(Mat) %>% mutate(v = value %>% rev %>% cummin %>% rev)

# or ...
below_min = . %>% rev %>% cummin %>% rev
df %>% group_by(Mat) %>% mutate(v = below_min(value))
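
The one-liners above work because the minimum over "this row and everything below it" is a cumulative minimum taken from the bottom up: reverse the vector, apply cummin, and reverse back. As a rough sketch of the same idea without dplyr, the base R version below rebuilds the test data and uses ave() for the grouping. The helper name suffix_min is made up for illustration and is not part of any package.

```r
# Illustrative helper (hypothetical name): for each i, the minimum of x[i], ..., x[n].
suffix_min <- function(x) rev(cummin(rev(x)))

# Same test data as in the question.
df <- data.frame(
  Mat   = rep(c("A", "B"), each = 6),
  value = c(11L, 22L, 0L, 22L, 33L, 43L, 108L, 152L, 0L, 486L, 706L, 830L)
)

# ave() applies FUN within each level of Mat and returns the results
# in the original row order.
df$min_value <- ave(df$value, df$Mat, FUN = suffix_min)
df$min_value
# 0 0 0 22 33 43 0 0 0 486 706 830
```

This takes a single pass per group, whereas the sapply version in the question recomputes min() over a shrinking slice for every row.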

Or with data.table (which modifies df in place)...

library(data.table)
setDT(df)

# Visit the rows in reverse order so cummin() runs bottom-up within each Mat group
df[order(.N:1), v := cummin(value), by=Mat]

In case anyone is interested, there is also a similar Q&A here: