Return the mean of the last or first two rows of a different group (indicated by a variable)

Time: 2018-09-18 08:37:47

Tags: r tidyverse

This is a follow-up to this question. With data like the following:

data <- structure(list(seq = c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 
4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 
7L, 7L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L), new_seq = c(2, 2, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
2, 2, 2, 2, NA, NA, NA, NA, NA, 4, 4, 4, 4, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 6, 6, 6, 6, 6, NA, NA, 8, 8, 8, NA, NA, NA), value = c(2L, 
0L, 0L, 3L, 0L, 5L, 5L, 3L, 0L, 3L, 2L, 3L, 2L, 3L, 4L, 1L, 0L, 
0L, 0L, 1L, 1L, 0L, 2L, 5L, 3L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 3L, 
5L, 3L, 1L, 1L, 1L, 0L, 1L, 0L, 4L, 3L, 0L, 3L, 1L, 3L, 0L, 0L, 
1L, 0L, 0L, 3L, 4L, 5L, 3L, 5L, 3L, 5L, 0L, 1L, 1L, 3L, 2L, 1L, 
0L, 0L, 0L, 0L, 5L, 1L, 1L, 0L, 4L, 1L, 5L, 0L, 3L, 1L, 2L, 1L, 
0L, 3L, 0L, 1L, 1L, 3L, 0L, 1L, 1L, 2L, 2L, 1L, 0L, 4L, 0L, 0L, 
3L, 0L, 0L)), row.names = c(NA, -100L), class = c("tbl_df", "tbl", 
"data.frame"))

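For each value of new_seq that is not NA, I need to calculate the mean of 2 observations from the respective group in seq (the value of new_seq refers to a value of seq). The problem is that:

  • for rows where new_seq refers to a value of seq that appears AFTER the row (e.g. rows 1:2), it should be the mean of the 2 FIRST rows of the respective group,
  • for rows where new_seq refers to a value of seq that appears BEFORE the row, it should be the mean of the 2 LAST rows of the respective group.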
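
To make the two cases concrete, here is a minimal base R sketch on a tiny made-up data frame. The `toy` data and the helper `mean_of_two` are hypothetical names introduced only for illustration; they are not part of the question's data or of any package.

```r
# Toy data (hypothetical): three groups in `seq`; `new_seq` points at group 2
toy <- data.frame(
  seq     = c(1, 1, 2, 2, 2, 3, 3),
  new_seq = c(2, 2, NA, NA, NA, 2, 2),
  value   = c(9, 9, 1, 3, 10, 9, 9)
)

# Hypothetical helper: mean of the first or last two `value`s of group `target`
mean_of_two <- function(df, target, from_start) {
  v <- df$value[df$seq == target]
  if (from_start) mean(head(v, 2)) else mean(tail(v, 2))
}

# Rows with seq == 1: group 2 comes AFTER them, so take its FIRST two rows (1, 3)
first_case <- mean_of_two(toy, target = 2, from_start = TRUE)

# Rows with seq == 3: group 2 comes BEFORE them, so take its LAST two rows (3, 10)
second_case <- mean_of_two(toy, target = 2, from_start = FALSE)

print(c(first_case, second_case))
```

So in this toy case the rows of group 1 would receive the mean of the first two values of group 2, and the rows of group 3 would receive the mean of its last two values.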

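@Z.Lin provided an excellent solution for the second case, but how can it be adjusted to handle both cases? Perhaps there is also another solution with tidyverse?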

1 answer:

Answer 0: (score: 0)

I think I figured it out, so I'm posting an answer for anyone who arrives here from a search.

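library(dplyr)
library(tidyr)  # for replace_na()

# Mean of the LAST two rows of each seq group
lookup_backwards <- data %>%
  group_by(seq) %>%
  mutate(rank = rev(row_number())) %>%  # 1 = last row of the group
  filter(rank <= 2) %>%
  summarise(backwards = mean(value)) %>%
  ungroup()

# Mean of the FIRST two rows of each seq group
lookup_forwards <- data %>%
  group_by(seq) %>%
  mutate(rank = row_number()) %>%       # 1 = first row of the group
  filter(rank <= 2) %>%
  summarise(forwards = mean(value)) %>%
  ungroup()

# Attach both lookups via new_seq, then pick the one matching the direction
data %>%
  left_join(lookup_backwards, by = c('new_seq' = 'seq')) %>%
  left_join(lookup_forwards, by = c('new_seq' = 'seq')) %>%
  replace_na(list(backwards = 0, forwards = 0)) %>%
  mutate(new_column = ifelse(new_seq > seq, forwards, backwards))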
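
The same idea can probably be written more compactly by building a single lookup table with `head()` and `tail()`. The following is only a sketch under assumptions: the `toy` data frame is made up for illustration, and `lookup` and `result` are hypothetical names, not part of the answer above. Unlike the answer, it drops the helper columns at the end, and it returns NA rather than guessing if new_seq ever equals seq.

```r
library(dplyr)

# Toy data (hypothetical), same shape as the question's data
toy <- data.frame(
  seq     = c(1, 1, 2, 2, 2, 3, 3),
  new_seq = c(2, 2, NA, NA, NA, 2, 2),
  value   = c(9, 9, 1, 3, 10, 9, 9)
)

# One lookup row per group: mean of its first two and of its last two values
lookup <- toy %>%
  group_by(seq) %>%
  summarise(
    forwards  = mean(head(value, 2)),
    backwards = mean(tail(value, 2))
  )

result <- toy %>%
  left_join(lookup, by = c("new_seq" = "seq")) %>%
  mutate(new_column = case_when(
    new_seq > seq ~ forwards,   # referenced group comes later: first two rows
    new_seq < seq ~ backwards   # referenced group comes earlier: last two rows
  )) %>%                        # rows with NA new_seq match nothing and stay NA
  select(-forwards, -backwards)

print(result)
```

Computing both means in one `summarise()` avoids ranking and filtering the rows twice, at the cost of being slightly less explicit about which rows are used.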