Selecting specific rows in dplyr

Time: 2018-10-11 14:00:09

Tags: r dplyr row

Sample data:

dat <- structure(list(value = c(860L, 860L, 835L, 835L, 870L, 820L, 820L, 850L, 850L, 810L,
                                 852L, 840L, 840L, 825L, 825L, 900L, 900L, 830L,
                                 830L, 865L, 865L, 822L, 822L, 882L, 882L, 867L, 867L, 725L,
                                 725L, 727L, 727L, 874L, 874L), 
                  loc.id = c(12L, 13L, 12L, 13L, 12L, 12L, 13L, 12L, 13L, 12L,
                             13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 
                             13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L)), 
                  class = "data.frame", row.names = c(NA, -33L))

library(dplyr)  # provides %>% and n()

dat <- dat %>% dplyr::arrange(loc.id, value)

dat <- dat %>% dplyr::group_by(loc.id) %>%
  dplyr::mutate(length.val = n(), points.l = ceiling(length.val / 4))

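For each loc.id, I want to select four rows by their position within the group (taking loc.id == 12 as the example, which has 17 rows and points.l = 5):
1) the first row, i.e. row number 1,
2) the first row + points.l, which for loc.id 12 is the sixth row (1 + 5),
3) the last row - points.l, which for loc.id 12 is row 12 (17 - 5),
4) the last row, which is row number 17. Something like: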

  dat %>% group_by(loc.id) %>% 
            dplyr::filter(row_number() == 1st row,
                          row_number() == 1st row + points.l,
                          row_number() == last row - points.l,
                          row_number() == last row)
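
To make the index arithmetic concrete, here is a small base R sketch of the four positions for a group of a given size. The helper name `pick_positions` is hypothetical and only for illustration; it is not part of dplyr.

```r
# Hypothetical helper: the four within-group positions for a group of n rows.
pick_positions <- function(n) {
  step <- ceiling(n / 4)        # same definition as points.l
  c(1, 1 + step, n - step, n)   # first, first + step, last - step, last
}

pick_positions(17)  # loc.id == 12 has 17 rows: 1 6 12 17
pick_positions(16)  # loc.id == 13 has 16 rows: 1 5 12 16
```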

1 Answer:

Answer 0 (score: 0)

Simply:

 dat %>% group_by(loc.id) %>% filter(row_number() %in% c(1,1+points.l,n()-points.l,n()))
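
As a possible alternative, the same positions can be passed to `slice()`, which selects rows by index within each group. The sketch below is self-contained and uses toy data (an assumption for illustration) with the same group sizes as the question, so the selected positions are easy to read off.

```r
library(dplyr)

# Toy data (an assumption for illustration): 17 rows for loc.id 12 and
# 16 rows for loc.id 13, with value equal to the within-group row number.
toy <- data.frame(
  loc.id = rep(c(12L, 13L), times = c(17L, 16L)),
  value  = c(seq_len(17), seq_len(16))
)

picked <- toy %>%
  group_by(loc.id) %>%
  mutate(points.l = ceiling(n() / 4)) %>%
  # first() collapses the constant points.l column to one number per group;
  # unique() avoids repeating a row when two positions coincide
  slice(unique(c(1, 1 + first(points.l), n() - first(points.l), n()))) %>%
  ungroup()

picked$value  # 1 6 12 17 1 5 12 16
```

One difference worth noting: `filter(row_number() %in% ...)` keeps each matching row once by construction, whereas `slice()` would return a row twice if the same index were passed twice, hence the `unique()` call.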