Sample data:
dat <- structure(list(value = c(860L, 860L, 835L, 835L, 870L, 820L, 820L, 850L, 850L, 810L,
852L, 840L, 840L, 825L, 825L, 900L, 900L, 830L,
830L, 865L, 865L, 822L, 822L, 882L, 882L, 867L, 867L, 725L,
725L, 727L, 727L, 874L, 874L),
loc.id = c(12L, 13L, 12L, 13L, 12L, 12L, 13L, 12L, 13L, 12L,
13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L,
13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L, 12L, 13L)),
class = "data.frame", row.names = c(NA, -33L))
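library(dplyr)  # attaches %>% and n(), which are used unqualified below

# Sort within each location, then record the group size and a quarter-size step
dat <- dat %>% dplyr::arrange(loc.id, value)
dat <- dat %>%
  dplyr::group_by(loc.id) %>%
  dplyr::mutate(length.val = n()) %>%
  dplyr::mutate(points.l = ceiling(length.val / 4))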
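For each loc.id, I want to select four rows by their position within the group (taking loc.id == 12, which has 17 rows and points.l = 5, as an example):
1) the first row, i.e. row number 1,
2) the first row + points.l, which for loc.id 12 is row 6,
3) the last row - points.l, which for loc.id 12 is row 12 (17 - 5),
4) the last row, which is row number 17.
Something like: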
dat %>% group_by(loc.id) %>%
dplyr::filter(row_number() == 1st row,
row_number() == 1st row + points.l,
row_number() == last row - points.l,
row_number() == last row)
Answer 0 (score: 0)
Simply:
dat %>%
  group_by(loc.id) %>%
  filter(row_number() %in% c(1, 1 + points.l, n() - points.l, n()))
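
To see why `%in%` is used here rather than several comma-separated conditions: `filter()` combines comma-separated conditions with AND, so a row would have to be the first, the last, and both offset rows at once, and nothing would match. The following is a minimal sketch on a small made-up data frame (`toy` and its values are hypothetical, not the data from the question) that shows the four positions being picked per group.

```r
library(dplyr)

# Hypothetical toy data: 9 rows for group "a", 8 rows for group "b"
toy <- data.frame(
  loc.id = rep(c("a", "b"), times = c(9, 8)),
  value  = c(1:9, 101:108)
)

picked <- toy %>%
  arrange(loc.id, value) %>%
  group_by(loc.id) %>%
  mutate(points.l = ceiling(n() / 4)) %>%
  # %in% keeps a row if its position matches ANY of the four targets;
  # comma-separated conditions inside filter() would be AND-ed instead
  filter(row_number() %in% c(1, 1 + points.l, n() - points.l, n())) %>%
  ungroup()

picked
```

With this toy data, group "a" (9 rows, points.l = 3) should keep positions 1, 4, 6 and 9, and group "b" (8 rows, points.l = 2) should keep positions 1, 3, 6 and 8. Note that in very small groups some of the four positions can coincide or fall outside the group, in which case fewer than four rows come back.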