Filter the first n days of each group

Time: 2019-03-06 04:38:21

Tags: r dplyr

I want to select the rows from the first two days of each group (grouped by id) and store the result as data3.

Here is my data:

x <- c("1jan1960", "1jan1960", "2jan1960", "3jan1960",
       "1jan1960", "2jan1960", "3jan1960", "3jan1960","4jan1960", "4jan1960",
       "1jan1960", "2jan1960", "2jan1960", "3jan1960","3jan1960", "4jan1960", "5jan1960", "5jan1960","6jan1960",
       "1jan1960", "2jan1960", "3jan1960", "30jan1960",
       "1jan1960", "1jan1960", "2jan1960", "3jan1960","3jan1960", "4jan1960")
z <- as.Date(x, "%d%b%Y")
set.seed(0302)
data<- data.frame(id=c(1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5,5,5),
                  glucose=rnorm(29,100,5),
                  date=z)

library(dplyr)  # provides %>% and group_by()

data2 <- data %>% group_by(id)

1 answer:

Answer 0 (score: 1)

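We can filter each group for its minimum date and the minimum date + 1, i.e. the earliest day and the calendar day right after it: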

library(dplyr)

data3 <- data %>%
  group_by(id) %>%
  filter(date %in% c(min(date), min(date) + 1))

data3

#      id glucose  date      
#   <dbl>   <dbl> <date>    
# 1     1   101.  1960-01-01
# 2     1   102.  1960-01-01
# 3     1    98.7 1960-01-02
# 4     2   103.  1960-01-01
# 5     2   105.  1960-01-02
# 6     3   103.  1960-01-01
# 7     3    92.6 1960-01-02
# 8     3    96.3 1960-01-02
# 9     4    96.4 1960-01-01
#10     4   102.  1960-01-02
#11     5   101.  1960-01-01
#12     5    95.7 1960-01-01
#13     5    94.5 1960-01-02
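
The filter above keeps the earliest date and the calendar day immediately after it. If a group's dates can have gaps (for example 1 Jan followed by 5 Jan), "first two days" may instead mean the first two distinct dates that actually occur. A minimal sketch of that variant, assuming this is the intended reading, uses dense_rank() on a small made-up data frame (toy and first_two_days are hypothetical names, not part of the question):

```r
library(dplyr)

# Made-up data: group 2 has a gap between its first and second date
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  date = as.Date(c("1960-01-01", "1960-01-01", "1960-01-02",
                   "1960-01-01", "1960-01-05", "1960-01-06"))
)

# dense_rank() gives tied dates the same rank and leaves no gaps in the ranks,
# so rank <= 2 keeps every row on the first two distinct dates of each group
first_two_days <- toy %>%
  group_by(id) %>%
  filter(dense_rank(date) <= 2) %>%
  ungroup()

first_two_days
```

On data without gaps, such as the data in the question, this should select the same rows as the min(date) / min(date) + 1 filter.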

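Alternatively, as @NelsonGon suggested, use top_n. Note that top_n(-2, date) ranks rows rather than days: it keeps the two rows with the earliest dates in each group (plus ties), so the result can differ from the filter above when the earliest date is repeated.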

data3 <- data %>% group_by(id) %>% top_n(-2, date)
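
In dplyr 1.0.0 and later, top_n() is superseded by slice_min() and slice_max(). The sketch below, assuming such a dplyr version is installed, shows the row-based behaviour on a made-up group whose earliest date occurs twice (toy and by_rows are hypothetical names):

```r
library(dplyr)

# Made-up single group: the earliest date appears on two rows
toy <- data.frame(
  id   = c(1, 1, 1, 1),
  date = as.Date(c("1960-01-01", "1960-01-01", "1960-01-02", "1960-01-03"))
)

# slice_min() keeps the n rows with the smallest dates (ties included by default),
# so here both selected rows fall on the same day
by_rows <- toy %>%
  group_by(id) %>%
  slice_min(date, n = 2) %>%
  ungroup()

by_rows
```

Here only the two 1960-01-01 rows are kept, which is why a row-based selection is not interchangeable with a day-based filter when dates repeat.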