I want to use group_by to draw a random sample from a data set grouped by multiple conditions:
output<-files %>% group_by(location, time) %>% sample_n(3)
However, is there a way to restrict the sample to specific values within each condition? Something like this:
output<-files %>% group_by(location(c[1:2]), time(c[00:00:00-01:00:00])) %>% sample_n(3)
Original data frame:
Location Time
1 00:00:00
1 00:02:22
1 00:04:12
1 00:30:00
1 01:00:00
1 01:27:00
1 02:00:00
1 03:00:00
1 03:31:00
2 00:00:00
2 00:03:33
2 00:04:44
2 01:00:00
2 02:00:00
2 03:00:00
3 00:00:00
3 01:00:00
3 02:00:00
3 03:00:00
The output might look like this (the data frame is kept small for simplicity):
Location Time
1 00:00:00
1 00:02:22
1 01:00:00
2 00:00:00
2 00:03:33
2 00:04:44
Answer 0 (score: 1)
Maybe this will help:
library(chron)
library(dplyr)
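df1 %>%
  # keep only rows whose time falls between 00:00:00 and 01:00:00
  filter(times(Time) >= times('00:00:00') & times(Time) <= times('01:00:00')) %>%
  # or use between()
  # filter(between(times(Time), times('00:00:00'), times('01:00:00'))) %>%
  group_by(Location) %>%
  # drop locations with fewer than 3 remaining rows, since sample_n(3) would fail on them
  filter(n() >= 3) %>%
  sample_n(3)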
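If you also want to restrict the locations explicitly (1 and 2 in the question) rather than rely on the row count, here is a minimal sketch of one way to do it without chron. It assumes that Time is stored as zero-padded HH:MM:SS character strings, so that plain string comparison matches chronological order, and that dplyr 1.0.0 or later is installed for slice_sample(). The df1 below is rebuilt from the data shown in the question, and the seed is only there to make the draw reproducible.

```r
library(dplyr)

# Data copied from the question; Time is kept as a character column (assumption).
df1 <- data.frame(
  Location = c(rep(1, 9), rep(2, 6), rep(3, 4)),
  Time = c("00:00:00", "00:02:22", "00:04:12", "00:30:00", "01:00:00",
           "01:27:00", "02:00:00", "03:00:00", "03:31:00",
           "00:00:00", "00:03:33", "00:04:44", "01:00:00", "02:00:00",
           "03:00:00",
           "00:00:00", "01:00:00", "02:00:00", "03:00:00"),
  stringsAsFactors = FALSE
)

set.seed(42)  # arbitrary seed, only for reproducibility

output <- df1 %>%
  # Keep locations 1 and 2 and the first hour only.
  # Zero-padded HH:MM:SS strings sort chronologically, so string
  # comparison is enough here (this would not hold for "1:00:00").
  filter(Location %in% 1:2,
         Time >= "00:00:00", Time <= "01:00:00") %>%
  group_by(Location) %>%
  # Draw 3 rows per location without replacement.
  slice_sample(n = 3) %>%
  ungroup()

print(output)
```

With this data, location 1 has five rows in the time window and location 2 has four, so each contributes three sampled rows. Unlike sample_n(), slice_sample() returns all available rows when a group has fewer than n, so keep the filter(n() >= 3) step if such groups should be dropped instead.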