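I have been learning R for a few days and am trying to produce a specific output, but I cannot figure out how to select the days before an event.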
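I am trying to determine how recharge events affect the detection of contaminants in water samples. My data has 7 columns (date, month, day, year, samp, precip, snow), where date is the full date (written month/day/year in the sample below), month, day and year are what they say, samp is 0, 1, or NA, and precip and snow hold the daily rain or snow totals. There is one row per day.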
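I want to explore rain events over different numbers of days before a sampling event. I want to select the rows with a 0 (sample, no detection) or a 1 (sample with detection), then select the days before each of them, for example 5 in this case, and compare the mean, sum, etc. of those preceding rows between the 0 and 1 groups.
I have found ways to select rows and to count consecutive days that have a 0 or 1 (How to subset consecutive rows if they meet a condition), but I cannot figure out how to select a number of days before each sampling event and create a new table.
I have more than 800 sampling days spanning more than 20 years, so I would rather not type in every date (Subset dataframe where date is within x days of a vector of dates in R).
I have tried the %>% pipe and a few other selection methods, and I have been able to select the rows containing a 0 or 1, but I do not understand how to get the days before a sampling day. I am looking for any direction, advice, or functions/tools/packages, because I have run out of new avenues to explore.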
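I want to subset my data so that I can eventually run some simple statistics. This is an exploratory project for me: learning R and practicing statistics. I would like to run t-tests and ANOVA on different numbers of days before sampling. I am selecting the sample days so that I can eventually ask a question such as "How does rainfall in the 5 days before a sampling event affect detection?" or "Is the mean rainfall in the 5 days before a positive detection different from the mean in the 5 days before a negative detection?" Hopefully this background on my goal helps explain what I am looking for.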
How my data looks now:
date month day year samp precip snow
11/11/1988 11 11 1988 NA 0 0
11/12/1988 11 12 1988 NA 0 0
11/13/1988 11 13 1988 NA 0.55 0
11/14/1988 11 14 1988 NA 0 0
11/15/1988 11 15 1988 NA 0 0
11/16/1988 11 16 1988 NA 0.52 0
11/17/1988 11 17 1988 NA 0 0
11/18/1988 11 18 1988 NA 0 0
11/19/1988 11 19 1988 NA 0 0
11/20/1988 11 20 1988 NA 0.39 0
11/21/1988 11 21 1988 NA 0.43 0
11/22/1988 11 22 1988 NA 0 0
11/23/1988 11 23 1988 NA 0 0
11/24/1988 11 24 1988 NA 0 0
11/25/1988 11 25 1988 NA 0 0
11/26/1988 11 26 1988 NA 0.11 0
11/27/1988 11 27 1988 NA 0.08 0
11/28/1988 11 28 1988 NA 0.01 0
11/29/1988 11 29 1988 NA 0 0
11/30/1988 11 30 1988 NA 0 0
12/1/1988 12 1 1988 NA 0 0
12/2/1988 12 2 1988 NA 0 0
12/3/1988 12 3 1988 NA 0 0
12/4/1988 12 4 1988 NA 0 0
12/5/1988 12 5 1988 NA 0 0
12/6/1988 12 6 1988 NA 0 0
12/7/1988 12 7 1988 NA 0 0
12/8/1988 12 8 1988 NA 0 0
12/9/1988 12 9 1988 NA 0 0
12/10/1988 12 10 1988 NA 0 0
12/11/1988 12 11 1988 NA 0 0
12/12/1988 12 12 1988 NA 0 0
12/13/1988 12 13 1988 NA 0.03 1
12/14/1988 12 14 1988 NA 0 0
12/15/1988 12 15 1988 NA 0 0
12/16/1988 12 16 1988 NA 0 0
12/17/1988 12 17 1988 NA 0 2
12/18/1988 12 18 1988 NA 0 0
12/19/1988 12 19 1988 NA 0 0
12/20/1988 12 20 1988 NA 0.07 0
12/21/1988 12 21 1988 NA 0.02 0
12/22/1988 12 22 1988 NA 0 0
12/23/1988 12 23 1988 NA 1.3 0
12/24/1988 12 24 1988 NA 0 0
12/25/1988 12 25 1988 NA 0 0
12/26/1988 12 26 1988 NA 0 0
12/27/1988 12 27 1988 NA 0.85 3
12/28/1988 12 28 1988 NA 0.37 3
12/29/1988 12 29 1988 NA 0 0
12/30/1988 12 30 1988 NA 0 0
12/31/1988 12 31 1988 NA 0 0
1/1/1989 1 1 1989 NA 0 0
1/2/1989 1 2 1989 NA 0 0
1/3/1989 1 3 1989 NA 0 0
1/4/1989 1 4 1989 NA 0 0
1/5/1989 1 5 1989 NA 0 0
1/6/1989 1 6 1989 NA 0.54 0
1/7/1989 1 7 1989 NA 0 0
1/8/1989 1 8 1989 NA 0.08 0
1/9/1989 1 9 1989 NA 0 0
1/10/1989 1 10 1989 NA 0 0
1/11/1989 1 11 1989 NA 0 0
1/12/1989 1 12 1989 NA 0 0
1/13/1989 1 13 1989 NA 0 0
1/14/1989 1 14 1989 NA 0 0
1/15/1989 1 15 1989 NA 0.04 1
1/16/1989 1 16 1989 NA 0 0
1/17/1989 1 17 1989 NA 0 0
1/18/1989 1 18 1989 NA 0 0
1/19/1989 1 19 1989 NA 0 0
1/20/1989 1 20 1989 NA 0 0
1/21/1989 1 21 1989 NA 0 0
1/22/1989 1 22 1989 NA 0 0
1/23/1989 1 23 1989 NA 0 0
1/24/1989 1 24 1989 NA 0 0
1/25/1989 1 25 1989 NA 0 0
1/26/1989 1 26 1989 NA 0.15 0
1/27/1989 1 27 1989 NA 0 0
1/28/1989 1 28 1989 NA 0 0
1/29/1989 1 29 1989 NA 0 0
1/30/1989 1 30 1989 NA 0 0
1/31/1989 1 31 1989 NA 0 0
2/1/1989 2 1 1989 NA 0 0
2/2/1989 2 2 1989 NA 0 0
2/3/1989 2 3 1989 NA 0.01 0
2/4/1989 2 4 1989 NA 0 0
2/5/1989 2 5 1989 NA 0.28 4
2/6/1989 2 6 1989 NA 0.21 3
2/7/1989 2 7 1989 NA 0 0
2/8/1989 2 8 1989 NA 0 0
2/9/1989 2 9 1989 NA 0 0
2/10/1989 2 10 1989 NA 0 0
2/11/1989 2 11 1989 NA 0 0
2/12/1989 2 12 1989 NA 0 0
2/13/1989 2 13 1989 NA 0.26 1
2/14/1989 2 14 1989 NA 0 0
2/15/1989 2 15 1989 NA 0.04 0
2/16/1989 2 16 1989 NA 0.03 1
2/17/1989 2 17 1989 NA 0 0
2/18/1989 2 18 1989 NA 0 0
2/19/1989 2 19 1989 NA 0 0
2/20/1989 2 20 1989 NA 0 0
2/21/1989 2 21 1989 NA 0.21 2
2/22/1989 2 22 1989 NA 0 0
2/23/1989 2 23 1989 NA 0 0
2/24/1989 2 24 1989 NA 0 0
2/25/1989 2 25 1989 NA 0 0
2/26/1989 2 26 1989 NA 0 0
2/27/1989 2 27 1989 NA 0 0
2/28/1989 2 28 1989 NA 0 0
3/1/1989 3 1 1989 1 0 0
3/2/1989 3 2 1989 NA 0 0
3/3/1989 3 3 1989 NA 0 0
3/4/1989 3 4 1989 NA 0 0
3/5/1989 3 5 1989 NA 0.34 0
3/6/1989 3 6 1989 NA 0 0
3/7/1989 3 7 1989 NA 0 0
3/8/1989 3 8 1989 NA 0 0
3/9/1989 3 9 1989 NA 0 0
3/10/1989 3 10 1989 NA 0 0
3/11/1989 3 11 1989 NA 0 0
3/12/1989 3 12 1989 NA 0 0
3/13/1989 3 13 1989 NA 0 0
3/14/1989 3 14 1989 NA 0 0
3/15/1989 3 15 1989 NA 0 0
3/16/1989 3 16 1989 NA 0 0
3/17/1989 3 17 1989 NA 0 0
3/18/1989 3 18 1989 NA 0.02 0
3/19/1989 3 19 1989 NA 0 0
3/20/1989 3 20 1989 NA 0 0
3/21/1989 3 21 1989 NA 0 0
3/22/1989 3 22 1989 NA 0 0
3/23/1989 3 23 1989 NA 0 0
3/24/1989 3 24 1989 NA 0 0
3/25/1989 3 25 1989 NA 0 0
3/26/1989 3 26 1989 NA 0 0
3/27/1989 3 27 1989 NA 0 0
3/28/1989 3 28 1989 NA 0.02 0
3/29/1989 3 29 1989 NA 0.81 0
3/30/1989 3 30 1989 NA 0 0
3/31/1989 3 31 1989 NA 0 0
4/1/1989 4 1 1989 NA 0 0
4/2/1989 4 2 1989 NA 0.05 0
4/3/1989 4 3 1989 NA 0.81 0
4/4/1989 4 4 1989 NA 0.49 0
4/5/1989 4 5 1989 NA 0 0
4/6/1989 4 6 1989 NA 0 0
4/7/1989 4 7 1989 NA 0 0
4/8/1989 4 8 1989 NA 0 0
4/9/1989 4 9 1989 NA 0.26 0
4/10/1989 4 10 1989 NA 0 0
4/11/1989 4 11 1989 NA 0 0
4/12/1989 4 12 1989 NA 0 0
4/13/1989 4 13 1989 NA 0 0
4/14/1989 4 14 1989 NA 0 0
4/15/1989 4 15 1989 NA 0 0
4/16/1989 4 16 1989 NA 0 0
4/17/1989 4 17 1989 NA 0.27 0
4/18/1989 4 18 1989 NA 0.04 0
4/19/1989 4 19 1989 NA 0 0
4/20/1989 4 20 1989 NA 0 0
4/21/1989 4 21 1989 NA 0 0
4/22/1989 4 22 1989 NA 0 0
4/23/1989 4 23 1989 NA 0 0
4/24/1989 4 24 1989 NA 0 0
4/25/1989 4 25 1989 NA 0 0
4/26/1989 4 26 1989 NA 0 0
4/27/1989 4 27 1989 NA 0.23 0
4/28/1989 4 28 1989 NA 0.28 0
4/29/1989 4 29 1989 NA 0 0
4/30/1989 4 30 1989 NA 0 0
5/1/1989 5 1 1989 NA 0 0
5/2/1989 5 2 1989 NA 0 0
5/3/1989 5 3 1989 0 0.28 0
5/4/1989 5 4 1989 NA 0 0
5/5/1989 5 5 1989 NA 0.06 0
5/6/1989 5 6 1989 NA 0 0
5/7/1989 5 7 1989 NA 0 0
5/8/1989 5 8 1989 NA 0 0
5/9/1989 5 9 1989 NA 0.42 0
5/10/1989 5 10 1989 NA 0.02 0
5/11/1989 5 11 1989 NA 0 0
5/12/1989 5 12 1989 NA 0 0
5/13/1989 5 13 1989 NA 0 0
5/14/1989 5 14 1989 NA 0 0
5/15/1989 5 15 1989 NA 0 0
5/16/1989 5 16 1989 NA 0 0
5/17/1989 5 17 1989 NA 0 0
5/18/1989 5 18 1989 NA 0 0
5/19/1989 5 19 1989 NA 0.05 0
5/20/1989 5 20 1989 NA 1.17 0
5/21/1989 5 21 1989 NA 0 0
5/22/1989 5 22 1989 NA 0 0
5/23/1989 5 23 1989 NA 0.03 0
5/24/1989 5 24 1989 NA 0 0
5/25/1989 5 25 1989 NA 0.21 0
5/26/1989 5 26 1989 NA 0.37 0
5/27/1989 5 27 1989 NA 0 0
5/28/1989 5 28 1989 NA 0 0
5/29/1989 5 29 1989 NA 0 0
5/30/1989 5 30 1989 NA 1.5 0
5/31/1989 5 31 1989 NA 0.14 0
6/1/1989 6 1 1989 1 0.97 0
6/2/1989 6 2 1989 NA 1.04 0
6/3/1989 6 3 1989 NA 0 0
6/4/1989 6 4 1989 NA 0.25 0
6/5/1989 6 5 1989 NA 0 0
6/6/1989 6 6 1989 NA 0 0
6/7/1989 6 7 1989 NA 0 0
6/8/1989 6 8 1989 NA 0 0
6/9/1989 6 9 1989 NA 0 0
6/10/1989 6 10 1989 NA 0 0
6/11/1989 6 11 1989 NA 0 0
6/12/1989 6 12 1989 NA 0.32 0
6/13/1989 6 13 1989 NA 0.16 0
6/14/1989 6 14 1989 NA 0 0
How I would like my data to look when finished:
date month day year samp precip snow
2/24/1989 2 24 1989 NA 0 0
2/25/1989 2 25 1989 NA 0 0
2/26/1989 2 26 1989 NA 0 0
2/27/1989 2 27 1989 NA 0 0
2/28/1989 2 28 1989 NA 0 0
3/1/1989 3 1 1989 1 0 0
4/28/1989 4 28 1989 NA 0.28 0
4/29/1989 4 29 1989 NA 0 0
4/30/1989 4 30 1989 NA 0 0
5/1/1989 5 1 1989 NA 0 0
5/2/1989 5 2 1989 NA 0 0
5/3/1989 5 3 1989 0 0.28 0
5/27/1989 5 27 1989 NA 0 0
5/28/1989 5 28 1989 NA 0 0
5/29/1989 5 29 1989 NA 0 0
5/30/1989 5 30 1989 NA 1.5 0
5/31/1989 5 31 1989 NA 0.14 0
6/1/1989 6 1 1989 1 0.97 0
Answer 0 (score: 0)
Here is a solution using which(!is.na()) from base R and purrr::map from the tidyverse.
library(tidyverse)
str(dat)
#> Classes 'tbl_df', 'tbl' and 'data.frame': 216 obs. of 7 variables:
#> $ date : chr "11/11/1988" "11/12/1988" "11/13/1988" "11/14/1988" ...
#> $ month : int 11 11 11 11 11 11 11 11 11 11 ...
#> $ day : int 11 12 13 14 15 16 17 18 19 20 ...
#> $ year : int 1988 1988 1988 1988 1988 1988 1988 1988 1988 1988 ...
#> $ samp : int NA NA NA NA NA NA NA NA NA NA ...
#> $ precip: num 0 0 0.55 0 0 0.52 0 0 0 0.39 ...
#> $ snow : int 0 0 0 0 0 0 0 0 0 0 ...
# Extra: convert date from character to date format
dat <- dat %>%
  mutate(date = as.Date(date, "%m/%d/%Y"))
Find the row indices where samp is not NA:
idx <- which(!is.na(dat$samp))
idx
#> [1] 111 174 203
Next, we loop over those row indices and, for each one, extract the 5 rows before it together with the sampling row itself:
idx %>%
map(. , function(x) dat[(x-5):(x), ])
#> [[1]]
#> # A tibble: 6 x 7
#> date month day year samp precip snow
#> <date> <int> <int> <int> <int> <dbl> <int>
#> 1 1989-02-24 2 24 1989 NA 0. 0
#> 2 1989-02-25 2 25 1989 NA 0. 0
#> 3 1989-02-26 2 26 1989 NA 0. 0
#> 4 1989-02-27 2 27 1989 NA 0. 0
#> 5 1989-02-28 2 28 1989 NA 0. 0
#> 6 1989-03-01 3 1 1989 1 0. 0
#>
#> [[2]]
#> # A tibble: 6 x 7
#> date month day year samp precip snow
#> <date> <int> <int> <int> <int> <dbl> <int>
#> 1 1989-04-28 4 28 1989 NA 0.280 0
#> 2 1989-04-29 4 29 1989 NA 0. 0
#> 3 1989-04-30 4 30 1989 NA 0. 0
#> 4 1989-05-01 5 1 1989 NA 0. 0
#> 5 1989-05-02 5 2 1989 NA 0. 0
#> 6 1989-05-03 5 3 1989 0 0.280 0
#>
#> [[3]]
#> # A tibble: 6 x 7
#> date month day year samp precip snow
#> <date> <int> <int> <int> <int> <dbl> <int>
#> 1 1989-05-27 5 27 1989 NA 0. 0
#> 2 1989-05-28 5 28 1989 NA 0. 0
#> 3 1989-05-29 5 29 1989 NA 0. 0
#> 4 1989-05-30 5 30 1989 NA 1.50 0
#> 5 1989-05-31 5 31 1989 NA 0.140 0
#> 6 1989-06-01 6 1 1989 1 0.970 0
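One edge case worth guarding against: if a sampling row sits within the first 5 rows of the data, (x-5):x contains zero or negative indices, and R refuses to mix negative and positive subscripts. A minimal sketch of a guard follows, assuming it is acceptable to return a shortened window in that case. The helper window_rows and the toy data frame are hypothetical and are not part of the answer above.

```r
# Hypothetical helper: row indices for the n rows before event row i,
# plus row i itself, clipped so the window never starts before row 1.
window_rows <- function(i, n = 5) {
  seq(max(1, i - n), i)
}

# Toy stand-in for `dat` (made-up values): 10 daily rows with
# sampling events on rows 3 and 9.
toy <- data.frame(
  day  = 1:10,
  samp = c(NA, NA, 1, NA, NA, NA, NA, NA, 0, NA)
)

idx <- which(!is.na(toy$samp))
windows <- lapply(idx, function(i) toy[window_rows(i), ])

nrow(windows[[1]])  # 3: the window is cut short at the first row
nrow(windows[[2]])  # 6: the full 5 days plus the sampling day
```

Whether a shortened window should be kept or dropped depends on the analysis; dropping events with fewer than 5 prior rows may be the safer choice when comparing sums.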
If we want the result in a single data frame:
idx %>%
map_df(. , function(x) dat[(x-5):(x), ])
#> # A tibble: 18 x 7
#> date month day year samp precip snow
#> <date> <int> <int> <int> <int> <dbl> <int>
#> 1 1989-02-24 2 24 1989 NA 0. 0
#> 2 1989-02-25 2 25 1989 NA 0. 0
#> 3 1989-02-26 2 26 1989 NA 0. 0
#> 4 1989-02-27 2 27 1989 NA 0. 0
#> 5 1989-02-28 2 28 1989 NA 0. 0
#> 6 1989-03-01 3 1 1989 1 0. 0
#> 7 1989-04-28 4 28 1989 NA 0.280 0
#> 8 1989-04-29 4 29 1989 NA 0. 0
#> 9 1989-04-30 4 30 1989 NA 0. 0
#> 10 1989-05-01 5 1 1989 NA 0. 0
#> 11 1989-05-02 5 2 1989 NA 0. 0
#> 12 1989-05-03 5 3 1989 0 0.280 0
#> 13 1989-05-27 5 27 1989 NA 0. 0
#> 14 1989-05-28 5 28 1989 NA 0. 0
#> 15 1989-05-29 5 29 1989 NA 0. 0
#> 16 1989-05-30 5 30 1989 NA 1.50 0
#> 17 1989-05-31 5 31 1989 NA 0.140 0
#> 18 1989-06-01 6 1 1989 1 0.970 0
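The stacked table no longer records which sampling event each row belongs to, which the comparison between the 0 and 1 groups needs. The sketch below is one possible way to get there in base R: it builds one summary row per sampling event instead. It uses an excerpt of the question's data (4/27/1989 to 6/2/1989), and the names n_days and event_summary, as well as the decision to exclude the sampling day from the totals, are assumptions made for illustration.

```r
# Excerpt of the question's data, 4/27/1989 to 6/2/1989, one row per day.
precip <- c(0.23, 0.28, 0, 0, 0, 0, 0.28, 0, 0.06, 0, 0, 0, 0.42, 0.02,
            0, 0, 0, 0, 0, 0, 0, 0, 0.05, 1.17, 0, 0, 0.03, 0, 0.21, 0.37,
            0, 0, 0, 1.5, 0.14, 0.97, 1.04)
dat <- data.frame(
  date   = seq(as.Date("1989-04-27"), by = "day", length.out = length(precip)),
  samp   = NA_integer_,
  precip = precip
)
dat$samp[dat$date == as.Date("1989-05-03")] <- 0L  # sample, no detection
dat$samp[dat$date == as.Date("1989-06-01")] <- 1L  # sample with detection

n_days <- 5
idx <- which(!is.na(dat$samp))

# One row per sampling event, summarising precip over the n_days rows
# before it. The sampling day itself is excluded (an assumption), and
# every event is assumed to have at least n_days rows before it.
event_summary <- do.call(rbind, lapply(idx, function(i) {
  prior <- dat$precip[(i - n_days):(i - 1)]
  data.frame(date = dat$date[i], samp = dat$samp[i],
             precip_sum = sum(prior), precip_mean = mean(prior))
}))

event_summary
# precip_sum is 0.28 for the 5/3 sample (samp 0)
# and 1.64 for the 6/1 sample (samp 1)
```

With all 800 or so sampling days summarised this way, a call such as t.test(precip_sum ~ samp, data = event_summary) would compare the two groups; with only the two events shown here there are too few observations for a test.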
A more compact form, using "~" and "." in place of function(x) and x:
idx %>%
map_df(~ dat[(. -5):(.), ])
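The row-index windows above rely on the data having exactly one row per day, as the question states. If some days were missing, 5 rows back would no longer mean 5 days back. A sketch of a date-based window that does not depend on row positions follows. The small data frame reuses precip values from the question but deliberately leaves out 5/30/1989 to show the difference; n_days, event_dates and windows are illustrative names.

```r
# Values taken from the question's data, with 1989-05-30 deliberately
# left out to simulate a gap in the daily record.
dat <- data.frame(
  date   = as.Date(c("1989-05-26", "1989-05-27", "1989-05-28",
                     "1989-05-29", "1989-05-31", "1989-06-01")),
  samp   = c(NA, NA, NA, NA, NA, 1L),
  precip = c(0.37, 0, 0, 0, 0.14, 0.97)
)

n_days <- 5
event_dates <- dat$date[!is.na(dat$samp)]

# For each sampling date d, keep the rows dated from d - n_days through d.
windows <- lapply(seq_along(event_dates), function(k) {
  d <- event_dates[k]
  dat[dat$date >= d - n_days & dat$date <= d, ]
})

windows[[1]]$date
# 5 rows, starting at 1989-05-27; a row-index window of (6-5):6 would
# also have pulled in 1989-05-26, which is 6 days before the sample.
```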
Created on 2018-03-12 by the reprex package (v0.2.0).