In R

Time: 2018-03-12 18:44:34

Tags: r

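I have been learning R for a few days and am trying to produce a specific output, but I can't figure out how to select the days before an event.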

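I am trying to determine how recharge events affect the detection of contaminants in water samples. My data has 7 columns (date, month, day, year, samp, precip, snow), where date is the calendar date (written as M/D/YYYY in the sample below), month, day and year are what they say, samp is 0, 1, or NA, and precip and snow hold the daily rain and snow totals. There is one row per day.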

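I want to explore rain events over different numbers of days before a sampling event. I want to select the rows with a 0 (sample with no detection) or a 1 (sample with a detection), then select the preceding days (5 in this example), and compare the mean, sum, etc. of those preceding rows between the 0 and 1 groups.

I have found ways to select rows, and to count consecutive days that have a 0 or 1 (How to subset consecutive rows if they meet a condition), but I can't figure out how to select a number of days before each sampling event and build a new table from them.

I have over 800 sampling days spread over more than 20 years, so I don't want to type in each date (Subset dataframe where date is within x days of a vector of dates in R).

I have tried the %>% pipe and a few other selection approaches, and I can select the rows that contain a 0 or 1, but I don't understand how to get the days before a sampling day. I'm looking for any direction, suggestions, or functions/tools/packages, because I have run out of new avenues to explore.

I want to subset my data so that I can eventually run some simple statistics. This is an exploratory project for me: learning R and practicing statistics. I want to run t-tests and ANOVA on different numbers of days before sampling. I am selecting the sample days so that I can eventually ask questions such as "How does rainfall in the 5 days before a sampling event affect the detection result?" or "Does the mean rainfall in the 5 days before a positive detection differ from the mean in the 5 days before a negative detection?". Hopefully this background on my goal helps explain what I am looking for.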

What my data looks like now:

date month day year samp precip snow
11/11/1988 11 11 1988 NA 0 0
11/12/1988 11 12 1988 NA 0 0
11/13/1988 11 13 1988 NA 0.55 0
11/14/1988 11 14 1988 NA 0 0
11/15/1988 11 15 1988 NA 0 0
11/16/1988 11 16 1988 NA 0.52 0
11/17/1988 11 17 1988 NA 0 0
11/18/1988 11 18 1988 NA 0 0
11/19/1988 11 19 1988 NA 0 0
11/20/1988 11 20 1988 NA 0.39 0
11/21/1988 11 21 1988 NA 0.43 0
11/22/1988 11 22 1988 NA 0 0
11/23/1988 11 23 1988 NA 0 0
11/24/1988 11 24 1988 NA 0 0
11/25/1988 11 25 1988 NA 0 0
11/26/1988 11 26 1988 NA 0.11 0
11/27/1988 11 27 1988 NA 0.08 0
11/28/1988 11 28 1988 NA 0.01 0
11/29/1988 11 29 1988 NA 0 0
11/30/1988 11 30 1988 NA 0 0
12/1/1988 12 1 1988 NA 0 0
12/2/1988 12 2 1988 NA 0 0
12/3/1988 12 3 1988 NA 0 0
12/4/1988 12 4 1988 NA 0 0
12/5/1988 12 5 1988 NA 0 0
12/6/1988 12 6 1988 NA 0 0
12/7/1988 12 7 1988 NA 0 0
12/8/1988 12 8 1988 NA 0 0
12/9/1988 12 9 1988 NA 0 0
12/10/1988 12 10 1988 NA 0 0
12/11/1988 12 11 1988 NA 0 0
12/12/1988 12 12 1988 NA 0 0
12/13/1988 12 13 1988 NA 0.03 1
12/14/1988 12 14 1988 NA 0 0
12/15/1988 12 15 1988 NA 0 0
12/16/1988 12 16 1988 NA 0 0
12/17/1988 12 17 1988 NA 0 2
12/18/1988 12 18 1988 NA 0 0
12/19/1988 12 19 1988 NA 0 0
12/20/1988 12 20 1988 NA 0.07 0
12/21/1988 12 21 1988 NA 0.02 0
12/22/1988 12 22 1988 NA 0 0
12/23/1988 12 23 1988 NA 1.3 0
12/24/1988 12 24 1988 NA 0 0
12/25/1988 12 25 1988 NA 0 0
12/26/1988 12 26 1988 NA 0 0
12/27/1988 12 27 1988 NA 0.85 3
12/28/1988 12 28 1988 NA 0.37 3
12/29/1988 12 29 1988 NA 0 0
12/30/1988 12 30 1988 NA 0 0
12/31/1988 12 31 1988 NA 0 0
1/1/1989 1 1 1989 NA 0 0
1/2/1989 1 2 1989 NA 0 0
1/3/1989 1 3 1989 NA 0 0
1/4/1989 1 4 1989 NA 0 0
1/5/1989 1 5 1989 NA 0 0
1/6/1989 1 6 1989 NA 0.54 0
1/7/1989 1 7 1989 NA 0 0
1/8/1989 1 8 1989 NA 0.08 0
1/9/1989 1 9 1989 NA 0 0
1/10/1989 1 10 1989 NA 0 0
1/11/1989 1 11 1989 NA 0 0
1/12/1989 1 12 1989 NA 0 0
1/13/1989 1 13 1989 NA 0 0
1/14/1989 1 14 1989 NA 0 0
1/15/1989 1 15 1989 NA 0.04 1
1/16/1989 1 16 1989 NA 0 0
1/17/1989 1 17 1989 NA 0 0
1/18/1989 1 18 1989 NA 0 0
1/19/1989 1 19 1989 NA 0 0
1/20/1989 1 20 1989 NA 0 0
1/21/1989 1 21 1989 NA 0 0
1/22/1989 1 22 1989 NA 0 0
1/23/1989 1 23 1989 NA 0 0
1/24/1989 1 24 1989 NA 0 0
1/25/1989 1 25 1989 NA 0 0
1/26/1989 1 26 1989 NA 0.15 0
1/27/1989 1 27 1989 NA 0 0
1/28/1989 1 28 1989 NA 0 0
1/29/1989 1 29 1989 NA 0 0
1/30/1989 1 30 1989 NA 0 0
1/31/1989 1 31 1989 NA 0 0
2/1/1989 2 1 1989 NA 0 0
2/2/1989 2 2 1989 NA 0 0
2/3/1989 2 3 1989 NA 0.01 0
2/4/1989 2 4 1989 NA 0 0
2/5/1989 2 5 1989 NA 0.28 4
2/6/1989 2 6 1989 NA 0.21 3
2/7/1989 2 7 1989 NA 0 0
2/8/1989 2 8 1989 NA 0 0
2/9/1989 2 9 1989 NA 0 0
2/10/1989 2 10 1989 NA 0 0
2/11/1989 2 11 1989 NA 0 0
2/12/1989 2 12 1989 NA 0 0
2/13/1989 2 13 1989 NA 0.26 1
2/14/1989 2 14 1989 NA 0 0
2/15/1989 2 15 1989 NA 0.04 0
2/16/1989 2 16 1989 NA 0.03 1
2/17/1989 2 17 1989 NA 0 0
2/18/1989 2 18 1989 NA 0 0
2/19/1989 2 19 1989 NA 0 0
2/20/1989 2 20 1989 NA 0 0
2/21/1989 2 21 1989 NA 0.21 2
2/22/1989 2 22 1989 NA 0 0
2/23/1989 2 23 1989 NA 0 0
2/24/1989 2 24 1989 NA 0 0
2/25/1989 2 25 1989 NA 0 0
2/26/1989 2 26 1989 NA 0 0
2/27/1989 2 27 1989 NA 0 0
2/28/1989 2 28 1989 NA 0 0
3/1/1989 3 1 1989 1 0 0
3/2/1989 3 2 1989 NA 0 0
3/3/1989 3 3 1989 NA 0 0
3/4/1989 3 4 1989 NA 0 0
3/5/1989 3 5 1989 NA 0.34 0
3/6/1989 3 6 1989 NA 0 0
3/7/1989 3 7 1989 NA 0 0
3/8/1989 3 8 1989 NA 0 0
3/9/1989 3 9 1989 NA 0 0
3/10/1989 3 10 1989 NA 0 0
3/11/1989 3 11 1989 NA 0 0
3/12/1989 3 12 1989 NA 0 0
3/13/1989 3 13 1989 NA 0 0
3/14/1989 3 14 1989 NA 0 0
3/15/1989 3 15 1989 NA 0 0
3/16/1989 3 16 1989 NA 0 0
3/17/1989 3 17 1989 NA 0 0
3/18/1989 3 18 1989 NA 0.02 0
3/19/1989 3 19 1989 NA 0 0
3/20/1989 3 20 1989 NA 0 0
3/21/1989 3 21 1989 NA 0 0
3/22/1989 3 22 1989 NA 0 0
3/23/1989 3 23 1989 NA 0 0
3/24/1989 3 24 1989 NA 0 0
3/25/1989 3 25 1989 NA 0 0
3/26/1989 3 26 1989 NA 0 0
3/27/1989 3 27 1989 NA 0 0
3/28/1989 3 28 1989 NA 0.02 0
3/29/1989 3 29 1989 NA 0.81 0
3/30/1989 3 30 1989 NA 0 0
3/31/1989 3 31 1989 NA 0 0
4/1/1989 4 1 1989 NA 0 0
4/2/1989 4 2 1989 NA 0.05 0
4/3/1989 4 3 1989 NA 0.81 0
4/4/1989 4 4 1989 NA 0.49 0
4/5/1989 4 5 1989 NA 0 0
4/6/1989 4 6 1989 NA 0 0
4/7/1989 4 7 1989 NA 0 0
4/8/1989 4 8 1989 NA 0 0
4/9/1989 4 9 1989 NA 0.26 0
4/10/1989 4 10 1989 NA 0 0
4/11/1989 4 11 1989 NA 0 0
4/12/1989 4 12 1989 NA 0 0
4/13/1989 4 13 1989 NA 0 0
4/14/1989 4 14 1989 NA 0 0
4/15/1989 4 15 1989 NA 0 0
4/16/1989 4 16 1989 NA 0 0
4/17/1989 4 17 1989 NA 0.27 0
4/18/1989 4 18 1989 NA 0.04 0
4/19/1989 4 19 1989 NA 0 0
4/20/1989 4 20 1989 NA 0 0
4/21/1989 4 21 1989 NA 0 0
4/22/1989 4 22 1989 NA 0 0
4/23/1989 4 23 1989 NA 0 0
4/24/1989 4 24 1989 NA 0 0
4/25/1989 4 25 1989 NA 0 0
4/26/1989 4 26 1989 NA 0 0
4/27/1989 4 27 1989 NA 0.23 0
4/28/1989 4 28 1989 NA 0.28 0
4/29/1989 4 29 1989 NA 0 0
4/30/1989 4 30 1989 NA 0 0
5/1/1989 5 1 1989 NA 0 0
5/2/1989 5 2 1989 NA 0 0
5/3/1989 5 3 1989 0 0.28 0
5/4/1989 5 4 1989 NA 0 0
5/5/1989 5 5 1989 NA 0.06 0
5/6/1989 5 6 1989 NA 0 0
5/7/1989 5 7 1989 NA 0 0
5/8/1989 5 8 1989 NA 0 0
5/9/1989 5 9 1989 NA 0.42 0
5/10/1989 5 10 1989 NA 0.02 0
5/11/1989 5 11 1989 NA 0 0
5/12/1989 5 12 1989 NA 0 0
5/13/1989 5 13 1989 NA 0 0
5/14/1989 5 14 1989 NA 0 0
5/15/1989 5 15 1989 NA 0 0
5/16/1989 5 16 1989 NA 0 0
5/17/1989 5 17 1989 NA 0 0
5/18/1989 5 18 1989 NA 0 0
5/19/1989 5 19 1989 NA 0.05 0
5/20/1989 5 20 1989 NA 1.17 0
5/21/1989 5 21 1989 NA 0 0
5/22/1989 5 22 1989 NA 0 0
5/23/1989 5 23 1989 NA 0.03 0
5/24/1989 5 24 1989 NA 0 0
5/25/1989 5 25 1989 NA 0.21 0
5/26/1989 5 26 1989 NA 0.37 0
5/27/1989 5 27 1989 NA 0 0
5/28/1989 5 28 1989 NA 0 0
5/29/1989 5 29 1989 NA 0 0
5/30/1989 5 30 1989 NA 1.5 0
5/31/1989 5 31 1989 NA 0.14 0
6/1/1989 6 1 1989 1 0.97 0
6/2/1989 6 2 1989 NA 1.04 0
6/3/1989 6 3 1989 NA 0 0
6/4/1989 6 4 1989 NA 0.25 0
6/5/1989 6 5 1989 NA 0 0
6/6/1989 6 6 1989 NA 0 0
6/7/1989 6 7 1989 NA 0 0
6/8/1989 6 8 1989 NA 0 0
6/9/1989 6 9 1989 NA 0 0
6/10/1989 6 10 1989 NA 0 0
6/11/1989 6 11 1989 NA 0 0
6/12/1989 6 12 1989 NA 0.32 0
6/13/1989 6 13 1989 NA 0.16 0
6/14/1989 6 14 1989 NA 0 0

What I would like my data to look like when finished:

date month day year samp precip snow
2/24/1989 2 24 1989 NA 0 0
2/25/1989 2 25 1989 NA 0 0
2/26/1989 2 26 1989 NA 0 0
2/27/1989 2 27 1989 NA 0 0
2/28/1989 2 28 1989 NA 0 0
3/1/1989 3 1 1989 1 0 0
4/28/1989 4 28 1989 NA 0.28 0
4/29/1989 4 29 1989 NA 0 0
4/30/1989 4 30 1989 NA 0 0
5/1/1989 5 1 1989 NA 0 0
5/2/1989 5 2 1989 NA 0 0
5/3/1989 5 3 1989 0 0.28 0
5/27/1989 5 27 1989 NA 0 0
5/28/1989 5 28 1989 NA 0 0
5/29/1989 5 29 1989 NA 0 0
5/30/1989 5 30 1989 NA 1.5 0
5/31/1989 5 31 1989 NA 0.14 0
6/1/1989 6 1 1989 1 0.97 0

1 Answer:

Answer 0: (score: 0)

Here is a solution using which(!is.na()) and purrr::map. The tidyverse and purrr documentation are good sources for learning more about both.

library(tidyverse)

str(dat)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    216 obs. of  7 variables:
#>  $ date  : chr  "11/11/1988" "11/12/1988" "11/13/1988" "11/14/1988" ...
#>  $ month : int  11 11 11 11 11 11 11 11 11 11 ...
#>  $ day   : int  11 12 13 14 15 16 17 18 19 20 ...
#>  $ year  : int  1988 1988 1988 1988 1988 1988 1988 1988 1988 1988 ...
#>  $ samp  : int  NA NA NA NA NA NA NA NA NA NA ...
#>  $ precip: num  0 0 0.55 0 0 0.52 0 0 0 0.39 ...
#>  $ snow  : int  0 0 0 0 0 0 0 0 0 0 ...

# Extra: convert date from character to date format
dat <- dat %>% 
  mutate(date = as.Date(date, "%m/%d/%Y"))
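
The row-offset approach used in the rest of this answer treats "5 rows back" as "5 days back", which holds only if the data really has one row per calendar day, as the question states. A minimal sketch of how that could be checked once date is a Date column; the data frame toy and its values are made up for illustration and are not part of the original data:

```r
# Toy data (made up): six rows with one missing day (1989-02-27)
toy <- data.frame(
  date   = as.Date(c("1989-02-24", "1989-02-25", "1989-02-26",
                     "1989-02-28", "1989-03-01", "1989-03-02")),
  precip = c(0, 0, 0.1, 0, 0, 0.3)
)

# Day-to-day differences; every value is 1 for a gap-free daily series
gaps <- as.integer(diff(toy$date))

# TRUE only if there is exactly one row per calendar day
is_daily <- all(gaps == 1)

# Rows that come directly after a gap
after_gap <- toy[which(gaps != 1) + 1, ]
```

If is_daily is FALSE, selecting by row position would silently cover more than 5 calendar days for some samples, so the rows in after_gap are worth inspecting first.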

Find the positions of the rows where samp is not NA

idx <- which(!is.na(dat$samp))
idx
#> [1] 111 174 203

Next, we loop over these row indices and extract each sampled row together with the 5 rows (days) before it

idx %>% 
  map(. , function(x) dat[(x-5):(x), ])

#> [[1]]
#> # A tibble: 6 x 7
#>   date       month   day  year  samp precip  snow
#>   <date>     <int> <int> <int> <int>  <dbl> <int>
#> 1 1989-02-24     2    24  1989    NA     0.     0
#> 2 1989-02-25     2    25  1989    NA     0.     0
#> 3 1989-02-26     2    26  1989    NA     0.     0
#> 4 1989-02-27     2    27  1989    NA     0.     0
#> 5 1989-02-28     2    28  1989    NA     0.     0
#> 6 1989-03-01     3     1  1989     1     0.     0
#> 
#> [[2]]
#> # A tibble: 6 x 7
#>   date       month   day  year  samp precip  snow
#>   <date>     <int> <int> <int> <int>  <dbl> <int>
#> 1 1989-04-28     4    28  1989    NA  0.280     0
#> 2 1989-04-29     4    29  1989    NA  0.        0
#> 3 1989-04-30     4    30  1989    NA  0.        0
#> 4 1989-05-01     5     1  1989    NA  0.        0
#> 5 1989-05-02     5     2  1989    NA  0.        0
#> 6 1989-05-03     5     3  1989     0  0.280     0
#> 
#> [[3]]
#> # A tibble: 6 x 7
#>   date       month   day  year  samp precip  snow
#>   <date>     <int> <int> <int> <int>  <dbl> <int>
#> 1 1989-05-27     5    27  1989    NA  0.        0
#> 2 1989-05-28     5    28  1989    NA  0.        0
#> 3 1989-05-29     5    29  1989    NA  0.        0
#> 4 1989-05-30     5    30  1989    NA  1.50      0
#> 5 1989-05-31     5    31  1989    NA  0.140     0
#> 6 1989-06-01     6     1  1989     1  0.970     0
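
One edge case: if a sample falls within the first 5 rows, (x-5) is zero or negative, and the subset either errors or returns fewer rows than expected. A minimal sketch of one way to guard against that, written in base R with lapply() as a stand-in for map(); toy, n_before, and the values are hypothetical:

```r
# Toy data (made up): 8 daily rows, samples on rows 3 and 8
toy <- data.frame(
  day    = 1:8,
  samp   = c(NA, NA, 1, NA, NA, NA, NA, 0),
  precip = c(0.2, 0, 0, 0.5, 0, 0, 0.1, 0)
)

idx <- which(!is.na(toy$samp))   # row positions of the samples
n_before <- 5                    # days to keep before each sample

# Clamp the window start at row 1 so early samples cannot produce
# zero or negative row indices
windows <- lapply(idx, function(x) toy[max(1, x - n_before):x, ])

# Optionally keep only the windows that have the full n_before + 1 rows
complete <- Filter(function(w) nrow(w) == n_before + 1, windows)
```

Whether to keep the shortened windows or drop them depends on the analysis; for comparing 5-day totals, dropping incomplete windows is probably the safer choice.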

If we want the result as a single data frame

idx %>% 
  map_df(. , function(x) dat[(x-5):(x), ])

#> # A tibble: 18 x 7
#>    date       month   day  year  samp precip  snow
#>    <date>     <int> <int> <int> <int>  <dbl> <int>
#>  1 1989-02-24     2    24  1989    NA  0.        0
#>  2 1989-02-25     2    25  1989    NA  0.        0
#>  3 1989-02-26     2    26  1989    NA  0.        0
#>  4 1989-02-27     2    27  1989    NA  0.        0
#>  5 1989-02-28     2    28  1989    NA  0.        0
#>  6 1989-03-01     3     1  1989     1  0.        0
#>  7 1989-04-28     4    28  1989    NA  0.280     0
#>  8 1989-04-29     4    29  1989    NA  0.        0
#>  9 1989-04-30     4    30  1989    NA  0.        0
#> 10 1989-05-01     5     1  1989    NA  0.        0
#> 11 1989-05-02     5     2  1989    NA  0.        0
#> 12 1989-05-03     5     3  1989     0  0.280     0
#> 13 1989-05-27     5    27  1989    NA  0.        0
#> 14 1989-05-28     5    28  1989    NA  0.        0
#> 15 1989-05-29     5    29  1989    NA  0.        0
#> 16 1989-05-30     5    30  1989    NA  1.50      0
#> 17 1989-05-31     5    31  1989    NA  0.140     0
#> 18 1989-06-01     6     1  1989     1  0.970     0

A more compact form that uses "~" & "." in place of function(x) & x

idx %>% 
  map_df(~ dat[(. -5):(.), ])

Created on 2018-03-12 by the reprex package (v0.2.0).
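
The combined data frame no longer records which window each row came from, or whether that window ended in a detection, so it cannot be compared between the 0 and 1 groups as is. A minimal sketch of one way to get there in base R: compute one summary row per sampling event, then compare groups. The toy data, the names per_event, precip_sum, and precip_mean, and the choice to exclude the sampling day itself from the window are all assumptions for illustration:

```r
n_before <- 5

# Toy data (made up): 24 daily rows, samples on rows 6, 12, 18, 24
precip <- c(0,   0.2, 0,   0,   0.3, 0,
            0,   0,   0,   0.1, 0,   0,
            0.5, 0,   0.4, 0,   0,   0.1,
            0,   0,   0,   0,   0.2, 0)
samp <- rep(NA_real_, 24)
samp[c(6, 12, 18, 24)] <- c(1, 0, 1, 0)
toy <- data.frame(samp = samp, precip = precip)

idx <- which(!is.na(toy$samp))

# One summary row per sampling event, using the n_before rows that
# precede the sampling day (assumes every idx is larger than n_before)
per_event <- do.call(rbind, lapply(idx, function(x) {
  prior <- toy$precip[(x - n_before):(x - 1)]
  data.frame(row = x, samp = toy$samp[x],
             precip_sum = sum(prior), precip_mean = mean(prior))
}))

# Mean 5-day rainfall total for non-detections (0) and detections (1)
group_means <- aggregate(precip_sum ~ samp, data = per_event, FUN = mean)

# Welch two-sample t-test on the per-event totals
tt <- t.test(precip_sum ~ samp, data = per_event)
```

With only two events per group the test on the toy data means nothing statistically; the point is the shape of per_event, one row per sample with its outcome and window summary, which is what t.test() or aov() need. Changing n_before reruns the same comparison for a different number of days.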