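I'm trying to count, for each interval between two dates, the number of precipitation values at or below a threshold (say, less than or equal to 50).
Basically, I have a vector cuts containing the dates I want to use as interval boundaries. I want to use the cuts vector to split the dataset into bins and count the number of events with no more than 50 mm of rain in each bin.
I'm currently using dplyr and a for loop, but nothing works.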
set.seed(12345)
df = data.frame(date = seq(as.Date("2000/03/01"), as.Date("2002/03/01"), "days"),
precipitation = rnorm(length(seq(as.Date("2000/03/01"), as.Date("2002/03/01"), "days")),80,20))
cuts = c("2001-11-25","2002-01-01","2002-02-18","2002-03-01")
for (i in 1:length(cuts)) {
df %>% summarise(count.prec = if (date > cuts[i] | date < cuts[i+1]) {count(precipitation <= 50)})
}
But I get this error message:
Error: no applicable method for 'group_by_' applied to an object of class "logical"
In addition: Warning message:
In if (c(11017, 11018, 11019, 11020, 11021, 11022, 11023, 11024, :
the condition has length > 1 and only the first element will be used
This doesn't work either:
for (i in 1:length(cuts)) {
df %>% if (date > cuts[i] | date < cuts[i+1])%>% summarise(count.prec = count(precipitation <= 50))
}
Answer 0 (score: 4)
You could try:
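library(dplyr)
df %>%
  # assign each date to the bin that starts at the corresponding cut date
  group_by(gr = cut(date, breaks = as.Date(cuts))) %>%
  # count the days in each bin with precipitation <= 50
  summarise(res = sum(precipitation <= 50))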
Which gives:
# A tibble: 4 × 2
          gr   res
      <fctr> <int>
1 2001-11-25     1
2 2002-01-01     4
3 2002-02-18     2
4         NA    40
Or, as @Frank mentioned, you can replace the summarise() call with tally(precipitation <= 50).
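The NA row in the output above collects every date that falls outside the range of cuts (including 2002-03-01 itself, because each bin is open on the right). If that row is not wanted, the same binning can be sketched in base R with the out-of-range dates dropped before counting. This is only a minimal sketch under that assumption; the variable names bin, inside and res are illustrative.

```r
# Rebuild the example data from the question
set.seed(12345)
dates <- seq(as.Date("2000/03/01"), as.Date("2002/03/01"), "days")
df <- data.frame(date = dates,
                 precipitation = rnorm(length(dates), 80, 20))
cuts <- as.Date(c("2001-11-25", "2002-01-01", "2002-02-18", "2002-03-01"))

# cut() assigns each date to the half-open bin [cuts[i], cuts[i + 1]);
# dates outside the range of cuts become NA
bin <- cut(df$date, breaks = cuts)

# Keep only the dates that landed in a bin (assumption: out-of-range
# dates are of no interest), then count days with precipitation <= 50
inside <- !is.na(bin)
res <- tapply(df$precipitation[inside] <= 50, bin[inside], sum)
res
```

The result is a named vector with one count per bin, labelled by the bin's start date.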
Answer 1 (score: 1)
We can try using data.table:
library(data.table)#v1.9.7+
df2 <- data.table(cuts1 = as.Date(cuts[-length(cuts)]), cuts2 = as.Date(cuts[-1]))
setDT(df)[df2, .(Count = sum(precipitation <=50)),
on = .(date > cuts1, date < cuts2), by = .EACHI]
#         date       date Count
#1: 2001-11-25 2002-01-01     1
#2: 2002-01-01 2002-02-18     4
#3: 2002-02-18 2002-03-01     2
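Note that the join condition uses strict inequalities on both sides, so a day that falls exactly on a cut date is not counted in any bin, whereas cut() includes the left boundary of each bin. If the boundary handling matters, it may be easier to spell the comparison out. The following is a rough base R sketch, assuming half-open bins [from, to) are wanted; count_low is a hypothetical helper, not a function from any package.

```r
# Rebuild the example data from the question
set.seed(12345)
dates <- seq(as.Date("2000/03/01"), as.Date("2002/03/01"), "days")
df <- data.frame(date = dates,
                 precipitation = rnorm(length(dates), 80, 20))
cuts <- as.Date(c("2001-11-25", "2002-01-01", "2002-02-18", "2002-03-01"))

# count_low is a hypothetical helper: number of days in [from, to)
# whose precipitation is at or below the threshold
count_low <- function(from, to, threshold = 50) {
  in_bin <- df$date >= from & df$date < to  # vectorised, unlike if ()
  sum(df$precipitation[in_bin] <= threshold)
}

starts <- cuts[-length(cuts)]  # left edge of each bin
ends <- cuts[-1]               # right edge of each bin
counts <- vapply(seq_along(starts),
                 function(i) count_low(starts[i], ends[i]),
                 integer(1))
data.frame(from = starts, to = ends, Count = counts)
```

Changing `>=` to `>` in count_low reproduces the strict boundaries of the join; the comparison uses `&` because both conditions must hold for a date to lie inside a bin.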