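I have a list of data frames containing daily streamflow data.
For each data frame in the list (each one holds the data for one station), I want to estimate the maximum daily flow between June and November of every year.
This is what the list of data frames looks like:
This is the code I am using: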
#Peak mean daily flow summer and fall (June to November)
PeakflowSummerFall <- lapply(listDF, function(x){
  x %>% group_by(x %>% mutate(year = year(Date))) %>%
    filter((x %>% mutate(month = month(Date)) >= 6) & (x %>% mutate(month = month(Date)) <= 11)) %>%
    summarise(max = max(DailyStreamflow, na.rm = TRUE))
})
But I get this error:
<error/dplyr_error>
Problem with `filter()` input `..1`.
x Input `..1` must be of size 1, not size 24601.
i Input `..1` is `&...`.
i The error occurred in group 1: Date = 1953-06-01, DailyStreamflow = 32, year = 1953.
Backtrace:
Run `rlang::last_trace()` to see the full context
Is there a way to solve this problem?
Answer 0 (score: 0)
#### This should provide you with enough
#### sample data for answerers to work with
install.packages('purrr')
library(purrr)
sample_dat <- listDF %>%
head %>%
map( ~ head(.x))
dput(sample_dat)
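#### With that being said...
#### You should flatten the list into one data frame...
#### It's easier to work with...
install.packages('lubridate')
library(lubridate)
library(dplyr)
listDF %>%
  # .id records which list element (station) each row came from
  bind_rows(.id = 'station') %>%
  # keep June (6) through November (11)
  filter(month(Date) > 5, month(Date) < 12) %>%
  # one group per station and calendar year
  group_by(station, year = year(Date)) %>%
  summarise(max_flow = max(DailyStreamflow, na.rm = TRUE), .groups = 'drop') %>%
  # back to one data frame per station
  split(.$station)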
Answer 1 (score: 0)
Given the posted picture of the data structure, the following approach might work.
library(lubridate)
library(dplyr)
listDF %>%
purrr::map(function(x){
x %>%
filter(month(Date) >= 6 & month(Date) <= 11) %>%
group_by(year(Date)) %>%
summarise(Max = max(DailyStreamflow, na.rm = TRUE), .groups = "keep")
})
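As a note on the error in the question: the condition given to `filter()` there appears to be built from `x %>% mutate(...)`, that is, a whole data frame compared against a number, while `filter()` expects a logical vector with one value per row of the current group. The sketch below shows the expected shape on a small toy data frame. The name `toy` and its values are made up for illustration and are not the asker's data.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy station (illustration only, not the asker's data)
toy <- data.frame(
  Date = as.Date(c("1953-05-31", "1953-06-01", "1953-11-30",
                   "1953-12-01", "1954-07-15")),
  DailyStreamflow = c(99, 32, 27, 80, 12)
)

# A per-row logical vector: TRUE for dates in June..November
in_season <- month(toy$Date) >= 6 & month(toy$Date) <= 11

# Same condition written inside filter(), where columns are referenced directly
peak <- toy %>%
  filter(month(Date) >= 6, month(Date) <= 11) %>%
  group_by(Year = year(Date)) %>%
  summarise(Max = max(DailyStreamflow, na.rm = TRUE), .groups = "drop")

print(in_season)
print(peak)
```

Naming the grouping column (`Year = year(Date)`) is optional; it only makes the result column easier to refer to than the default `year(Date)`.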
Code to create the test data.
fun <- function(year, n){
d1 <- as.Date(paste(year, 1, 1, sep = "-"))
d2 <- as.Date(paste(year + 10, 12, 31, sep = "-"))
d <- seq(d1, d2, by = "day")
d <- sort(rep(sample(d, n, TRUE), length.out = n))
flow <- sample(10*n, n, TRUE)
data.frame(Date = d, DailyStreamflow = flow)
}
set.seed(2020)
listDF <- lapply(1:3, function(i) fun(c(1953, 1965, 1980)[i], c(24601, 13270, 17761)[i]))
str(listDF)
rm(fun)
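To sanity-check results from either answer, a base R version with no package dependencies can serve as a cross-check. This is only a sketch: the helper `peak_jun_nov()` and the list `toy_list` are hypothetical names introduced here, and the toy values are chosen so the expected maxima can be read off by eye.

```r
# Hypothetical toy data for two stations (illustration only)
toy_list <- list(
  station_a = data.frame(
    Date = as.Date(c("1953-05-31", "1953-06-01", "1953-11-30",
                     "1953-12-01", "1954-07-15")),
    DailyStreamflow = c(99, 32, 27, 80, 12)
  ),
  station_b = data.frame(
    Date = as.Date(c("1965-06-10", "1965-08-20", "1966-01-05")),
    DailyStreamflow = c(5, NA, 70)
  )
)

# Hypothetical helper: June-November maximum per calendar year for one station
peak_jun_nov <- function(df) {
  mon  <- as.integer(format(df$Date, "%m"))      # month number 1..12
  keep <- df[mon >= 6 & mon <= 11, ]             # June through November only
  yr   <- as.integer(format(keep$Date, "%Y"))    # calendar year of each kept row
  res  <- tapply(keep$DailyStreamflow, yr, max, na.rm = TRUE)
  data.frame(Year = as.integer(names(res)), Max = as.vector(res))
}

# Apply the helper to every station; list names are carried over
peaks <- lapply(toy_list, peak_jun_nov)
print(peaks)
```

The same call, `lapply(listDF, peak_jun_nov)`, should work on the test data above, since it only assumes the columns `Date` and `DailyStreamflow`. A year whose June to November values are all `NA` would yield `-Inf` with a warning, as with `max(..., na.rm = TRUE)` in the dplyr versions.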