I have this data frame, which covers data from 1960 to 1980.
dput output:
structure(list(DATE = 19620101:19620106, PRECIP = c(10.54, 6.39,
0.01, 0, 0.02, 20.94), OBS_Q = c(2.39, 2.38, 2.22, 2.24, 2.26,
5.13)), .Names = c("DATE", "PRECIP", "OBS_Q"), row.names = c(NA,
6L), class = "data.frame")
What I want to do is:
Expected output (for example, n = 2 and date = 19620103):
19620101 10.54 2.39
19620102 6.39 2.38
19630101 11.54 3.39
19630102 62.39 3.38
19640101 12.54 4.39
19640102 6.39 5.38
*
*
19800101 12.12 3.44
19800102 12.33 3.45
I don't know how to proceed with this. Any suggestions are welcome.
Answer 0: (score: 1)
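So here is a not-so-elegant way. The idea is to compare only the month and the day, ignoring the year of the input date. If needed, any condition on the years can easily be added. To start: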
library(dplyr) # data manipulation
library(lubridate) # time and dates manipulation
df <- data.frame(DATE = c(19620101:19620106, 19630101:19630106),
                 PRECIP = c(10.54, 6.39, 0.01, 0, 0.02, 20.94, 10.54, 6.39, 0.01, 0, 0.02, 20.94),
                 OBS_Q = c(2.39, 2.38, 2.22, 2.24, 2.26, 5.13, 2.39, 2.38, 2.22, 2.24, 2.26, 5.13))
# Here you actually specify what days to select. Only the "0106" part matters here
day_in_a_year <- paste0("1962", "0106")
days_shown <- 2 # how many days per year to show
# so, in this case, select 6th January and the day before
df %>% mutate(DATE = ymd(DATE)) %>%
  arrange(DATE) %>%
  filter(between(day(DATE), day(ymd(day_in_a_year) - days(days_shown - 1)), day(ymd(day_in_a_year))),
         between(month(DATE), month(ymd(day_in_a_year) - days(days_shown - 1)), month(ymd(day_in_a_year))))
#         DATE PRECIP OBS_Q
# 1 1962-01-05   0.02  2.26
# 2 1962-01-06  20.94  5.13
# 3 1963-01-05   0.02  2.26
# 4 1963-01-06  20.94  5.13
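One caveat with the filter above: the day and the month are tested separately, so it relies on the whole window falling inside a single month. A window such as 31 December to 1 January would give `between(day, 31, 1)`, which matches nothing. A minimal sketch of one way around this, in base R only, is to compare "MMDD" keys instead. The helper name `select_window` and the small test data frame are hypothetical and not part of the original answer; it assumes `DATE` is stored as a YYYYMMDD number, as in the question.

```r
# Hypothetical helper (not from the original answer): for every year, keep the
# `n` calendar days that end on the month/day of `ref_date`.
select_window <- function(df, ref_date, n) {
  dates <- as.Date(as.character(df$DATE), format = "%Y%m%d")
  ref   <- as.Date(as.character(ref_date), format = "%Y%m%d")
  # "MMDD" keys of the n days ending on the reference date (inclusive)
  keys  <- format(ref - (n - 1):0, "%m%d")
  df[format(dates, "%m%d") %in% keys, , drop = FALSE]
}

# Made-up data spanning a year boundary, for illustration only
df_demo <- data.frame(
  DATE   = c(19621230:19621231, 19630101:19630102),
  PRECIP = c(1, 2, 3, 4)
)

# Window of 2 days ending on 1 January: keeps 31 December and 1 January
res <- select_window(df_demo, 19630101, 2)
print(res)
```

Note that the keys are derived from the calendar of the reference year, so a window that spans the end of February will not pick up 29 February in leap years unless the reference year is itself a leap year.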
Edit:
Since you want to select only the years up to and including the year of the input date, you can use the following:
df %>% mutate(DATE = ymd(DATE)) %>%
  arrange(DATE) %>%
  filter(between(day(DATE), day(ymd(day_in_a_year) - days(days_shown - 1)), day(ymd(day_in_a_year))),
         between(month(DATE), month(ymd(day_in_a_year) - days(days_shown - 1)), month(ymd(day_in_a_year))),
         year(DATE) <= year(ymd(day_in_a_year)))
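The direction of the year condition is easy to flip. The expected output in the question runs from 1962 through 1980, which may mean that the years from the input date onward are wanted rather than the years up to it; that reading is an assumption. A minimal base R sketch of both variants, using made-up data and hypothetical variable names, and assuming `DATE` is a YYYYMMDD number so that the year is `DATE %/% 10000`:

```r
# Made-up data: the same month/day in three consecutive years
df_years <- data.frame(
  DATE   = c(19610105L, 19620105L, 19630105L),
  PRECIP = c(1, 2, 3)
)
ref_date <- 19620106L

# Integer division strips the MMDD part and leaves the year
yr     <- df_years$DATE %/% 10000L
ref_yr <- ref_date %/% 10000L

# Years up to and including the reference year (as in the edit above)
up_to <- df_years[yr <= ref_yr, , drop = FALSE]

# Years from the reference year onward (assumed reading of the expected output)
from <- df_years[yr >= ref_yr, , drop = FALSE]

print(up_to)
print(from)
```

In the dplyr pipeline, the equivalent change would be swapping `<=` for `>=` in the `year(DATE)` condition.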