Given the following data:
year date wk name type holiday closed_day
2017 2017-11-27 48 NA NA 0 0
2017 2017-12-04 49 NA NA 0 0
2017 2017-12-11 50 NA NA 0 0
2017 2017-12-18 51 NA NA 0 0
2017 2017-12-25 52 Christmas closed 0 1
2017 2017-12-26 52 NA NA 0 0
2017 2017-12-31 52 NewYearsEve holiday 1 0
how can I use dplyr to get:
year date wk holiday closed_day
2017 2017-11-27 48 0 0
2017 2017-12-04 49 0 0
2017 2017-12-11 50 0 0
2017 2017-12-18 51 0 0
2017 2017-12-25 52 1 1
Note that I don't need name or type. For each week I only need to know whether it contains a holiday or a closed_day (not the sum, just a boolean).
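To try the answers on the same input, the sample data can be rebuilt by hand. This is a minimal sketch: the name `df` matches what the answers use, but the column types (integer, Date, character) are assumptions, since the question only shows the printed values.

```r
# Sample data from the question, rebuilt by hand.
# Column types are assumptions; the question shows only printed values.
df <- data.frame(
  year       = rep(2017L, 7),
  date       = as.Date(c("2017-11-27", "2017-12-04", "2017-12-11", "2017-12-18",
                         "2017-12-25", "2017-12-26", "2017-12-31")),
  wk         = c(48L, 49L, 50L, 51L, 52L, 52L, 52L),
  name       = c(NA, NA, NA, NA, "Christmas", NA, "NewYearsEve"),
  type       = c(NA, NA, NA, NA, "closed", NA, "holiday"),
  holiday    = c(0L, 0L, 0L, 0L, 0L, 0L, 1L),
  closed_day = c(0L, 0L, 0L, 0L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Inspect the structure to confirm the column types
str(df)
```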
Answer 0 (score: 2)
Try this:
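library(dplyr)

df %>%
  group_by(wk) %>%
  # flag the week if any of its days is a holiday / closed day
  mutate(holiday = max(holiday) > 0,
         closed_day = max(closed_day) > 0) %>%
  # keep only the first row of each week
  distinct(wk, .keep_all = TRUE) %>%
  select(year, date, wk, holiday, closed_day)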
This gives:
# A tibble: 5 x 5
# Groups: wk [5]
year date wk holiday closed_day
<int> <date> <int> <lgl> <lgl>
1 2017 2017-11-27 48 FALSE FALSE
2 2017 2017-12-04 49 FALSE FALSE
3 2017 2017-12-11 50 FALSE FALSE
4 2017 2017-12-18 51 FALSE FALSE
5 2017 2017-12-25 52 TRUE TRUE
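The result above holds TRUE/FALSE, while the question's desired output shows 0/1. If integer flags are preferred, one possible variation is to wrap the comparison in `as.integer()` and drop the grouping at the end. This is a sketch on a small stand-in for the question's data, and its column types are assumed.

```r
library(dplyr)

# Small stand-in for the question's data (column types are assumptions).
df <- data.frame(
  year       = 2017L,
  date       = as.Date(c("2017-12-18", "2017-12-25", "2017-12-26", "2017-12-31")),
  wk         = c(51L, 52L, 52L, 52L),
  holiday    = c(0L, 0L, 0L, 1L),
  closed_day = c(0L, 1L, 0L, 0L)
)

res <- df %>%
  group_by(wk) %>%
  # as.integer() turns the logical flag back into 0/1
  mutate(holiday    = as.integer(max(holiday) > 0),
         closed_day = as.integer(max(closed_day) > 0)) %>%
  distinct(wk, .keep_all = TRUE) %>%
  ungroup() %>%   # remove the wk grouping from the result
  select(year, date, wk, holiday, closed_day)

res
```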
Note that holiday and closed_day become logical, and distinct() keeps the first row for each wk value.
Answer 1 (score: 2)
If you don't mind which year and date values you get for each week, you can use:
library(dplyr)

df %>%
  group_by(wk) %>%
  # funs() is deprecated since dplyr 0.8.0; passing max directly works in old and new versions
  summarize_at(vars(year, date, holiday, closed_day), max)
# # A tibble: 5 × 5
# wk year date holiday closed_day
# <int> <int> <date> <int> <int>
# 1 48 2017 2017-11-27 0 0
# 2 49 2017 2017-12-04 0 0
# 3 50 2017 2017-12-11 0 0
# 4 51 2017 2017-12-18 0 0
# 5 52 2017 2017-12-31 1 1
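With dplyr 1.0.0 or later, the same per-week maximum can also be written with `across()`, which supersedes the `summarize_at()` family. The following is a sketch on a small stand-in for the question's data (column types assumed), not part of the original answer.

```r
library(dplyr)

# Small stand-in for the question's data (column types are assumptions).
df <- data.frame(
  year       = 2017L,
  date       = as.Date(c("2017-12-18", "2017-12-25", "2017-12-26", "2017-12-31")),
  wk         = c(51L, 52L, 52L, 52L),
  holiday    = c(0L, 0L, 0L, 1L),
  closed_day = c(0L, 1L, 0L, 0L)
)

res <- df %>%
  group_by(wk) %>%
  # take the maximum of each listed column within the week
  summarize(across(c(year, date, holiday, closed_day), max))

res
```

As in the original answer, the date reported for week 52 is the latest one in that week (2017-12-31), not the first.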
Otherwise:
df %>%
  group_by(wk) %>%
  # keep the first year/date of each week; flag the week as 1 if any day is flagged
  summarize(year = year[1], date = date[1],
            holiday = 1 * any(holiday > 0),
            closed_day = 1 * any(closed_day > 0))
# # A tibble: 5 × 5
# wk year date holiday closed_day
# <int> <int> <date> <dbl> <dbl>
# 1 48 2017 2017-11-27 0 0
# 2 49 2017 2017-12-04 0 0
# 3 50 2017 2017-12-11 0 0
# 4 51 2017 2017-12-18 0 0
# 5 52 2017 2017-12-25 1 1
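(I took a slightly different approach to holiday and closed_day the second time, in case some weeks contain "twos" and you only need the > 0 logic. In that case, keeping the columns logical rather than numeric would be a cleaner way to write the code and store the data.)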
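To illustrate the difference between the two approaches, here is a small sketch with a hypothetical week whose holiday column contains a 2. The values are made up for illustration and do not come from the question's data.

```r
# Hypothetical holiday values for one week; the 2 is invented for illustration.
holiday <- c(0, 2, 0)

max(holiday)           # 2: no longer a plain 0/1 flag
any(holiday > 0)       # TRUE
1 * any(holiday > 0)   # 1
```

With strictly 0/1 input both approaches agree; they only diverge once larger values appear.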
Answer 2 (score: 2)
If you are interested in a data.table approach, we can do:
library(data.table)

# setDT() converts df to a data.table by reference
setDT(df)[, .(date = date[1],
              holiday = any(holiday > 0),
              closed = any(closed_day > 0)),
          by = .(year, wk)]
# year wk date holiday closed
# 1: 2017 48 2017-11-27 FALSE FALSE
# 2: 2017 49 2017-12-04 FALSE FALSE
# 3: 2017 50 2017-12-11 FALSE FALSE
# 4: 2017 51 2017-12-18 FALSE FALSE
# 5: 2017 52 2017-12-25 TRUE TRUE
Note that I aggregate the data by year and week, assuming you want a separate summary for each week of each year.
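A short sketch of why grouping by year as well as week matters: with data spanning more than one year, grouping by wk alone would merge week 52 of different years. The 2018 row below is invented purely for illustration.

```r
library(data.table)

# Hypothetical data spanning two years; the 2018 row is made up.
dt <- data.table(
  year       = c(2017L, 2017L, 2018L),
  date       = as.Date(c("2017-12-25", "2017-12-31", "2018-12-24")),
  wk         = c(52L, 52L, 52L),
  holiday    = c(0L, 1L, 0L),
  closed_day = c(1L, 0L, 0L)
)

# Grouping by week only: both years collapse into a single row
by_wk <- dt[, .(holiday = any(holiday > 0)), by = .(wk)]

# Grouping by year and week: one row per week of each year
by_year_wk <- dt[, .(holiday = any(holiday > 0)), by = .(year, wk)]

by_wk
by_year_wk
```

Here `by_wk` has one row, while `by_year_wk` has two, and only the 2017 week is flagged as containing a holiday.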