What is the best way to efficiently filter out the last (most recent) week, given that it may not contain a full week of data?
library(lubridate)
library(dplyr)
df <- data.frame(dates =
c("2014-12-17","2014-12-18","2014-12-21","2014-12-25","2014-12-26",
"2015-05-17","2015-05-18","2015-05-21","2015-05-25","2015-05-26",
"2016-06-17","2016-06-18","2016-06-21","2016-06-25","2016-06-26"))
df <- df %>% mutate(dates = ymd(dates),
the.year = year(dates),
the.week = week(dates))
#Filter the last week (as may not be complete)
I can come up with a solution like this:
max.week <- df %>% filter(the.year == max(the.year)) %>%
filter(the.week == max(the.week)) %>%
group_by(the.year, the.week) %>%
summarise(count= n()) %>%
ungroup() %>%
mutate(max.week = paste(the.year, the.week,sep="-")) %>%
select(max.week) %>%
unlist(use.names = F)
df %>% filter(paste(the.year, the.week, sep = "-") != max.week)
But there must be a simpler solution?
Answer 0 (score: 4)
The shortest dplyr way I can think of is
filter(df, !{yw <- interaction(the.year, the.week)} %in% yw[which.max(dates)])
But you may want to break it up for better readability:
df %>%
mutate(yearweek = paste(the.year, the.week, sep = "-")) %>%
filter(!yearweek %in% yearweek[which.max(dates)])
Remove the ! to get the opposite effect, that is, to keep only the last week.
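As a quick check of the approach above, the sketch below rebuilds the sample data from the question and applies the filter both ways. The names without_last and only_last are made up for this illustration. The values in the comments follow from lubridate's week(), which places 2016-06-25 and 2016-06-26 in week 26 of 2016.

```r
library(dplyr)
library(lubridate)

# Sample data from the question, with a combined year-week label
df <- data.frame(dates = ymd(c(
  "2014-12-17", "2014-12-18", "2014-12-21", "2014-12-25", "2014-12-26",
  "2015-05-17", "2015-05-18", "2015-05-21", "2015-05-25", "2015-05-26",
  "2016-06-17", "2016-06-18", "2016-06-21", "2016-06-25", "2016-06-26"
))) %>%
  mutate(the.year = year(dates),
         the.week = week(dates),
         yearweek = paste(the.year, the.week, sep = "-"))

# Drop every row that shares a year-week with the latest date
without_last <- df %>% filter(!yearweek %in% yearweek[which.max(dates)])

# Same condition without the negation: keep only the last week
only_last <- df %>% filter(yearweek %in% yearweek[which.max(dates)])

nrow(without_last)  # 13
only_last$dates     # "2016-06-25" "2016-06-26"
```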
Answer 1 (score: 2)
group_indices can also help:
df %>%
filter(group_indices(., the.year, the.week) < max(group_indices(., the.year, the.week)))
This can also be written as:
df %>% filter({id <- group_indices(., the.year, the.week)} < max(id))
or
df %>%
mutate(id = group_indices(., the.year, the.week)) %>%
filter(id < max(id))
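Recent dplyr releases (1.0.0 and later, to my knowledge) deprecate passing grouping columns directly to group_indices() and suggest grouping first. A minimal sketch of the same idea with group_by() and cur_group_id() follows. It assumes dplyr >= 1.0.0 and relies on group_by() ordering numeric groups in ascending order, so the highest id belongs to the latest year-week. The id column and the name without_last are illustrative only.

```r
library(dplyr)
library(lubridate)

# Sample data from the question
df <- data.frame(dates = ymd(c(
  "2014-12-17", "2014-12-18", "2014-12-21", "2014-12-25", "2014-12-26",
  "2015-05-17", "2015-05-18", "2015-05-21", "2015-05-25", "2015-05-26",
  "2016-06-17", "2016-06-18", "2016-06-21", "2016-06-25", "2016-06-26"
))) %>%
  mutate(the.year = year(dates),
         the.week = week(dates))

without_last <- df %>%
  group_by(the.year, the.week) %>%
  mutate(id = cur_group_id()) %>%   # one integer id per (year, week) group
  ungroup() %>%
  filter(id < max(id)) %>%          # drop the group with the highest id
  select(-id)

nrow(without_last)  # 13
```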
Answer 2 (score: 1)
Try this:
df %>% transform(yw= the.year *100 + the.week) %>% filter(yw != max(yw)) %>% select(-yw)
Or, if your data is sorted by date, as it appears to be:
df %>% filter(the.year !=last(the.year) | the.week !=last(the.week))
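The numeric key in the first variant works because a week number is always below 100, so the.year * 100 + the.week sorts by year first and week second. A pasted string key does not have this property, since strings compare character by character. A small sketch with made-up values (not taken from the question's data):

```r
# Hypothetical year/week pairs, chosen to include a single-digit week
the.year <- c(2015, 2016, 2016)
the.week <- c(52, 9, 26)

# Numeric key: comparisons follow (year, week) order
num_key <- the.year * 100 + the.week
max(num_key)  # 201626

# Character key: "9" sorts after "2", so week 9 looks later than week 26
chr_key <- paste(the.year, the.week, sep = "-")
max(chr_key)  # "2016-9"
```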
Answer 3 (score: 1)
Using dplyr:
df %>%
arrange(dates) %>%
filter(the.week != last(the.week) | the.year != last(the.year))
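The arrange(dates) step matters because last() simply takes the value from the final row. The sketch below runs the filter on the question's data in reversed row order, with and without sorting. The names shuffled, unsorted_result and sorted_result are illustrative.

```r
library(dplyr)
library(lubridate)

# Sample data from the question
df <- data.frame(dates = ymd(c(
  "2014-12-17", "2014-12-18", "2014-12-21", "2014-12-25", "2014-12-26",
  "2015-05-17", "2015-05-18", "2015-05-21", "2015-05-25", "2015-05-26",
  "2016-06-17", "2016-06-18", "2016-06-21", "2016-06-25", "2016-06-26"
))) %>%
  mutate(the.year = year(dates),
         the.week = week(dates))

# Reverse the rows so the latest date is no longer at the bottom
shuffled <- df[rev(seq_len(nrow(df))), ]

# Without sorting, last() picks 2014 week 51 and removes the wrong rows
unsorted_result <- shuffled %>%
  filter(the.week != last(the.week) | the.year != last(the.year))

# With sorting, last() picks 2016 week 26 as intended
sorted_result <- shuffled %>%
  arrange(dates) %>%
  filter(the.week != last(the.week) | the.year != last(the.year))

max(unsorted_result$dates)  # "2016-06-26": the last week is still present
max(sorted_result$dates)    # "2016-06-21"
```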