I have a dataset of many players, each with a player name, a player rating, and the date the rating was published. For example:
Player date overall_rating
Aaron Cresswell 4/21/2016 74
Aaron Cresswell 12/5/2014 71
Aaron Cresswell 11/7/2014 71
Aaron Cresswell 9/18/2014 70
Aaron Cresswell 5/2/2014 70
Aaron Cresswell 4/4/2014 70
Aaron Cresswell 3/14/2014 70
Aaron Cresswell 12/13/2013 70
Aaron Cresswell 11/8/2013 70
Aaron Cresswell 10/4/2013 69
Aaron Cresswell 9/20/2013 69
Aaron Cresswell 5/3/2013 69
Aaron Cresswell 3/22/2013 69
Aaron Cresswell 3/15/2013 69
Aaron Cresswell 2/22/2013 69
Aaron Cresswell 2/15/2013 69
Aaron Cresswell 8/31/2012 68
Aaron Cresswell 2/22/2012 65
Aaron Cresswell 8/30/2011 64
Aaron Cresswell 8/30/2010 54
Aaron Cresswell 2/22/2010 51
Aaron Cresswell 8/30/2009 52
Aaron Cresswell 2/22/2009 47
Aaron Cresswell 8/30/2008 53
Aaron Cresswell 2/22/2007 53
Aaron Doran 1/7/2016 65
Aaron Doran 10/9/2015 66
Aaron Doran 9/21/2015 66
Aaron Doran 12/12/2014 67
Aaron Doran 9/18/2014 68
Aaron Doran 4/18/2014 68
Aaron Doran 3/14/2014 68
Aaron Doran 1/31/2014 69
Aaron Doran 11/29/2013 70
Aaron Doran 9/20/2013 71
Aaron Doran 5/31/2013 70
Aaron Doran 4/26/2013 70
Aaron Doran 4/19/2013 70
Aaron Doran 4/5/2013 70
Aaron Doran 3/22/2013 69
Aaron Doran 3/8/2013 69
Aaron Doran 2/15/2013 69
Aaron Doran 8/31/2012 65
Aaron Doran 2/22/2012 65
Aaron Doran 8/30/2011 65
Aaron Doran 2/22/2011 67
Aaron Doran 8/30/2010 67
Aaron Doran 2/22/2010 65
Aaron Doran 8/30/2009 65
Aaron Doran 2/22/2009 59
Aaron Doran 2/22/2007 59
Aaron Hughes 12/24/2015 70
Aaron Hughes 9/21/2015 70
Aaron Hughes 5/8/2015 69
Aaron Hughes 4/10/2015 69
Aaron Hughes 3/20/2015 70
Aaron Hughes 9/18/2014 72
Aaron Hughes 1/31/2014 72
Aaron Hughes 1/17/2014 72
Aaron Hughes 9/20/2013 73
Aaron Hughes 5/10/2013 73
Aaron Hughes 4/26/2013 74
Aaron Hughes 3/22/2013 74
Aaron Hughes 3/8/2013 74
Aaron Hughes 2/15/2013 74
Aaron Hughes 8/31/2012 74
Aaron Hughes 2/22/2012 75
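My question is: how do I do the following? If the date falls within a given range (for example, August 1, 2006 to May 30, 2007), a new column named Season should show "2006/2007". Because a player can receive several ratings within one season, I want to keep only the last rating of each season for each player.
Answer 0: (score: 0)
You can use lubridate: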
library(lubridate)
library(data.table)
start_date<-ymd("2006/08/01")
end_date<-ymd("2007/05/30")
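If df is your initial data frame, then:
df$date<-dmy(df$date) # check that no NA appears; use mdy() instead if the dates are month/day/year, as in the question
Finally, you can add the Season column with: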
df$Season <-ifelse(between(df$date,start_date,end_date),paste0(year(start_date),"/",year(end_date)),"")
>df
player date rating Season
1 player1 2006-09-12 a 2006/2007
2 player1 2007-08-01 b
3 player2 2007-07-03 c
For a more general solution (a data frame spanning several years):
player<-c("player1","player1","player2","player2","player1")
date<-c( "12/09/2006","01/08/2007","03/07/2007","25/05/2015","05/04/2016")
rating<-c("a","b","c","d","a")
df<-data.frame(player,date,rating)
df$date<-dmy(df$date)#make sure you don't get NA
#dynamic dates (based on years)
df$start_date<-ymd(paste0(year(df$date)-1,"/08/01"))
df$end_date<-ymd(paste0(year(df$date),"/05/30"))
df$Season <- ifelse(between(df$date,df$start_date,df$end_date),paste0(year(df$start_date),"/",year(df$end_date)),paste0(year(df$start_date)+1,"/",year(df$end_date)+1))
This results in:
>df
player date rating start_date end_date Season
1 player1 2006-09-12 a 2005-08-01 2006-05-30 2006/2007
2 player1 2007-08-01 b 2006-08-01 2007-05-30 2007/2008
3 player2 2007-07-03 c 2006-08-01 2007-05-30 2007/2008
4 player2 2015-05-25 d 2014-08-01 2015-05-30 2014/2015
5 player1 2016-04-05 a 2015-08-01 2016-05-30 2015/2016
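The code above labels each row with a season but does not yet keep only the last rating per player and season, which the question also asks for. A minimal sketch of that step in base R follows. It assumes the same column names as the toy data above; the second row is a hypothetical extra rating, added only so that one player has two ratings in the same season.

```r
# Toy data shaped like the example above. The "e" row is hypothetical,
# added so that player1 has two ratings in season 2006/2007.
df <- data.frame(
  player = c("player1", "player1", "player1", "player2"),
  date   = as.Date(c("2006-09-12", "2006-12-01", "2007-08-01", "2007-07-03")),
  rating = c("a", "e", "b", "c"),
  stringsAsFactors = FALSE
)

# Same season rule as above: dates up to May 30 belong to the season that
# started the previous August; later dates belong to the following season.
yr <- as.integer(format(df$date, "%Y"))
in_prev <- df$date <= as.Date(paste0(yr, "-05-30"))
df$Season <- ifelse(in_prev, paste0(yr - 1, "/", yr), paste0(yr, "/", yr + 1))

# Sort newest first within each player/Season pair, then keep the first
# row of each pair, i.e. the latest rating.
df <- df[order(df$player, df$Season, -as.numeric(df$date)), ]
last_per_season <- df[!duplicated(df[c("player", "Season")]), ]
rownames(last_per_season) <- NULL
print(last_per_season)
```

Sorting before calling duplicated() is what makes "first row per group" mean "latest rating"; without the sort, the result would depend on the input order.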
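Answer 1: (score: 0)
Here is one way to do this with dplyr and lubridate. Basically, you want to create the Season column. If the month of the rating is less than or equal to 5, the season should be year - 1 / year. Otherwise, the season is year / year + 1. Then, after sorting by date, you can group_by player and season and take the last row of each group with slice(n()).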
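library(dplyr); library(lubridate)
df %>%
  # parse the month/day/year strings into Date objects
  mutate(date = as.Date(date, "%m/%d/%Y"),
         # January through May belong to the season that began the previous year
         Season = ifelse(month(date) <= 5,
                         paste(year(date) - 1, year(date), sep = "/"),
                         paste(year(date), year(date) + 1, sep = "/"))) %>%
  arrange(date) %>%
  group_by(Player, Season) %>%
  # rows are sorted by date, so the last row of each group is the latest rating
  slice(n())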
Player date overall_rating Season
<chr> <date> <int> <chr>
1 Aaron Cresswell 2007-02-22 53 2006/2007
2 Aaron Cresswell 2009-02-22 47 2008/2009
3 Aaron Cresswell 2010-02-22 51 2009/2010
4 Aaron Cresswell 2010-08-30 54 2010/2011
5 Aaron Cresswell 2012-02-22 65 2011/2012
6 Aaron Cresswell 2013-05-03 69 2012/2013
7 Aaron Cresswell 2014-05-02 70 2013/2014
8 Aaron Cresswell 2014-12-05 71 2014/2015
9 Aaron Cresswell 2016-04-21 74 2015/2016
10 Aaron Doran 2007-02-22 59 2006/2007
11 Aaron Doran 2009-02-22 59 2008/2009
12 Aaron Doran 2010-02-22 65 2009/2010
13 Aaron Doran 2011-02-22 67 2010/2011
14 Aaron Doran 2012-02-22 65 2011/2012
15 Aaron Doran 2013-05-31 70 2012/2013
16 Aaron Doran 2014-04-18 68 2013/2014
17 Aaron Doran 2014-12-12 67 2014/2015
18 Aaron Doran 2016-01-07 65 2015/2016
19 Aaron Hughes 2012-02-22 75 2011/2012
20 Aaron Hughes 2013-05-10 73 2012/2013
21 Aaron Hughes 2014-01-31 72 2013/2014
22 Aaron Hughes 2015-05-08 69 2014/2015
23 Aaron Hughes 2015-12-24 70 2015/2016
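To see how the month rule behaves at the season boundaries, here is a small base-R sketch. The helper name season_of is hypothetical and not part of the answer; it only restates the month <= 5 rule so individual dates can be checked. Under this rule, any date in May (including May 31) still counts toward the earlier season, and June and July dates are assigned to the upcoming season.

```r
# Hypothetical helper restating the rule: months 1-5 belong to the season
# that started the previous year; months 6-12 start a new season.
season_of <- function(d) {
  d <- as.Date(d)
  y <- as.integer(format(d, "%Y"))
  m <- as.integer(format(d, "%m"))
  start <- ifelse(m <= 5, y - 1, y)
  paste(start, start + 1, sep = "/")
}

# Boundary checks: end of May, start of June, end of August.
seasons <- season_of(c("2013-05-31", "2013-06-01", "2012-08-31"))
print(seasons)
```

If the off-season months should be handled differently (for instance, dropped rather than assigned to the next season), the cutoff in the ifelse() call is the place to change.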
Data
df <- read.table(text='Player date overall_rating
"Aaron Cresswell" 4/21/2016 74
"Aaron Cresswell" 12/5/2014 71
"Aaron Cresswell" 11/7/2014 71
"Aaron Cresswell" 9/18/2014 70
"Aaron Cresswell" 5/2/2014 70
"Aaron Cresswell" 4/4/2014 70
"Aaron Cresswell" 3/14/2014 70
"Aaron Cresswell" 12/13/2013 70
"Aaron Cresswell" 11/8/2013 70
"Aaron Cresswell" 10/4/2013 69
"Aaron Cresswell" 9/20/2013 69
"Aaron Cresswell" 5/3/2013 69
"Aaron Cresswell" 3/22/2013 69
"Aaron Cresswell" 3/15/2013 69
"Aaron Cresswell" 2/22/2013 69
"Aaron Cresswell" 2/15/2013 69
"Aaron Cresswell" 8/31/2012 68
"Aaron Cresswell" 2/22/2012 65
"Aaron Cresswell" 8/30/2011 64
"Aaron Cresswell" 8/30/2010 54
"Aaron Cresswell" 2/22/2010 51
"Aaron Cresswell" 8/30/2009 52
"Aaron Cresswell" 2/22/2009 47
"Aaron Cresswell" 8/30/2008 53
"Aaron Cresswell" 2/22/2007 53
"Aaron Doran" 1/7/2016 65
"Aaron Doran" 10/9/2015 66
"Aaron Doran" 9/21/2015 66
"Aaron Doran" 12/12/2014 67
"Aaron Doran" 9/18/2014 68
"Aaron Doran" 4/18/2014 68
"Aaron Doran" 3/14/2014 68
"Aaron Doran" 1/31/2014 69
"Aaron Doran" 11/29/2013 70
"Aaron Doran" 9/20/2013 71
"Aaron Doran" 5/31/2013 70
"Aaron Doran" 4/26/2013 70
"Aaron Doran" 4/19/2013 70
"Aaron Doran" 4/5/2013 70
"Aaron Doran" 3/22/2013 69
"Aaron Doran" 3/8/2013 69
"Aaron Doran" 2/15/2013 69
"Aaron Doran" 8/31/2012 65
"Aaron Doran" 2/22/2012 65
"Aaron Doran" 8/30/2011 65
"Aaron Doran" 2/22/2011 67
"Aaron Doran" 8/30/2010 67
"Aaron Doran" 2/22/2010 65
"Aaron Doran" 8/30/2009 65
"Aaron Doran" 2/22/2009 59
"Aaron Doran" 2/22/2007 59
"Aaron Hughes" 12/24/2015 70
"Aaron Hughes" 9/21/2015 70
"Aaron Hughes" 5/8/2015 69
"Aaron Hughes" 4/10/2015 69
"Aaron Hughes" 3/20/2015 70
"Aaron Hughes" 9/18/2014 72
"Aaron Hughes" 1/31/2014 72
"Aaron Hughes" 1/17/2014 72
"Aaron Hughes" 9/20/2013 73
"Aaron Hughes" 5/10/2013 73
"Aaron Hughes" 4/26/2013 74
"Aaron Hughes" 3/22/2013 74
"Aaron Hughes" 3/8/2013 74
"Aaron Hughes" 2/15/2013 74
"Aaron Hughes" 8/31/2012 74
"Aaron Hughes" 2/22/2012 75',header=TRUE,stringsAsFactors=FALSE)