A small representative sample of my data:
Date <- as.Date(rep(c("2015-05-14", "2015-05-15","2015-05-16"),c(4,2,1)))
TEAM1 <- c("GSW","SAS","MIL","ATL","GSW","SAC","LAL")
TEAM2 <- c("HOU","MIN","NOP","LAL","SAS","TOR","GSW")
PCW_TEAM1 <- c(0.88,0.72,0.34,0.46,0.87,0.28,0.24)
PCW_TEAM2 <- c(0.67,0.31,0.52,0.23,0.74,0.48,0.90)
df <- data.frame(cbind(Date,TEAM1,TEAM2,PCW_TEAM1,PCW_TEAM2), stringsAsFactors=F)
df
   Date TEAM1 TEAM2 PCW_TEAM1 PCW_TEAM2
1 16569   GSW   HOU      0.88      0.67
2 16569   SAS   MIN      0.72      0.31
3 16569   MIL   NOP      0.34      0.52
4 16569   ATL   LAL      0.46      0.23
5 16570   GSW   SAS      0.87      0.74
6 16570   SAC   TOR      0.28      0.48
7 16571   LAL   GSW      0.24       0.9
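Imagine these are the first 7 games of an NBA season. On the first date (16569) there are four games, so the ranking covers 8 teams. Once we add the next date (16570), there are two more games but only two new teams, because GSW and SAS already played on the first date.
I want to rank the unique teams by their winning percentage as of the last available date. The output would look like this: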
   Date TEAM1 TEAM2 PCW_TEAM1 PCW_TEAM2 RANK_TEAM1 RANK_TEAM2
1 16569   GSW   HOU      0.88      0.67          1          3
2 16569   SAS   MIN      0.72      0.31          2          7
3 16569   MIL   NOP      0.34      0.52          6          4
4 16569   ATL   LAL      0.46      0.23          5          8
5 16570   GSW   SAS      0.87      0.74          1          2
6 16570   SAC   TOR      0.28      0.48          9          5
7 16571   LAL   GSW      0.24       0.9         10          1
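Note that in row 5, GSW has a winning percentage of 0.87 and is ranked 1. In row 1 the winning percentage is higher (0.88), but that is also GSW, so it does not count as a separate team.
In this example there are 7 games and 10 unique teams. The real data has 30 unique teams.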
unique(c(TEAM1,TEAM2))
[1] "GSW" "SAS" "MIL" "ATL" "SAC" "LAL" "HOU" "MIN" "NOP" "TOR"
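I was thinking of creating a vector that collects the last available winning percentage for each unique team and then ranking the teams on that, but I don't know how to do it, or whether this is the best approach.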
Answer 0 (score: 1)
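# Stack both team columns, with their winning percentages and dates, into long vectors
TEAMs <- c(TEAM1, TEAM2)
teamsall <- unique(TEAMs)
PCWs <- c(PCW_TEAM1, PCW_TEAM2)
Dates <- c(Date, Date)
# For each team, take the winning percentage from its most recent game
last_pcw <- sapply(teamsall, function(tm) {
  idx <- TEAMs == tm
  PCWs[idx][which.max(Dates[idx])]
})
# Order teams from highest to lowest last winning percentage
u <- order(last_pcw, decreasing = TRUE)
df$RANK1 <- match(TEAM1, teamsall[u])
df$RANK2 <- match(TEAM2, teamsall[u])
df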
I think this could be one way to do it.
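The code in this answer ranks all teams once, using each team's final winning percentage. The desired output in the question looks like a running ranking instead: on each date, only the teams seen so far are ranked, each by its latest winning percentage up to that date. A minimal sketch of that variant follows, assuming this reading of the question is right. The helper `running_rank` and the variable names are hypothetical, not part of the original code.

```r
# Rebuild the sample data from the question, keeping native column types
Date <- as.Date(rep(c("2015-05-14", "2015-05-15", "2015-05-16"), c(4, 2, 1)))
TEAM1 <- c("GSW", "SAS", "MIL", "ATL", "GSW", "SAC", "LAL")
TEAM2 <- c("HOU", "MIN", "NOP", "LAL", "SAS", "TOR", "GSW")
PCW_TEAM1 <- c(0.88, 0.72, 0.34, 0.46, 0.87, 0.28, 0.24)
PCW_TEAM2 <- c(0.67, 0.31, 0.52, 0.23, 0.74, 0.48, 0.90)

# Long format: one entry per team per game
teams <- c(TEAM1, TEAM2)
pcws  <- c(PCW_TEAM1, PCW_TEAM2)
dates <- c(Date, Date)

# Hypothetical helper: for a date d, rank every team seen up to and
# including d by its most recent winning percentage (rank 1 = highest).
running_rank <- function(d) {
  seen <- dates <= d
  tm <- teams[seen]
  pc <- pcws[seen]
  dt <- dates[seen]
  # Latest winning percentage per team among games played up to d
  last_pcw <- sapply(unique(tm), function(x) {
    idx <- tm == x
    pc[idx][which.max(dt[idx])]
  })
  # Result is named by team; tied teams share the smallest rank
  rank(-last_pcw, ties.method = "min")
}

RANK_TEAM1 <- integer(length(Date))
RANK_TEAM2 <- integer(length(Date))
udates <- unique(Date)
for (i in seq_along(udates)) {
  rk <- running_rank(udates[i])      # ranking as of this date
  rows <- Date == udates[i]          # games played on this date
  RANK_TEAM1[rows] <- rk[TEAM1[rows]]
  RANK_TEAM2[rows] <- rk[TEAM2[rows]]
}

result <- data.frame(Date, TEAM1, TEAM2, PCW_TEAM1, PCW_TEAM2,
                     RANK_TEAM1, RANK_TEAM2)
result
```

On the sample data this should reproduce the RANK_TEAM1 and RANK_TEAM2 columns from the question. It recomputes the ranking from scratch for every date, which is fine for 30 teams and one season but may be worth optimizing for much larger data.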