To keep track of each team's current form, I want to know how they performed in their last N games. The data starts out like this:
HomeTeam AwayTeam Winner
Liverpool Chelsea Home
Arsenal Liverpool Away
Manchester Liverpool TBA
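For example, before the start of game 3 I want to know each team's form over its last 2 games. The resulting data frame should look like this: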
HomeTeam AwayTeam Winner HomeForm AwayForm
LiverPool Chelsea Home NA NA
Arsenal Liverpool Away 0 1
Manchester Liverpool TBA 0 2
I have looked at both lag and if/else functions, but I can't seem to find a solution that looks up the results dynamically.
Answer 0: (score: 2)
There may be a simpler hack, but you could try:
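library(tidyverse)
library(zoo)

# `df` is the data frame from the question
last_n_games <- 2

# Number the games so their order is explicit
df <- df %>% rowid_to_column()

Forms <- df %>%
  # Replace "Home"/"Away" with the name of the winning team
  mutate(Winner = case_when(Winner == "Home" ~ HomeTeam,
                            Winner == "Away" ~ AwayTeam,
                            TRUE ~ "TBA")) %>%
  # One row per team per game
  gather(Team, name, HomeTeam:AwayTeam) %>%
  distinct(rowid, name, Winner) %>%
  group_by(name) %>%
  arrange(rowid) %>%
  mutate(
    HomeForm = +(Winner == name),  # 1 if this team won the game, else 0
    # Sum the wins over the previous `last_n_games` games (current game excluded)
    HomeForm = rollapply(HomeForm, width = list(-(1:last_n_games)), sum,
                         partial = TRUE, fill = NA, align = "right"),
    AwayForm = HomeForm
  ) %>%
  # No earlier games means 0 wins, except for the very first game, which stays NA
  # (the deprecated funs() is replaced by a formula lambda)
  mutate_at(vars(contains("Form")), ~ ifelse(rowid != 1 & is.na(.), 0, .)) %>%
  distinct(rowid, name, HomeForm, AwayForm)

# Attach each team's form to the game, once as home team and once as away team
df %>%
  left_join(Forms %>% select(-AwayForm), by = c("rowid", "HomeTeam" = "name")) %>%
  left_join(Forms %>% select(-HomeForm), by = c("rowid", "AwayTeam" = "name")) %>%
  select(-rowid)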
Output:
HomeTeam AwayTeam Winner HomeForm AwayForm
1 Liverpool Chelsea Home NA NA
2 Arsenal Liverpool Away 0 1
3 Manchester Liverpool TBA 0 2
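If it helps to see what the rolling window is counting, here is a minimal base-R sketch that produces the same two columns with a plain loop. It is not a replacement for the pipeline, just an illustration: `team_form` is a hypothetical helper name I made up, and the data frame is rebuilt from the question with the team names spelled consistently.

```r
# Data frame from the question (team names spelled consistently)
df <- data.frame(
  HomeTeam = c("Liverpool", "Arsenal", "Manchester"),
  AwayTeam = c("Chelsea", "Liverpool", "Liverpool"),
  Winner   = c("Home", "Away", "TBA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: wins by `team` in its last `n` games before row `i`
team_form <- function(df, team, i, n) {
  if (i == 1) return(NA_real_)  # nothing has been played before the first game
  prev <- df[seq_len(i - 1), , drop = FALSE]
  # Earlier games in which the team took part, home or away
  played <- prev[prev$HomeTeam == team | prev$AwayTeam == team, , drop = FALSE]
  played <- tail(played, n)     # keep only the most recent n of them
  won <- (played$Winner == "Home" & played$HomeTeam == team) |
         (played$Winner == "Away" & played$AwayTeam == team)
  sum(won)
}

last_n_games <- 2
rows <- seq_len(nrow(df))
df$HomeForm <- sapply(rows, function(i) team_form(df, df$HomeTeam[i], i, last_n_games))
df$AwayForm <- sapply(rows, function(i) team_form(df, df$AwayTeam[i], i, last_n_games))
df
```

The loop rescans all earlier rows for every game, so it will be slow on a large table; the grouped `rollapply` version is the one to use on real data.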
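Oh, I forgot: this assumes your data frame has no typos (do you really have Liverpool written as LiverPool in some places?).
If it is more than just a typo, let us know and I will rewrite the code.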
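If the mixed capitalisation does turn out to be only a typo, one possible workaround is to normalise the team names before computing anything. This is only a sketch under my own assumptions: `normalise_team` is a hypothetical helper, and the rule "trim whitespace, lower-case, then capitalise the first letter" is a guess that suits single-word names like the ones in the question.

```r
# Hypothetical helper: make spellings such as "LiverPool" and "Liverpool" identical
normalise_team <- function(x) {
  x <- tolower(trimws(x))                                   # drop stray spaces, lower-case
  paste0(toupper(substr(x, 1, 1)), substr(x, 2, nchar(x)))  # capitalise the first letter
}

# Example data with the inconsistent spelling from the question
df <- data.frame(
  HomeTeam = c("LiverPool", "Arsenal", "Manchester"),
  AwayTeam = c("Chelsea", "Liverpool", "Liverpool"),
  Winner   = c("Home", "Away", "TBA"),
  stringsAsFactors = FALSE
)

df$HomeTeam <- normalise_team(df$HomeTeam)
df$AwayTeam <- normalise_team(df$AwayTeam)
```

Multi-word names would come out with only the first word capitalised, but every spelling variant still maps to the same string, which is all the joins need.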