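I'm working with play-by-play data from basketball games, and I want to create a "lineup" column that holds a running list of the players on the court. Here is some sample data: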
game_id team_id opp_team_id player_id period secs_remaining action_type action_subtype
<int> <int> <int> <int> <int> <int> <chr> <chr>
1 1475 5 8 587 1 720 substitution in
2 1475 5 8 66 1 720 substitution in
3 1475 5 8 596 1 720 substitution in
4 1475 5 8 206 1 720 substitution in
5 1475 5 8 469 1 720 substitution in
6 1475 8 5 940 1 720 substitution in
7 1475 8 5 120 1 720 substitution in
8 1475 8 5 124 1 720 substitution in
9 1475 8 5 1040 1 720 substitution in
10 1475 8 5 114 1 720 substitution in
11 1475 NA NA NA 1 720 game start
12 1475 5 8 469 1 719 jumpball won
13 1475 8 5 114 1 718 jumpball lost
14 1475 8 5 120 1 695 steal
15 1475 5 8 469 1 695 turnover ballhandling
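I tried experimenting with dplyr's mutate() and lists, but every attempt hit a dead end. I'd like the output to have a new column like this (using rows 1 through 5 as an example):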
id lineup
<int> <list>
1 <int [5]> --> contains (587, NULL, NULL, NULL, NULL)
2 <int [5]> --> contains (587, 66, NULL, NULL, NULL)
3 <int [5]> --> contains (587, 66, 596, NULL, NULL)
4 <int [5]> --> contains (587, 66, 596, 206, NULL)
5 <int [5]> --> contains (587, 66, 596, 206, 469)
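I know that appending new elements to a list is slow, so if there is a better way to handle this in R, I'm open to any suggestions.
It's important that it can handle combinations (once I summarize it, the vector (1,2,3,4,5) should be treated the same as (2,3,4,5,1)).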
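To make the order-insensitivity requirement concrete, here is a minimal base R sketch: sorting the player IDs before collapsing them gives the same key for any ordering of the same five players. The helper name `lineup_key` is hypothetical and only used for illustration.

```r
# Hypothetical helper: build an order-insensitive key for a lineup vector
lineup_key <- function(players) {
  # Sorting first means (1,2,3,4,5) and (2,3,4,5,1) produce the same string
  paste(sort(players), collapse = "-")
}

a <- c(1L, 2L, 3L, 4L, 5L)
b <- c(2L, 3L, 4L, 5L, 1L)
same_lineup <- identical(lineup_key(a), lineup_key(b))
```

A key like this could then be used as a grouping variable when summarizing by lineup.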
Thanks in advance.
Here is an additional example that is not at the start of the game:
game_id team_id opp_team_id player_id period secs_remaining action_type action_subtype
<int> <int> <int> <int> <int> <int> <chr> <chr>
1 1475 8 5 124 1 369 foulon
2 1475 5 8 206 1 369 substitution out
3 1475 5 8 125 1 369 substitution in
4 1475 8 5 1040 1 369 substitution out
5 1475 8 5 73 1 369 substitution in
6 1475 8 5 124 1 358 3pt
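This is the first substitution after the start of the game. Going into it, the lineup for each team should be:
For Team 8: list(940,120,124,1040,114)
For Team 5: list(587,66,596,206,46)
This is the expected output (showing only the lineup column):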
id lineup
<int> <list>
1 <int [5]> --> contains(940,120,124,1040,114) #This isn't a substitute
2 <int [5]> --> (587,66,596,46) #This was the sub out for Team 5
3 <int [5]> --> (587,66,596,46,125) #This was the sub in for Team 5
4 <int [5]> --> (940,120,124,114) #This was the sub out for Team 8
5 <int [5]> --> (940,120,124,114,73) #This was the sub in for Team 8
6 <int [5]> --> (940,120,124,114,73) #This isn't a substitute
My latest attempt:
dat %>%
    # Initialize lineup column
    mutate(lineup = NA) %>%
    mutate(lineup = ifelse(
        # Check if it's the start of the game
        is.na(lag(game_id)) | lag(game_id) != game_id,
        player_id,
        # Check if it's a substitution
        ifelse(
            action_type == 'substitution',
            # Check if it's a sub in or a sub out
            ifelse(
                # Sub in
                action_subtype == 'in',
                "sub in",
                # Sub out
                "sub out"
            ),
            "not a sub"
        )
    ))
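For what it's worth, the nested ifelse() above can only label each row; it cannot carry a lineup forward from one row to the next. As a rough sketch, the labelling step on its own might look like this in base R. The column name `checker` and the four-row sample are assumptions made for illustration, with `action_type` and `action_subtype` split the same way the attempt uses them.

```r
# Small illustrative sample (not the full data set)
dat <- data.frame(
  game_id        = c(1475L, 1475L, 1475L, 1475L),
  player_id      = c(587L, 66L, NA, 469L),
  action_type    = c("substitution", "substitution", "game", "jumpball"),
  action_subtype = c("in", "in", "start", "won"),
  stringsAsFactors = FALSE
)

# Previous row's game_id (NA for the first row), like dplyr::lag()
prev_game <- c(NA, head(dat$game_id, -1))
is_start  <- is.na(prev_game) | prev_game != dat$game_id
is_sub    <- dat$action_type == "substitution"

# Label every row; the first row of each game is tagged "start"
dat$checker <- ifelse(is_start, "start",
               ifelse(is_sub & dat$action_subtype == "in", "sub in",
               ifelse(is_sub, "sub out", "not a sub")))
```

Keeping the label in its own character column avoids mixing player IDs and strings in `lineup`, and leaves the stateful part (who is on the court) to a separate step.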
Answer 0 (score: 0)
I couldn't find a way to do this with mutate(), so I just went with a loop. In case anyone is looking for an answer, here it is:
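# Assumes df already has a `checker` column holding 'start', 'sub in',
# 'sub out', or some other label for non-substitution rows.
calc_lineup <- function(df) {
    # Running state: t1 is the team seen on the first row of the game
    lineup <- list(t1 = NA, t1_lineup = NA, t2_lineup = NA)
    # Create the list columns up front so each row can hold a vector
    df$t1_lineup <- vector("list", nrow(df))
    df$t2_lineup <- vector("list", nrow(df))
    for (row in seq_len(nrow(df))) {
        checker <- df$checker[row]
        team <- df$team_id[row]
        player <- df$player_id[row]
        if (checker == 'start') {
            # Start of the game: reset both lineups
            lineup$t1 <- team
            lineup$t1_lineup <- player
            lineup$t2_lineup <- NA
        } else if (checker == 'sub in') {
            # Add the player, then drop the NA placeholder
            if (lineup$t1 == team) {
                lineup$t1_lineup <- c(lineup$t1_lineup, player)
                lineup$t1_lineup <- lineup$t1_lineup[!is.na(lineup$t1_lineup)]
            } else {
                lineup$t2_lineup <- c(lineup$t2_lineup, player)
                lineup$t2_lineup <- lineup$t2_lineup[!is.na(lineup$t2_lineup)]
            }
        } else if (checker == 'sub out') {
            # Remove the player from their team's lineup
            if (lineup$t1 == team) {
                lineup$t1_lineup <- lineup$t1_lineup[lineup$t1_lineup != player]
            } else {
                lineup$t2_lineup <- lineup$t2_lineup[lineup$t2_lineup != player]
            }
        }
        # Store the current lineups on this row
        df$t1_lineup[row] <- list(lineup$t1_lineup)
        df$t2_lineup[row] <- list(lineup$t2_lineup)
    }
    return(df)
}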
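For anyone who wants to avoid the explicit loop, one possible alternative is base R's Reduce() with accumulate = TRUE, which carries the lineup from one event to the next. This is only a sketch for a single team: the starting lineup and the two events are copied from the second example in the question (Team 5), and the names `events`, `start`, and `step` are hypothetical.

```r
# Team 5's substitution events from the second example, in order
events <- data.frame(
  player_id = c(206L, 125L),
  subtype   = c("out", "in"),
  stringsAsFactors = FALSE
)

# Team 5's lineup before the first substitution, as given in the question
start <- c(587L, 66L, 596L, 206L, 46L)

# Apply one event to the current lineup and return the new lineup
step <- function(lineup, i) {
  p <- events$player_id[i]
  if (events$subtype[i] == "in") c(lineup, p) else lineup[lineup != p]
}

# accumulate = TRUE keeps every intermediate lineup; the first element
# is the initial lineup itself, so it is dropped before storing
lineups <- Reduce(step, seq_len(nrow(events)), accumulate = TRUE, init = start)
events$lineup <- lineups[-1]
```

With both teams in one data frame, the same idea would have to be applied per team (for example after splitting by `team_id`), so the loop may still be the simpler option.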