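Using R to track lineups in basketball play-by-play data

Time: 2017-01-30 15:18:15

Tags: r dplyr

I'm working with play-by-play data from a basketball game, and I want to create a "lineup" column holding a list that I can summarize later. Here is some sample data: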

        game_id team_id opp_team_id player_id period secs_remaining  action_type action_subtype
     <int>   <int>       <int>     <int>  <int>          <int>        <chr>          <chr>
1     1475       5           8       587      1            720 substitution             in
2     1475       5           8        66      1            720 substitution             in
3     1475       5           8       596      1            720 substitution             in
4     1475       5           8       206      1            720 substitution             in
5     1475       5           8       469      1            720 substitution             in
6     1475       8           5       940      1            720 substitution             in
7     1475       8           5       120      1            720 substitution             in
8     1475       8           5       124      1            720 substitution             in
9     1475       8           5      1040      1            720 substitution             in
10    1475       8           5       114      1            720 substitution             in
11    1475      NA          NA        NA      1            720         game          start
12    1475       5           8       469      1            719     jumpball            won
13    1475       8           5       114      1            718     jumpball           lost
14    1475       8           5       120      1            695        steal               
15    1475       5           8       469      1            695     turnover   ballhandling

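I've experimented with dplyr's mutate() and list columns, but every attempt has hit a dead end. The desired output is a new column like this (using rows 1 through 5 as an example):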

   id    lineup
<int>    <list>
    1    <int [5]> --> contains (587, NULL, NULL, NULL, NULL)
    2    <int [5]> --> contains (587, 66, NULL, NULL, NULL)
    3    <int [5]> --> contains (587, 66, 596, NULL, NULL)
    4    <int [5]> --> contains (587, 66, 596, 206, NULL)
    5    <int [5]> --> contains (587, 66, 596, 206, 469)

I know that appending new elements to a list is slow, so if there is a better way to handle this in R, I'm open to any suggestions.
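One way to avoid growing a list inside a loop is base R's `Reduce()` with `accumulate = TRUE`, which returns every intermediate result in a single call. Below is a minimal sketch for the five starters in rows 1 to 5. The vector `starters` is a hypothetical name with values copied from the sample data, and the sketch does not yet handle players being subbed out:

```r
# Player ids from rows 1-5 of the sample data (team 5's starters)
starters <- c(587L, 66L, 596L, 206L, 469L)

# Each step appends one player to the lineup built so far;
# accumulate = TRUE keeps every intermediate lineup in a list
lineups <- Reduce(function(lineup, player) c(lineup, player),
                  starters, accumulate = TRUE)

lengths(lineups)   # 1 2 3 4 5
lineups[[3]]       # 587 66 596
```

The result is a plain list with one element per row, which can be assigned to a list column.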

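It is important that it handles combinations, meaning the order of players must not matter. (Once I summarize, the vector (1,2,3,4,5) should count as the same lineup as (2,3,4,5,1).)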
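One possible way to make the order irrelevant is to sort each lineup and collapse it into a single string before grouping. This is only a sketch; `lineup_key` is a hypothetical helper name, not something from dplyr:

```r
# Hypothetical helper: collapse a lineup into an order-independent key
lineup_key <- function(players) paste(sort(players), collapse = "-")

# Both orderings produce the key "1-2-3-4-5"
lineup_key(c(1, 2, 3, 4, 5)) == lineup_key(c(2, 3, 4, 5, 1))  # TRUE
```

Such a key could then be used as a grouping variable when summarizing.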

Thanks in advance.

Update

Here is an additional example that is not from the start of the game:

  game_id team_id opp_team_id player_id period secs_remaining  action_type action_subtype
    <int>   <int>       <int>     <int>  <int>          <int>        <chr>          <chr>
1    1475       8           5       124      1            369       foulon               
2    1475       5           8       206      1            369 substitution            out
3    1475       5           8       125      1            369 substitution             in
4    1475       8           5      1040      1            369 substitution            out
5    1475       8           5        73      1            369 substitution             in
6    1475       8           5       124      1            358          3pt          

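These are the first substitutions after the start of the game. Before them, each team's lineup should be:

For team 8: list(940,120,124,1040,114)

For team 5: list(587,66,596,206,469)

This is the expected output (showing only the lineup column):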

   id lineup
<int> <list>
    1 <int [5]> --> contains (940,120,124,1040,114) #This isn't a substitution
    2 <int [4]> --> (587,66,596,469) #This was the sub out for Team 5
    3 <int [5]> --> (587,66,596,469,125) #This was the sub in for Team 5
    4 <int [4]> --> (940,120,124,114) #This was the sub out for Team 8
    5 <int [5]> --> (940,120,124,114,73) #This was the sub in for Team 8
    6 <int [5]> --> (940,120,124,114,73) #This isn't a substitution

My latest attempt:

library(dplyr)

dat %>%
  # Initialize lineup column
  mutate(lineup = NA) %>%
  mutate(lineup = ifelse(
    # Check if it's the start of the game
    is.na(lag(game_id)) | lag(game_id) != game_id,
    player_id,
    # Check if it's a substitution
    ifelse(
      action_type == 'substitution',
      # Check if it's a sub in or a sub out
      ifelse(
        # Sub in
        action_subtype == 'in',
        "sub in",
        # Sub out
        "sub out"
      ),
      "not a sub"
    )
  ))

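1 Answer:

Answer 0: (score: 0)

I couldn't find a way to do this with mutate(), so I just went with a loop. In case anyone is looking for an answer, here it is. It expects a `checker` column holding 'start', 'sub in', or 'sub out' on the relevant rows: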

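calc_lineup <- function(df) {
  # Running state: which team counts as "t1", plus both current lineups
  lineup <- list(t1 = NA, t1_lineup = NA, t2_lineup = NA)
  # Pre-allocate the output lists instead of growing them row by row
  t1_out <- vector("list", nrow(df))
  t2_out <- vector("list", nrow(df))
  for (row in seq_len(nrow(df))) {
    checker <- df$checker[row]
    player <- df$player_id[row]
    # isTRUE() guards against NA team ids (e.g. the "game start" row)
    is_t1 <- isTRUE(lineup$t1 == df$team_id[row])
    if (identical(checker, 'start')) {
      # Start of the game: this row's team becomes t1
      lineup$t1 <- df$team_id[row]
      lineup$t1_lineup <- player
      lineup$t2_lineup <- NA
    } else if (identical(checker, 'sub in')) {
      # Append the player, then drop the NA placeholder if present
      if (is_t1) {
        lineup$t1_lineup <- c(lineup$t1_lineup, player)
        lineup$t1_lineup <- lineup$t1_lineup[!is.na(lineup$t1_lineup)]
      } else {
        lineup$t2_lineup <- c(lineup$t2_lineup, player)
        lineup$t2_lineup <- lineup$t2_lineup[!is.na(lineup$t2_lineup)]
      }
    } else if (identical(checker, 'sub out')) {
      # Remove the player from his team's lineup
      if (is_t1) {
        lineup$t1_lineup <- lineup$t1_lineup[lineup$t1_lineup != player]
      } else {
        lineup$t2_lineup <- lineup$t2_lineup[lineup$t2_lineup != player]
      }
    }
    # Record both lineups as they stand after this row
    t1_out[[row]] <- lineup$t1_lineup
    t2_out[[row]] <- lineup$t2_lineup
  }
  df$t1_lineup <- t1_out
  df$t2_lineup <- t2_out
  df
}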
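For comparison, the same bookkeeping can probably be written without an explicit loop by threading the lineup state through base R's `Reduce()`. This is a sketch under assumptions, not the method used in the answer above: `events`, `state`, and `apply_event` are hypothetical names, the rows are copied from the "Update" example, the starting lineups come from the first sample table, and rows with an `NA` team id are not handled.

```r
# Rows from the "Update" example in the question
events <- data.frame(
  team_id        = c(8L, 5L, 5L, 8L, 8L, 8L),
  player_id      = c(124L, 206L, 125L, 1040L, 73L, 124L),
  action_type    = c("foulon", "substitution", "substitution",
                     "substitution", "substitution", "3pt"),
  action_subtype = c("", "out", "in", "out", "in", ""),
  stringsAsFactors = FALSE
)

# Lineups on the floor before these rows (starters from the first table)
state <- list(`5` = c(587L, 66L, 596L, 206L, 469L),
              `8` = c(940L, 120L, 124L, 1040L, 114L))

# Hypothetical step function: apply event row i to the current state
apply_event <- function(state, i) {
  if (events$action_type[i] == "substitution") {
    team <- as.character(events$team_id[i])
    player <- events$player_id[i]
    state[[team]] <- if (events$action_subtype[i] == "in") {
      c(state[[team]], player)        # sub in: add the player
    } else {
      setdiff(state[[team]], player)  # sub out: remove the player
    }
  }
  state
}

# accumulate = TRUE keeps the state after every row; [-1] drops the initial state
states <- Reduce(apply_event, seq_len(nrow(events)),
                 accumulate = TRUE, init = state)[-1]

# For each row, pick the lineup of the team involved in that row
lineups <- Map(function(s, team) s[[as.character(team)]],
               states, events$team_id)
events$lineup <- lineups
```

Each state holds both teams' lineups, so a second list column for the opposing team could be extracted the same way.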