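R, numbering unique and repeated values in a data frame

Time: 2018-01-09 17:59:42

Tags: r

I have a data frame that lists chess games and their results in chronological order. I would like to add a new column that, going down the data frame, gives the match number for each individual player.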

 Player <- c("Joe", "Bill", "Chris","Bill","Joe","Mark")
 Opponent <- c("Bill", "Joe", "Bill","Chris","Mark","Joe")
 Outcome <- c("W", "L", "W", "L", "L", "W")
 GameNumber <- c(1,1,2,2,3,3)

 Results <- data.frame(Player, Opponent, Outcome, GameNumber)

The current table looks like this:

 Player Opponent Outcome GameNumber
 Joe    Bill     W       1
 Bill   Joe      L       1
 Chris  Bill     W       2
 Bill   Chris    L       2
 Joe    Mark     L       3
 Mark   Joe      W       3 

But I would like to add a new column that gives the match number for each particular player, i.e.

 Player Opponent Outcome GameNumber PlayerMatchNumber
 Joe    Bill     W       1          1
 Bill   Joe      L       1          1
 Chris  Bill     W       2          1
 Bill   Chris    L       2          2
 Joe    Mark     L       3          2
 Mark   Joe      W       3          1 

So the game against Chris is Bill's second game, just as the game against Mark is Joe's second game.

3 answers:

Answer 0: (score: 2)

You can do this with data.table:

library(data.table)
setDT(Results)
Results[, PlayerMatchNumber := 1:.N, by = Player]

You will get this output:

     Player Opponent Outcome GameNumber PlayerMatchNumber
1:    Joe     Bill       W          1                 1
2:   Bill      Joe       L          1                 1
3:  Chris     Bill       W          2                 1
4:   Bill    Chris       L          2                 2
5:    Joe     Mark       L          3                 2
6:   Mark      Joe       W          3                 1

Answer 1: (score: 0)

A quick and dirty base R version:

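# For each player, build a running count that is zero on other players' rows,
# then sum across players so every row keeps only its own player's count
Results$PlayerMatchNumber <- rowSums(sapply(unique(Results$Player), function(x)
  cumsum(x == Results$Player) * as.numeric(x == Results$Player)))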


Results$PlayerMatchNumber
# [1] 1 1 1 2 2 1
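
As an alternative to the one-liner above (not part of the original answer), the same per-player counter can probably be written more directly with base R's ave(). This is a minimal sketch that rebuilds the question's Results data frame so it runs on its own, and it assumes the rows are already in chronological order, as the question states.

```r
# Rebuild the question's data so the snippet is self-contained
Results <- data.frame(
  Player     = c("Joe", "Bill", "Chris", "Bill", "Joe", "Mark"),
  Opponent   = c("Bill", "Joe", "Bill", "Chris", "Mark", "Joe"),
  Outcome    = c("W", "L", "W", "L", "L", "W"),
  GameNumber = c(1, 1, 2, 2, 3, 3)
)

# ave() splits the row indices by Player, applies seq_along() within each
# group, and writes the results back in the original row order
Results$PlayerMatchNumber <- ave(seq_along(Results$Player), Results$Player,
                                 FUN = seq_along)

Results$PlayerMatchNumber
# [1] 1 1 1 2 2 1
```

Unlike the sapply() version, this does not build a rows-by-players matrix, so it should scale better when there are many distinct players.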

Answer 2: (score: 0)

With dplyr you can do:

library(dplyr)

Results %>% 
  group_by(Player) %>% 
  mutate(Number = seq_along(Player)) %>% 
  ungroup()

# # A tibble: 6 x 5
# Player Opponent Outcome GameNumber Number
# <fctr> <fctr>   <fctr>       <dbl>  <int>
# 1 Joe    Bill     W             1.00      1
# 2 Bill   Joe      L             1.00      1
# 3 Chris  Bill     W             2.00      1
# 4 Bill   Chris    L             2.00      2
# 5 Joe    Mark     L             3.00      2
# 6 Mark   Joe      W             3.00      1