R: cumulative count of a factor's occurrences in a data.frame

Time: 2016-03-29 19:30:13

Tags: database cumulative-frequency

I have this data set:

Time = c("2016-03-01","2016-03-02","2016-03-03","2016-03-02","2016-03-03","2016-03-02")
match = c("a","b","c","a","b","c") 
names = c("julien","julien","julien", "mathieu","mathieu","simon") 
df = data.frame(Time, names, match) 
df = df[order(Time),]
df
        Time   names match
1 2016-03-01  julien     a
2 2016-03-02  julien     b
4 2016-03-02 mathieu     a
6 2016-03-02   simon     c
3 2016-03-03  julien     c
5 2016-03-03 mathieu     b

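I would like a new column holding the cumulative number of matches each player has played over time, so that at any point in time I know how many matches each player has played so far. Like this: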

        Time   names match nb.of.match.played
1 2016-03-01  julien     a                  1
2 2016-03-02  julien     b                  2
4 2016-03-02 mathieu     a                  1
6 2016-03-02   simon     c                  1
3 2016-03-03  julien     c                  3
5 2016-03-03 mathieu     b                  2 

This seems easy, but everything I have tried so far has failed. Thanks for your help!

1 answer:

Answer 0 (score: 1)

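I solved my problem with the help of the thread "cumsum using ddply". The code below uses base R's ave() for the per-player grouping.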

However, I think cumsum cannot be applied directly to the factor column, so I added a column of 1s that cumsum can work on.

Time = c("2016-03-01","2016-03-02","2016-03-03","2016-03-02","2016-03-03","2016-03-02")
match = c("a","b","c","a","b","c")
names = c("julien","julien","julien", "mathieu","mathieu","simon")
df = data.frame(Time, names, match) 
df = df[order(Time),]
df$nb = 1
df
        Time   names match nb
1 2016-03-01  julien     a  1
2 2016-03-02  julien     b  1
4 2016-03-02 mathieu     a  1
6 2016-03-02   simon     c  1
3 2016-03-03  julien     c  1
5 2016-03-03 mathieu     b  1

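# within() returns a modified copy, so assign the result back to df
df = within(df, {
  nb.match <- ave(nb, names, FUN = cumsum)  # running sum of nb within each player
})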
df
        Time   names match nb nb.match
1 2016-03-01  julien     a  1        1
2 2016-03-02  julien     b  1        2
4 2016-03-02 mathieu     a  1        1
6 2016-03-02   simon     c  1        1
3 2016-03-03  julien     c  1        3
5 2016-03-03 mathieu     b  1        2
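
The helper column of 1s is probably not strictly necessary. As a minimal sketch, assuming the rows are already sorted by Time as above, numbering the rows within each player with seq_along should give the same running count. This is an alternative to the approach above, not the method from the linked thread; the column name nb.match is reused only for easy comparison.

```r
# Rebuild the example data from the question.
Time  = c("2016-03-01","2016-03-02","2016-03-03","2016-03-02","2016-03-03","2016-03-02")
match = c("a","b","c","a","b","c")
names = c("julien","julien","julien","mathieu","mathieu","simon")
df = data.frame(Time, names, match)
df = df[order(df$Time), ]

# Number the rows within each player, following the current (time-sorted) row order.
# seq_along() yields 1, 2, ..., n per group, which is the running match count.
df$nb.match = ave(seq_along(df$names), df$names, FUN = seq_along)
df
```

With this data, nb.match comes out as 1, 2, 1, 1, 3, 2, the same values as in the table above. The count follows row order, so the data frame has to be sorted by Time before the column is computed.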