Count multiple strings in R, grouped by those strings

Time: 2017-11-13 05:34:43

Tags: r

I have a data.frame like this:

Points      Assists      Steals
Player A    Player B     Player B
Player B    Player C     Player A
Player C    Player C     Player A

I am trying to get a data.frame output like this:

           Points     Assists      Steals
Player A    1          0            2
Player B    1          1            1
Player C    1          2            0

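As you can see, I want it to count how many times each player appears in each category. I can do this for a single column with table(), but I can't figure out how to do it for multiple columns. How can I do this?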

2 answers:

Answer 0: (score: 1)

We can gather the data into long format, count the occurrences, and spread the result back into wide format

library(dplyr)
library(tidyr)
gather(df1) %>% 
     count(key, value) %>%
     spread(key, n, fill = 0)
# A tibble: 3 x 4
#     value Assists Points Steals
#*    <chr>   <dbl>  <dbl>  <dbl>
#1 Player A       0      1      2
#2 Player B       1      1      1
#3 Player C       2      1      0
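In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same long-count-wide pipeline with those functions, not part of the original answer. It assumes tidyr 1.1.0 or newer (for a scalar values_fill); the result name wide is only illustrative, and arrange() is added so the rows come out sorted by player.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df1 <- data.frame(
  Points  = c("Player A", "Player B", "Player C"),
  Assists = c("Player B", "Player C", "Player C"),
  Steals  = c("Player B", "Player A", "Player A"),
  stringsAsFactors = FALSE
)

wide <- df1 %>%
  # long format: one row per (category, player) cell
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%
  # count how often each player appears in each category
  count(key, value) %>%
  # back to wide format; combinations that never occur become 0
  pivot_wider(names_from = key, values_from = n, values_fill = 0) %>%
  arrange(value)

wide
```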

Or we can use melt together with table

library(reshape2)
table(melt(as.matrix(df1))[3:2])
#         Var2
#value      Points Assists Steals
#  Player A      1       0      2
#  Player B      1       1      1
#  Player C      1       2      0

Data

df1 <- structure(list(Points = c("Player A", "Player B", "Player C"), 
Assists = c("Player B", "Player C", "Player C"), Steals = c("Player B", 
"Player A", "Player A")), .Names = c("Points", "Assists", 
"Steals"), class = "data.frame", row.names = c(NA, -3L))

Answer 1: (score: 0)

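Here is a step-by-step version of akrun's solution using the data.table package. It can be written in one line, but I expanded it into several steps that are easier to follow.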

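library(data.table)
# setDT converts df1 to a data.table by reference (df1 itself is modified)
DT <- setDT(df1)
# add a row id so melt has an id column to keep
DT[, id := 1:.N]
# long format: one row per (id, category) pair
plouf <- melt(DT, id.vars = "id")
# wide format: count occurrences of each player per category
dcast(plouf, value ~ variable, fun.aggregate = length)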

which gives

      value Points Assists Steals
1: Player A      1       0      2
2: Player B      1       1      1
3: Player C      1       2      0
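The answer notes that the steps can be collapsed into a single line. Below is a minimal sketch of one such condensed form, not taken from the original answer. It assumes that measure.vars is passed explicitly so that no helper id column is needed, and it uses as.data.table() to work on a copy so that df1 is left unchanged. The name wide is illustrative.

```r
library(data.table)

# Same data as in the question
df1 <- data.frame(
  Points  = c("Player A", "Player B", "Player C"),
  Assists = c("Player B", "Player C", "Player C"),
  Steals  = c("Player B", "Player A", "Player A"),
  stringsAsFactors = FALSE
)

# melt every column into long format, then count per player and category
wide <- dcast(
  melt(as.data.table(df1), measure.vars = names(df1)),
  value ~ variable,
  fun.aggregate = length,
  value.var = "value"
)

wide
```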