Need to set a cutoff before creating an adjacency matrix

Date: 2019-05-30 03:10:39

Tags: r adjacency-matrix

This is just a small part of the dataset I have:

      Winner    Player 1    Player 2    Player 3
       Susan    Archie      Heck         Jay
       Archie   Brown       Susan        Jay
       Heck     Archie      Jay          Brown
       Jay      Brown       Archie       Susan
       Brown    Susan       Archie       Jay
       Archie   Brown       Susan        Heck
       Susan    Heck        Jay          Brown
       Jay      Heck        Susan        Brown
       Susan    Archie      Heck         Brown
       Lee      Susan       Jay          Heck
       Kyle     Heck        Jay          Susan

I used the following code to convert it into an adjacency matrix:

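   # If the CSV header is "Player 1", read.csv() names the column Player.1
   d <- read.csv("res.csv")
   # All names that occur anywhere, used as a common set of factor levels
   lvs <- sort(as.character(unique(unlist(d))))
   d[] <- lapply(d, factor, levels = lvs)
   # Rows = player, columns = winner
   res <- table(d[c("Player.1","Winner")]) +
     table(d[c("Player.2","Winner")]) +
     table(d[c("Player.3","Winner")])
   diag(res) <- 0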

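I need to apply a cutoff: the only people who should be included in the matrix are players who have played at least 2 matches against each other.

The output should be an adjacency matrix containing only players who have played each other at least twice. The original matrix looks like this: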

          Winner    Susan   Archie  Heck    Jay     Brown   Lee     Kyle
          Susan       0       2      0       2         1     1       1
          Archie      2       0      1       1         1     0       0
          Heck        3       1      0       1         0     1       1
          Jay         2       1      1       0         1     1       1
          Brown       2       2      1       2         0     0       0
          Lee         0       0      0       0         0     0       0
          Kyle        0       0      0       0         0     0       0

After eliminating players who were matched only once, the resulting matrix should look like this:

          Winner    Susan   Archie  Heck    Jay     Brown   Lee     Kyle
          Susan       0       2      0       2         1     0       0
          Archie      2       0      1       1         1     0       0
          Heck        3       1      0       1         0     0       0
          Jay         2       1      1       0         1     0       0
          Brown       2       2      0       2         0     0       0
          Lee         0       0      0       0         0     0       0
          Kyle        0       0      0       0         0     0       0

1 Answer:

Answer 0 (score: 1)

We can do this more easily by reshaping the data to "long" format with gather:

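library(tidyverse)
# lvs is the sorted vector of all names, as defined in the question
out <- gather(d, key, val, -Winner) %>% 
          select(-key) %>%                           # keep only Winner and val
          mutate(val = factor(val, levels = lvs)) %>% 
          table %>%                                  # rows = Winner, columns = val
          t                                          # transpose: rows = player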

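Then, for every name whose "Player" row is all zeros (someone who never appears as a player), set that name's Winner column to 0: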

out[, names(which(!rowSums(out)))] <- 0
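
Note that this step clears only the Lee and Kyle columns, whereas the expected output in the question also zeroes the Brown/Heck cell. One reading that appears to reproduce that output is to apply the cutoff to the total number of encounters of each pair in either direction, i.e. to `res + t(res)`. A minimal base R sketch under that assumption follows. The data frame is retyped from the question's table, and `cutoff`, `pair_total` and `res_cut` are names introduced here for illustration only.

```r
# Example data, retyped from the table in the question.
d <- data.frame(
  Winner  = c("Susan", "Archie", "Heck", "Jay", "Brown", "Archie",
              "Susan", "Jay", "Susan", "Lee", "Kyle"),
  Player1 = c("Archie", "Brown", "Archie", "Brown", "Susan", "Brown",
              "Heck", "Heck", "Archie", "Susan", "Heck"),
  Player2 = c("Heck", "Susan", "Jay", "Archie", "Archie", "Susan",
              "Jay", "Susan", "Heck", "Jay", "Jay"),
  Player3 = c("Jay", "Jay", "Brown", "Susan", "Jay", "Heck",
              "Brown", "Brown", "Brown", "Heck", "Susan"),
  stringsAsFactors = FALSE
)

lvs <- sort(unique(unlist(d)))
player_cols <- setdiff(names(d), "Winner")

# Sum one player-by-winner table per player column (rows = player, columns = winner).
res <- Reduce(`+`, lapply(player_cols, function(p) {
  table(factor(d[[p]], levels = lvs), factor(d$Winner, levels = lvs))
}))
diag(res) <- 0

# Assumed interpretation: a pair "played each other" res[i, j] + res[j, i] times.
cutoff <- 2
pair_total <- res + t(res)

# Zero every cell whose pair falls below the cutoff.
res_cut <- res
res_cut[pair_total < cutoff] <- 0
```

With the example data, `res_cut["Brown", "Heck"]` drops from 1 to 0 because Brown and Heck meet only once in total, while the remaining 1s belong to pairs that meet at least twice across both directions.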

Data

d <- structure(list(Winner = structure(c(7L, 1L, 3L, 4L, 2L, 1L, 7L, 
4L, 7L, 6L, 5L), .Label = c("Archie", "Brown", "Heck", "Jay", 
"Kyle", "Lee", "Susan"), class = "factor"), Player1 = structure(c(1L, 
2L, 1L, 2L, 7L, 2L, 3L, 3L, 1L, 7L, 3L), .Label = c("Archie", 
"Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"), class = "factor"), 
    Player2 = structure(c(3L, 7L, 4L, 1L, 1L, 7L, 4L, 7L, 3L, 
    4L, 4L), .Label = c("Archie", "Brown", "Heck", "Jay", "Kyle", 
    "Lee", "Susan"), class = "factor"), Player3 = structure(c(4L, 
    4L, 2L, 7L, 4L, 3L, 2L, 2L, 2L, 3L, 7L), .Label = c("Archie", 
    "Brown", "Heck", "Jay", "Kyle", "Lee", "Susan"), 
 class = "factor")), row.names = c(NA, 
-11L), class = "data.frame")
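
For readers who prefer to avoid the tidyverse dependency (gather() is superseded by pivot_longer() in current tidyr), the long-format step in the answer can be approximated in base R. The sketch below uses a small hypothetical data frame `toy` rather than the real data; `toy`, `long` and `never_player` are illustrative names, not part of the original answer.

```r
# Hypothetical toy data: four matches, "D" wins once but never appears as a player.
toy <- data.frame(
  Winner  = c("A", "B", "A", "D"),
  Player1 = c("B", "A", "C", "A"),
  Player2 = c("C", "C", "B", "B"),
  stringsAsFactors = FALSE
)
lvs <- sort(unique(unlist(toy)))
player_cols <- setdiff(names(toy), "Winner")

# "Long" format: one row per (winner, player) pair,
# comparable to gather(d, key, val, -Winner) followed by select(-key).
long <- data.frame(
  Winner = rep(toy$Winner, times = length(player_cols)),
  val    = unlist(toy[player_cols], use.names = FALSE)
)

# Cross-tabulate with rows = player, columns = winner.
out <- table(
  Player = factor(long$val, levels = lvs),
  Winner = factor(long$Winner, levels = lvs)
)

# Zero the winner column of anyone who never appears as a player.
never_player <- names(which(rowSums(out) == 0))
out[, never_player] <- 0
```

Naming the two dimensions `Player` and `Winner` in the `table()` call is a design choice that makes the orientation of the printed matrix explicit.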