How do I calculate total points based on position in a matrix?

Time: 2019-11-21 18:28:31

Tags: r matrix rank

I have a data frame with 12 columns, each listing the participants who placed in the top 5. It looks like this:

> top_5
     4         5         8         9          11         12         15         16         19         20        22         23       
[1,] "Nia"     "Hung"    "Hanaaa"  "Ramziyya" "Marissa"  "Jaelyn"   "Shyanne"  "Jaabir"   "Dionicio" "Nia"     "Shyanne"  "Roger"  
[2,] "Razeena" "Husni"   "Bradly"  "Marissa"  "Bradly"   "Muhsin"   "Razeena"  "Dionicio" "Magnus"   "Kelsey"  "Nia"      "Schyler"
[3,] "Shyanne" "Schyler" "Necko"   "Johannah" "Tatiana"  "Glenn"    "Nia"      "Jaelyn"   "Shyanne"  "Hanaaa"  "Mildred"  "German" 
[4,] "Schyler" "German"  "Hung"    "Lubaaba"  "Johannah" "Magnus"   "Dionicio" "German"   "German"   "Razeena" "Dionicio" "Jaabir" 
[5,] "Husni"   "Necko"   "Razeena" "Afeefa"   "Schyler"  "Dionicio" "Jaabir"   "Roger"    "Johannah" "Remy"    "Jaabir"   "Jaelyn" 

(It can be recreated with the following):

structure(c("Nia", "Razeena", "Shyanne", "Schyler", "Husni", 
"Hung", "Husni", "Schyler", "German", "Necko", "Hanaaa", "Bradly", 
"Necko", "Hung", "Razeena", "Ramziyya", "Marissa", "Johannah", 
"Lubaaba", "Afeefa", "Marissa", "Bradly", "Tatiana", "Johannah", 
"Schyler", "Jaelyn", "Muhsin", "Glenn", "Magnus", "Dionicio", 
"Shyanne", "Razeena", "Nia", "Dionicio", "Jaabir", "Jaabir", 
"Dionicio", "Jaelyn", "German", "Roger", "Dionicio", "Magnus", 
"Shyanne", "German", "Johannah", "Nia", "Kelsey", "Hanaaa", "Razeena", 
"Remy", "Shyanne", "Nia", "Mildred", "Dionicio", "Jaabir", "Roger", 
"Schyler", "German", "Jaabir", "Jaelyn"), .Dim = c(5L, 12L), .Dimnames = list(
    NULL, c("4", "5", "8", "9", "11", "12", "15", "16", "19", 
    "20", "22", "23")))

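A participant in the first row ranked first in that column (so in the first column, "Nia" is first, "Razeena" is second, and so on). First place is worth 5 points, second place 4 points, and so on down to 1 point for fifth place. I want to calculate the total points for every participant in the matrix.
My goal is to produce an overall top 5. How can I do that?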

3 answers:

Answer 0 (score: 3)

One option is to split the reversed row index by the matrix values into a list, then loop over that list with sapply and take the sum of each element:

out <- sapply(split(row(top_5)[nrow(top_5):1, ], top_5), sum)
out
#  Afeefa   Bradly Dionicio   German    Glenn   Hanaaa     Hung    Husni   Jaabir   Jaelyn Johannah   Kelsey  Lubaaba   Magnus  Marissa  Mildred   Muhsin 
#       1        8       14        9        3        8        7        5        9        9        6        4        2        6        9        3        4 
#   Necko      Nia Ramziyya  Razeena     Remy    Roger  Schyler  Shyanne  Tatiana 
#       4       17        5       11        1        6       10       16        3 

head(out[order(-out)], 5)
#     Nia  Shyanne Dionicio  Razeena  Schyler 
#      17       16       14       11       10 

Another option is rowsum, which sums the values within each group directly.
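A minimal sketch of the rowsum variant, assuming it reuses the same reversed row index as the sapply version. The matrix `toy` and the names `totals` and `top` are made up for illustration and do not come from the original answer.

```r
# Hypothetical 3 x 2 matrix: 3 ranks per column, so first place is worth 3 points
toy <- matrix(c("Nia", "Razeena", "Shyanne",
                "Shyanne", "Nia", "Hung"),
              nrow = 3, dimnames = list(NULL, c("4", "5")))

# Reversed row index: 3, 2, 1 down each column
pts <- c(row(toy)[nrow(toy):1, ])

# rowsum() returns a one-column matrix; [, 1] turns it into a named vector
totals <- rowsum(pts, c(toy))[, 1]

# Best two participants in the toy data: Nia with 5 points, Shyanne with 4
top <- head(sort(totals, decreasing = TRUE), 2)
top
```

On the real data, the same call with `top_5` in place of `toy` and `head(..., 5)` should give the overall top 5.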

Answer 1 (score: 3)

Using tidyverse functions:

library(tidyr)
library(dplyr)

top_5 %>% 
  as.data.frame %>% 
  head(.,5) %>%
  mutate(rank = nrow(.):1) %>% 
  pivot_longer(., -c(rank), values_to = "name", names_to = "col") %>% 
  group_by(name) %>% 
  summarise_at(vars(rank), list(points = sum))

#> # A tibble: 26 x 2
#>    name   points
#>    <fct>   <int>
#>  1 Husni       5
#>  2 Nia        17
#>  3 Razeena    11
#>  4 Schyler    10
#>  5 Shyanne    16
#>  6 German      9
#>  7 Hung        7
#>  8 Necko       4
#>  9 Bradly      8
#> 10 Hanaaa      8
#> # ... with 16 more rows
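The tibble above holds every participant in no particular order. One way to get from there to a top 5 is to sort by points and keep the first five rows. The sketch below uses a hand-typed subset of the rows printed above as a stand-in (`totals` and `top` are illustrative names), so its result is the top 5 of that subset only, not of the full data.

```r
library(dplyr)

# Stand-in for the summarised result: first six rows of the printed output
totals <- tibble(
  name   = c("Husni", "Nia", "Razeena", "Schyler", "Shyanne", "German"),
  points = c(5L, 17L, 11L, 10L, 16L, 9L)
)

# Sort by points, highest first, and keep the first five rows
top <- totals %>%
  arrange(desc(points)) %>%
  head(5)
top
```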

Answer 2 (score: 3)

Here is a "convert to long, then summarise by group" approach, similar to M--'s answer, but with data.table:

library(data.table)

df <- as.data.table(top_5)[, points := .N:1]
total_points <- melt(df, 'points')[, .(points = sum(points)), value]
setorder(total_points, -points)

head(total_points, 5)
#       value points
# 1:      Nia     17
# 2:  Shyanne     16
# 3: Dionicio     14
# 4:  Razeena     11
# 5:  Schyler     10

Or, an idea very similar to akrun's, just using tapply instead of sapply + split:

out <- sort(tapply(c(6 - row(top_5)), c(top_5), sum), decreasing = TRUE)

head(out, 5)
# Nia  Shyanne Dionicio  Razeena  Schyler 
#  17       16       14       11       10
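The constant 6 in the tapply call is the number of rows plus one, so it only fits a matrix with exactly five ranks. A sketch of the same idea for any number of ranks follows; `total_points` is a hypothetical helper name and `toy` is made-up data.

```r
# Hypothetical helper: row r of an n-row matrix is worth n + 1 - r points
total_points <- function(m) {
  pts <- c(nrow(m) + 1L - row(m))
  sort(tapply(pts, c(m), sum), decreasing = TRUE)
}

# Hypothetical 3 x 2 matrix: first place is worth 3 points here
toy <- matrix(c("Nia", "Razeena", "Shyanne",
                "Shyanne", "Nia", "Hung"),
              nrow = 3, dimnames = list(NULL, c("4", "5")))

res <- total_points(toy)
res
# Nia 5, Shyanne 4, Razeena 2, Hung 1
```

For the five-row `top_5` matrix, `nrow(m) + 1L` evaluates to 6, so `total_points(top_5)` should reduce to the tapply call above.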