Suppose I have 5 items (A, B, C, D, E) in a survey and ask respondents to rank them. The data look like this:
> df
rank1 rank2 rank3 rank4 rank5
1 A B C D E
2 A C B D E
3 C A B E D
4 B A C D E
5 A B D C E
How can I compute the frequency of each rank by item, so that the output looks like this?
item rank1 rank2 rank3 rank4 rank5
1 A 3 2 0 0 0
2 B 1 2 2 0 0
3 C 1 1 2 1 0
4 D 0 0 1 3 1
5 E 0 0 0 1 4
Answer 0 (score: 3)
In base R, we can use table after converting each column to a factor with a common set of levels:
lvls <- sort(unique(unlist(df)))
sapply(df, function(x) table(factor(x, levels = lvls)))
# rank1 rank2 rank3 rank4 rank5
#A 3 2 0 0 0
#B 1 2 2 0 0
#C 1 1 2 1 0
#D 0 0 1 3 1
#E 0 0 0 1 4
Or with a single call to table:
table(unlist(df), c(col(df)))
# 1 2 3 4 5
# A 3 2 0 0 0
# B 1 2 2 0 0
# C 1 1 2 1 0
# D 0 0 1 3 1
# E 0 0 0 1 4
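The single table call works because unlist(df) stacks the columns one after another while col(df) supplies the matching column index for every stacked value. A minimal sketch of a variant that keeps the rank names and returns a data frame with an explicit item column, as the question asks, might look like this (the names tab and res are illustrative, not from the original answer):

```r
# Same data as in the question, rebuilt here so the sketch is self-contained.
df <- data.frame(
  rank1 = c("A", "A", "C", "B", "A"),
  rank2 = c("B", "C", "A", "A", "B"),
  rank3 = c("C", "B", "B", "C", "D"),
  rank4 = c("D", "D", "E", "D", "C"),
  rank5 = c("E", "E", "D", "E", "E"),
  stringsAsFactors = FALSE
)

# unlist() stacks the columns; names(df)[col(df)] gives the column name
# that each stacked value came from, in the same column-major order.
tab <- table(item = unlist(df), rank = names(df)[col(df)])

# Turn the contingency table into a plain data frame with an item column.
res <- data.frame(item = rownames(tab),
                  as.data.frame.matrix(tab),
                  row.names = NULL)
res
```

Note that table sorts the column labels alphabetically, so with ten or more ranks (rank10 sorting before rank2) the integer index from col(df) may be the safer key.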
Or with mtabulate from qdapTools:
library(qdapTools)
t(mtabulate(df))
Data
df <- structure(list(rank1 = c("A", "A", "C", "B", "A"), rank2 = c("B",
"C", "A", "A", "B"), rank3 = c("C", "B", "B", "C", "D"), rank4 = c("D",
"D", "E", "D", "C"), rank5 = c("E", "E", "D", "E", "E")), .Names = c("rank1",
"rank2", "rank3", "rank4", "rank5"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
Answer 1 (score: 1)
A tidy approach
Here is one way to solve the problem with the tidyverse:
library(tidyr)
library(dplyr)
your_data <- tribble(~"rank1", ~"rank2", ~"rank3", ~"rank4", ~"rank5",
"A", "B", "C", "D", "E",
"A", "C", "B", "D", "E",
"C", "A", "B", "E", "D",
"B", "A", "C", "D", "E",
"A", "B", "D", "C", "E")
your_data %>%
gather(key = rank_number, value = rank) %>%
count(rank_number, rank) %>%
spread(key = rank_number, value = n, fill = 0)
#> # A tibble: 5 x 6
#> rank rank1 rank2 rank3 rank4 rank5
#> * <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 A 3. 2. 0. 0. 0.
#> 2 B 1. 2. 2. 0. 0.
#> 3 C 1. 1. 2. 1. 0.
#> 4 D 0. 0. 1. 3. 1.
#> 5 E 0. 0. 0. 1. 4.
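In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A rough equivalent of the pipeline above, assuming tidyr >= 1.0.0 is installed, could be sketched as follows (the column name item and the object names survey and res are illustrative choices):

```r
library(tidyr)
library(dplyr)

# Same data as in the question, rebuilt here so the sketch is self-contained.
survey <- data.frame(
  rank1 = c("A", "A", "C", "B", "A"),
  rank2 = c("B", "C", "A", "A", "B"),
  rank3 = c("C", "B", "B", "C", "D"),
  rank4 = c("D", "D", "E", "D", "C"),
  rank5 = c("E", "E", "D", "E", "E"),
  stringsAsFactors = FALSE
)

res <- survey %>%
  # Long format: one row per (rank column, ranked item) pair.
  pivot_longer(everything(), names_to = "rank_number", values_to = "item") %>%
  # Count how often each item appears at each rank.
  count(item, rank_number) %>%
  # Wide format again: one column per rank, missing combinations filled with 0.
  pivot_wider(names_from = rank_number, values_from = n,
              values_fill = list(n = 0))
res
```

Passing values_fill as a named list keeps the call compatible with tidyr 1.0.0, where a bare scalar was not yet accepted.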