I have a dataset that looks like this:
ID Q1 Q2 Q3
Person1 A C NA
Person2 B C D
Person3 A C A
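Essentially, it is a table of responses to a set of multiple-choice questions.
I have been trying to find a way to generate a response profile for each person in R, that is, the proportion of that person's answers that fall on each option.
The final output would look like: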
          A    B    C    D   NA
Person1 .33    0  .33    0  .33
Person2   0  .33  .33  .33    0
Person3 .66    0  .33    0    0
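I have tried the crosstab() function, as well as various ways of reshaping the data with dplyr and tidyr. I have also Googled every variant of "R frequency table" without much success.
Am I missing some really obvious approach?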
Answer 0 (score: 1)
Here is one way with the tidyverse:
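library(dplyr)
library(tidyr)

df %>%
  # reshape from wide to long: one row per person and question
  gather(var, value, -ID) %>%
  # label missing answers so they get their own column
  replace_na(list(value = "Missing")) %>%
  # count each answer per person
  count(ID, value) %>%
  group_by(ID) %>%
  # convert counts to within-person proportions
  mutate(prop = n / sum(n)) %>%
  select(-n) %>%
  # back to wide: one column per answer, 0 where an answer never occurs
  spread(value, prop, fill = 0)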
# A tibble: 3 x 6
# Groups: ID [3]
ID A B C D Missing
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Person1 0.333 0 0.333 0 0.333
2 Person2 0 0.333 0.333 0.333 0
3 Person3 0.667 0 0.333 0 0
Data:
df <- read.table(text = "ID Q1 Q2 Q3
Person1 A C NA
Person2 B C D
Person3 A C A", header = TRUE, sep = " ", stringsAsFactors = FALSE)
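As a side note, gather() and spread() are superseded in tidyr 1.0.0 and later by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same pipeline with the newer functions, assuming tidyr 1.1.0 or later (for names_sort and a scalar values_fill). The names question and profile are illustrative only and do not come from the answer above.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, built inline so the sketch is self-contained
df <- data.frame(
  ID = c("Person1", "Person2", "Person3"),
  Q1 = c("A", "B", "A"),
  Q2 = c("C", "C", "C"),
  Q3 = c(NA, "D", "A"),
  stringsAsFactors = FALSE
)

profile <- df %>%
  # wide to long: one row per person and question
  pivot_longer(-ID, names_to = "question", values_to = "value") %>%
  # label missing answers so they get their own column
  mutate(value = replace_na(value, "Missing")) %>%
  # count each answer per person
  count(ID, value) %>%
  group_by(ID) %>%
  # counts to within-person proportions
  mutate(prop = n / sum(n)) %>%
  ungroup() %>%
  select(-n) %>%
  # long to wide: one column per answer, sorted by name, 0 where absent
  pivot_wider(names_from = value, values_from = prop,
              values_fill = 0, names_sort = TRUE)

profile
```

Without names_sort = TRUE, pivot_wider() orders the new columns by first appearance rather than alphabetically, which is one visible difference from spread().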
Answer 1 (score: 0)
This is similar to Shree's answer, just with comments on each step:
library(tidyverse)
df <-
tibble(
ID = paste0("Person", 1:3),
Q1 = c("A", "B", "A"),
Q2 = rep("C", 3),
Q3 = c(NA, "D", "A")
)
df %>%
# this will flip the data from wide to long
# and create 2 new columns "var" and "letter"
# using all the columns not = ID
gather(key = var, value = letter, -ID) %>%
# count how many answers each person has in total
group_by(ID) %>%
mutate(total = n()) %>%
ungroup() %>%
# count rows per ID & letter (keeping total), creating a column "n";
# group_by() followed by summarise() would work as well
count(ID, letter, total) %>%
# do the math
mutate(pct = round(n/total, 2)) %>%
# keep just these 3 columns
select(ID, letter, pct) %>%
# the inverse of gather(). Will take the letter column to
# make new columns for each unique value and will put the
# pct values underneath them. Any NA will become a 0
spread(key = letter, value = pct, fill = 0)
# ID A B C D `<NA>`
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# Person1 0.33 0 0.33 0 0.33
# Person2 0 0.33 0.33 0.33 0
# Person3 0.67 0 0.33 0 0
Answer 2 (score: 0)
I would use melt first, then table + prop.table:
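# reshape to long format: one row per person and question
s <- reshape2::melt(df, id.vars = 'ID')
# turn missing answers into the literal string 'NA' so table() counts them
s[is.na(s)] <- 'NA'
# cross-tabulate ID by answer, then convert each row to proportions
prop.table(table(s$ID, as.character(s$value)), 1)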
A B C D NA
Person1 0.3333333 0.0000000 0.3333333 0.0000000 0.3333333
Person2 0.0000000 0.3333333 0.3333333 0.3333333 0.0000000
Person3 0.6666667 0.0000000 0.3333333 0.0000000 0.0000000
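The same idea can also be sketched in base R alone, without reshape2. This is only an illustration under the assumption that df has the layout shown in the question (an ID column followed by the answer columns); the names long and tab are hypothetical. Here useNA = "ifany" lets table() count missing answers directly, so they do not have to be replaced by a string first.

```r
# Same example data as in the question, built inline so the sketch is self-contained
df <- data.frame(
  ID = c("Person1", "Person2", "Person3"),
  Q1 = c("A", "B", "A"),
  Q2 = c("C", "C", "C"),
  Q3 = c(NA, "D", "A"),
  stringsAsFactors = FALSE
)

# Stack the answer columns into one vector (column by column) and
# repeat the ID vector once per answer column so the rows line up
long <- data.frame(
  ID    = rep(df$ID, times = ncol(df) - 1),
  value = unlist(df[-1], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Cross-tabulate ID by answer, keeping NA as its own column,
# then turn each row into proportions (margin = 1 means by row)
tab <- prop.table(table(long$ID, long$value, useNA = "ifany"), margin = 1)
tab
```

The result is a table object rather than a data frame; as.data.frame.matrix(tab) converts it if a data frame is needed downstream.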