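I have a Qualtrics multiple-choice question that I want to plot in R. My data are organized so that each participant can select more than one answer per question. For example, participant 1 selected answer options 1 (Q1_1) and 3 (Q1_3). I want to collapse all answer options into a single bar chart, with one bar per multiple-response option (Q1_1:Q1_3), where each bar is the number of selections divided by the number of respondents who answered the question (3 in this case).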
df <- structure(list(Participant = 1:3, A = c("a", "a", ""), B = c("", "b", "b"), C = c("c", "c", "c")), .Names = c("Participant", "Q1_1", "Q1_2", "Q1_3"), row.names = c(NA, -3L), class = "data.frame")
I would like to use ggplot2, perhaps with some kind of loop over Q1_1:Q1_3?
Answer 0 (score: 2)
Maybe this is what you want:
f <-
  structure(
    list(
      Participant = 1:3,
      A = c("a", "a", ""),
      B = c("", "b", "b"),
      C = c("c", "c", "c")),
    .Names = c("Participant", "Q1_1", "Q1_2", "Q1_3"),
    row.names = c(NA, -3L),
    class = "data.frame"
  )
library(tidyr)
library(dplyr)
library(ggplot2)
nparticipant <- nrow(f)
f %>%
  ## Reformat the data
  gather(question, response, starts_with("Q")) %>%
  filter(response != "") %>%
  ## calculate the height of the bars
  group_by(question) %>%
  summarise(score = length(response)/nparticipant) %>%
  ## Plot
  ggplot(aes(x=question, y=score)) +
  geom_bar(stat = "identity")
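As a quick cross-check of the bar heights computed above, the same proportions can be obtained in base R. This is only a minimal sketch, assuming (as in the sample data) that an empty string marks an option that was not selected; the name option_cols is introduced here for illustration.

```r
# Sample data from the question: "" marks an option that was not selected
f <- data.frame(
  Participant = 1:3,
  Q1_1 = c("a", "a", ""),
  Q1_2 = c("", "b", "b"),
  Q1_3 = c("c", "c", "c"),
  stringsAsFactors = FALSE
)

# Columns holding the answer options of question 1 (illustrative helper name)
option_cols <- grep("^Q1_", names(f), value = TRUE)

# Share of participants who selected each option:
# count the non-blank cells per column, then divide by the number of rows
score <- colSums(f[option_cols] != "") / nrow(f)
print(score)
```

The resulting named vector should match the score column produced by the pipeline, which makes it a cheap sanity check before plotting.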
Answer 1 (score: 0)
Here is a solution using ddply from the plyr package.
library(plyr)     # provides ddply()
library(ggplot2)
# I needed to increase the number of participants to ensure it works in every case
df = data.frame(Participant = 1:100,
                Q1_1 = sample(c("a", ""), 100, replace = TRUE, prob = c(1/2, 1/2)),
                Q1_2 = sample(c("b", ""), 100, replace = TRUE, prob = c(2/3, 1/3)),
                Q1_3 = sample(c("c", ""), 100, replace = TRUE, prob = c(1/3, 2/3)))
# Combine the selected options into one answer pattern per participant
df$answer = paste0(df$Q1_1, df$Q1_2, df$Q1_3)
# Share of participants giving each answer pattern
summ = ddply(df, c("answer"), summarize, freq = length(answer)/nrow(df))
## Re-ordering of the factor levels of summ$answer
summ$answer <- factor(summ$answer, levels=c("", "a", "b", "c", "ab", "ac", "bc", "abc"))
# Plot
ggplot(summ, aes(answer, freq, fill = answer)) + geom_bar(stat = "identity") + theme_bw()
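Note: if you have more columns belonging to other questions ("Q2_1", "Q2_2", ...), this may get more complicated. In that case, melting the data for each question may be a solution.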
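One way the per-question idea from the note could look is sketched below in base R. The Q2_1 and Q2_2 columns and their values are invented for illustration, and the code assumes every option column is named like Q<number>_<number>; it builds one long data frame with the frequency of each answer pattern within each question.

```r
# Hypothetical data: the Q2_* columns are made up for this sketch
df <- data.frame(
  Participant = 1:3,
  Q1_1 = c("a", "a", ""),
  Q1_2 = c("", "b", "b"),
  Q1_3 = c("c", "c", "c"),
  Q2_1 = c("x", "", ""),
  Q2_2 = c("y", "y", ""),
  stringsAsFactors = FALSE
)

# Assumed naming scheme: Q<question>_<option>
option_cols <- grep("^Q[0-9]+_[0-9]+$", names(df), value = TRUE)
questions <- unique(sub("_.*$", "", option_cols))

# For each question, paste its option columns into one answer pattern
# per participant and tabulate the share of each pattern
summ <- do.call(rbind, lapply(questions, function(q) {
  cols <- grep(paste0("^", q, "_"), names(df), value = TRUE)
  answer <- do.call(paste0, unname(as.list(df[cols])))
  tab <- table(answer)
  data.frame(question = q,
             answer = names(tab),
             freq = as.vector(tab) / nrow(df),
             stringsAsFactors = FALSE)
}))
print(summ)
```

The long result could then be passed to ggplot with a facet per question, for example facet_wrap(~ question, scales = "free_x").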
Answer 2 (score: 0)
I think you want something like this (proportions shown as a stacked bar chart):
Participant Q1_1 Q1_2 Q1_3
1 1 a c
2 2 a a c
3 3 c b c
4 4 b d
library(reshape2)  # provides melt()
library(ggplot2)
# ensure that all question columns have the same factor levels, ignore blanks
for (i in 2:4) {
  df[,i] <- factor(df[,i], levels = c(letters[1:4]))
}
# proportion of each choice within each question column
tdf <- as.data.frame(sapply(df[2:4], function(x) table(x)/sum(table(x))))
tdf$choice <- rownames(tdf)
# long format: one row per (choice, question) pair
tdf <- melt(tdf, id='choice')
ggplot(tdf, aes(variable, value, fill=choice)) +
  geom_bar(stat='identity') +
  xlab('Questions') +
  ylab('Proportion of Choice')
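The reason for fixing the factor levels first can be seen on a single column. The sketch below uses a made-up vector x: with a common set of levels, every column yields a table of the same length (so sapply can bind them into a matrix), and blank entries become NA and are left out of the proportions.

```r
# Hypothetical responses for one question column; "" is a blank answer
x <- c("a", "a", "c", "")

# Values outside the given levels (here the blank) become NA
fx <- factor(x, levels = letters[1:4])

# table() drops NA by default, so proportions are over non-blank answers only;
# levels that never occur ("b", "d") still appear with proportion 0
p <- prop.table(table(fx))
print(p)
```

prop.table(table(fx)) is equivalent to the table(x)/sum(table(x)) expression used in the answer.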