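I'm trying to create a stacked bar chart with multiple variables, but I'm running into two problems:
1) I can't seem to get the flipped y-axis to show percentages instead of counts.
2) I'd like to sort the variables (descending) by the percentage of "strongly agree" responses.
Here is an example of what I have so far: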
require(scales)
require(ggplot2)
require(reshape2)
# create data frame
my.df <- data.frame(replicate(10, sample(1:4, 200, rep=TRUE)))
my.df$id <- seq(1, 200, by = 1)
# melt
melted <- melt(my.df, id.vars="id")
# factors
melted$value <- factor(melted$value,
levels=c(1,2,3,4),
labels=c("strongly disagree",
"disagree",
"agree",
"strongly agree"))
# plot
ggplot(melted) +
geom_bar(aes(variable, fill=value, position="fill")) +
scale_fill_manual(name="Responses",
values=c("#EFF3FF", "#BDD7E7", "#6BAED6",
"#2171B5"),
breaks=c("strongly disagree",
"disagree",
"agree",
"strongly agree"),
labels=c("strongly disagree",
"disagree",
"agree",
"strongly agree")) +
labs(x="Items", y="Percentage (%)", title="my title") +
coord_flip()
I'd like to acknowledge a few people whose work helped me get this far. Here are a few of the many pages Google turned up:
http://www.r-bloggers.com/fumblings-with-ranked-likert-scale-data-in-r/
Create stacked barplot where each stack is scaled to sum to 100%
sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reshape2_1.2.2 ggplot2_0.9.2.1 scales_0.2.2
loaded via a namespace (and not attached):
[1] colorspace_1.2-0 dichromat_1.2-4 digest_0.6.0 grid_2.15.0 gtable_0.1.1 HH_2.3-23
[7] labeling_0.1 lattice_0.20-10 latticeExtra_0.6-24 MASS_7.3-22 memoise_0.1 munsell_0.4
[13] plyr_1.7.1 proto_0.3-9.2 RColorBrewer_1.0-5 rstudio_0.97.237 stringr_0.6.1 tools_2.15.0
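Answer 0 (score: 4)
Since you are working with Likert data, you may want to consider the likert() function in the HH package. (Hopefully it's OK to point you in a different direction, since there is already a good answer that addresses your original ggplot2 approach.)
As one might hope, likert() plots things in an appropriate way with very little struggle. positive.order=TRUE sorts the items by how far they extend in the positive direction. The ReferenceZero argument lets you center zero in the middle of a neutral item (not needed below, since this scale has no neutral level). And as.percent=TRUE converts the counts to percentages and lists the actual counts in the margin (unless we turn that off).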
library(reshape2)
library(HH)
# create data as before
my.df <- data.frame(replicate(10, sample(1:4, 200, rep=TRUE)))
my.df$id <- seq(1, 200, by = 1)
# melt() and dcast() with reshape2 package
melted <- melt(my.df, id.vars="id", na.rm=TRUE)
# note: length() is not robust if NAs are present
summd <- dcast(data=melted, variable~value, length)
# give names to cols and rows for likert() to use
names(summd) <- c("Question", "strongly disagree",
"disagree",
"agree",
"strongly agree")
rownames(summd) <- summd[,1] # question number as rowname
summd[,1] <- NULL
# plot
likert(summd,
       as.percent=TRUE,       # automatically scales
       main = NULL,           # or give "title",
       xlab = "Percent",      # label axis
       positive.order = TRUE, # orders by furthest right
       ReferenceZero = 2.5,   # zero point btwn levels 2&3
       ylab = "Question",     # label for left side
       auto.key = list(space = "right", columns = 1,
                       reverse = TRUE)) # make positive items on top of legend
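The count table that melt() and dcast() build above can also be put together in base R. The following is only a sketch, not part of the answer: the seed and the names lab, counts and pct are assumptions introduced for illustration. It also works out the per-item percentages by hand, which can be handy for checking the plot. Because table() drops NA values by default, this version does not rely on length().

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# same shape of data as in the question: 10 items, 200 respondents, answers 1..4
my.df <- data.frame(replicate(10, sample(1:4, 200, replace = TRUE)))
lab <- c("strongly disagree", "disagree", "agree", "strongly agree")

# one row per item, one column per response level (counts)
counts <- t(sapply(my.df, function(x) table(factor(x, levels = 1:4, labels = lab))))

# row-wise proportions scaled to percentages; each row sums to 100
pct <- prop.table(counts, margin = 1) * 100
print(round(pct, 1))
```

as.data.frame(counts) gives a data frame with the items as row names, which is the layout that summd has after the renaming step above.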
Answer 1 (score: 3)
For (1):
To get percentages, you have to build a new data.frame from melted. At least that's the only way I could think of.
# 200 is the total sum always. Using that to get the percentage
require(plyr)
df <- ddply(melted, .(variable, value), function(x) length(x$value)/200 * 100)
Then supply the computed percentages to geom_bar through the weight aesthetic, like so:
ggplot(df) +
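# position is not an aesthetic, so it is dropped from aes(); the default
# stacking of weight=V1 already adds up to 100 for each item
geom_bar(aes(variable, fill=value, weight=V1)) +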
scale_fill_manual(name="Responses",
values=c("#EFF3FF", "#BDD7E7", "#6BAED6",
"#2171B5"),
breaks=c("strongly disagree",
"disagree",
"agree",
"strongly agree"),
labels=c("strongly disagree",
"disagree",
"agree",
"strongly agree")) +
labs(x="Items", y="Percentage (%)", title="my title") +
coord_flip()
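An alternative that skips the pre-computed percentages is to pass position = "fill" as an argument of geom_bar() and let the scales package format the axis. This is a minimal sketch under stated assumptions rather than the answer's method: the data frame long, the seed and the item names X1..X10 are made up here to keep the block self-contained.

```r
library(ggplot2)
library(scales)

set.seed(1)  # assumed seed, only so the sketch is reproducible
responses <- c("strongly disagree", "disagree", "agree", "strongly agree")

# hypothetical long-format data: 10 items with 200 answers each
long <- data.frame(
  variable = rep(paste0("X", 1:10), each = 200),
  value    = factor(sample(responses, 2000, replace = TRUE), levels = responses)
)

p <- ggplot(long, aes(x = variable, fill = value)) +
  geom_bar(position = "fill") +           # argument of geom_bar(), not an aesthetic
  scale_y_continuous(labels = percent) +  # label the 0..1 proportions as 0%..100%
  labs(x = "Items", y = "Percentage (%)") +
  coord_flip()
print(p)
```

With position = "fill" the bar heights are proportions between 0 and 1, so the percent labels are purely cosmetic and the underlying counts stay untouched.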
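I don't quite understand (2). Do you want to (a) compute relative percentages (with "strongly agree" as the reference)? Or (b) do you want the plot to always show "strongly agree" first, then "agree", and so on? You can accomplish (b) simply by reordering the factor levels in df: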
df$value <- factor(df$value, levels=c("strongly agree", "agree", "disagree",
"strongly disagree"), ordered = TRUE)
Edit:
You can reorder the levels of variable and value into the order you want like this:
variable.order <- names(sort(daply(df, .(variable),
function(x) x$V1[x$value == "strongly agree"] ),
decreasing = TRUE))
value.order <- c("strongly agree", "agree", "disagree", "strongly disagree")
df$variable <- factor(df$variable, levels = variable.order, ordered = TRUE)
df$value <- factor(df$value, levels = value.order, ordered = TRUE)
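The same ordering idea can be sketched in base R without plyr. This is an illustration under assumptions, not the answer's code: long, sa and the seed are hypothetical names and values chosen so the block runs on its own.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
responses <- c("strongly disagree", "disagree", "agree", "strongly agree")

# hypothetical long-format data: 10 items with 200 answers each
long <- data.frame(
  variable = rep(paste0("X", 1:10), each = 200),
  value    = factor(sample(responses, 2000, replace = TRUE), levels = responses)
)

# percentage of "strongly agree" answers per item
sa <- tapply(long$value == "strongly agree", long$variable, mean) * 100

# item levels sorted from the highest to the lowest share
long$variable <- factor(long$variable,
                        levels = names(sort(sa, decreasing = TRUE)))
print(levels(long$variable))
```

Keep in mind that coord_flip() draws the first factor level at the bottom of the axis, so to get the highest share at the top of a flipped plot, reverse the levels (for example with decreasing = FALSE). The plot has to be redrawn after the levels are changed.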