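I have a plot in which I show the sample means of several binary variables in two different populations. It currently looks like this:
I restructured my data to build this plot, so the data and code for the plot look like this: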
head(cat)
  hd var valor
1  1 gen     1
2  1 gen     0
3  1 gen     0
4  1 gen     0
5  1 gen     0
6  1 gen     0
# This is my code
ggplot(cat, aes(y = valor, x = as.factor(var), group = hd)) +
  geom_bar(aes(fill = hd),
           stat = 'summary',
           fun.y = mean,
           position = 'dodge') +
  stat_summary(fun.data = mean_cl_normal,
               geom = 'errorbar',
               position = position_dodge(width = 0.85),
               width = 0.2) +
  scale_x_discrete(labels = c('abogado_pub' = 'Public Lawyer',
                              'codem' = 'Co-defendant',
                              'gen' = 'Gender',
                              'indem' = 'Severance Pay',
                              'reinst' = 'Reinstatement',
                              'sarimssinf' = 'Social Security',
                              'trabajador_base' = 'At-will worker')) +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(y = 'Percent', x = 'Variable') +
  scale_fill_manual(values = c('gray77', 'gray53'),
                    name = '',
                    labels = c('Pilot Data', 'Historic Data')) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
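For each pair of bars, I would like to add stars representing the significance level of a two-sided test for the difference in means. I tried several solutions like this one, but the annotations never showed up in the plot. My guess is that combining a stat = 'summary' layer with a stat = 'identity' layer is missing something, but I can't figure out what it is. I am also looking at this solution, but I don't know whether something like that is possible for my problem, because my annotations imply dropping one grouping level.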
Some session info:
R version 3.4.0 (2017-04-21)
ggplot2_2.2.1
ggsignif_0.3.0
Thanks!
********************* EDIT *********************
A reproducible example that generates a sample of the dataset:
set.seed(140692)
cat <- data.frame(hd = sample(c(1, 0), 70, replace = TRUE),
                  var = rep(c('abogado_pub', 'codem', 'gen', 'indem', 'reinst', 'sarimssinf', 'trabajador_base'), 20),
                  valor = sample(c(1, 0), 70, replace = TRUE))
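To make the annotation I am after concrete, here is a minimal sketch of the star labels computed in base R from the sample data above, one label per variable. The helpers p_diff and p_to_stars are hypothetical names introduced only for this illustration. Using a Welch t-test and the cutoffs 0.001 / 0.01 / 0.05 are assumptions as well, and a different two-sided test or different thresholds may be more appropriate.

```r
# Sample data, same call as in the reproducible example
set.seed(140692)
cat <- data.frame(hd = sample(c(1, 0), 70, replace = TRUE),
                  var = rep(c('abogado_pub', 'codem', 'gen', 'indem', 'reinst',
                              'sarimssinf', 'trabajador_base'), 20),
                  valor = sample(c(1, 0), 70, replace = TRUE))

# Hypothetical helper: p-value of a two-sided Welch t-test for the
# difference in means of valor between the two hd groups.
# Returns NA when the test cannot be computed (e.g. only one group present).
p_diff <- function(d) {
  tryCatch(t.test(valor ~ hd, data = d)$p.value,
           error = function(e) NA_real_)
}

# One p-value per variable
p_values <- sapply(split(cat, cat$var), p_diff)

# Hypothetical helper: map p-values to stars (cutoffs are an assumption)
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 1),
                   labels = c('***', '**', '*', ''),
                   include.lowest = TRUE))
}

# One row per variable: this is the data the annotation layer would need
stars <- data.frame(var = names(p_values),
                    p = unname(p_values),
                    label = p_to_stars(p_values),
                    stringsAsFactors = FALSE)
print(stars)
```

The result has one row per variable and no hd column, which is what I mean by the annotations dropping one grouping level. Presumably the layer that draws these labels would need its own data and its own aesthetics rather than the ones inherited from the ggplot() call.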