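I want to draw boxplots of the relationship between a continuous variable and a categorical variable (geom_boxplot with ggplot2), for several cases (facet_wrap). That part is easy: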
library(ggplot2)

data("CO2")
ggplot(CO2, aes(Treatment, uptake)) +
  geom_boxplot(col = "black", fill = "white", alpha = 0, width = .5) +
  geom_point(col = "black", size = 1.2) +
  facet_wrap(~Type, ncol = 3, nrow = 6, scales = "free_y") +
  theme_bw() +
  ylab("Uptake")
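This works nicely with the toy dataset, but when applied to my own data (where facet_wrap lets me draw 18 different plots) the y-axes are barely readable, because the number of y ticks and the spacing between them differ from panel to panel:
What would be a good way to harmonize the y-axes? (That is, get equal spacing between y-axis ticks whatever the breaks are. The break positions will necessarily change from one plot to the next, since the range of the continuous variable varies a lot.)
Thanks a lot for any help :)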
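Answer 0: (score: 4)
You can force the limits of each facet to look relatively pretty by applying pretty() to the y-axis values within each facet and taking the first and last values it returns. Adding those values to the plot manually expands each facet's range to round limits.
Here is an example using the diamonds dataset: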
library(ggplot2)
library(dplyr)  # for %>%, filter(), group_by(), summarise()

# normal facet_wrap plot with many different y-axis scales across facets
p <- ggplot(diamonds %>% filter(cut %in% c("Fair", "Ideal")),
            aes(x = cut, y = carat)) +
  geom_boxplot(col = "black", fill = "white", alpha = 0, width = .5) +
  geom_point(col = "black", size = 1.2) +
  facet_wrap(~clarity, scales = "free_y", nrow = 2) +
  theme_bw() +
  ylab("Uptake")
p
# modified plot with consistent label placements
p +
  # Manually create values to expand the scale, by finding "pretty"
  # values that are slightly larger than the range of y-axis values
  # within each facet; set alpha = 0 since they aren't meant to be seen
  geom_point(data = . %>%
               group_by(clarity) %>% # group by facet variable
               summarise(y.min = pretty(carat)[1],
                         y.max = pretty(carat)[length(pretty(carat))]) %>%
               tidyr::gather(key, value, -clarity),
             aes(x = 1, y = value),
             inherit.aes = FALSE, alpha = 0) +
  # Turn off automatic scale expansion, & manually set scale breaks
  # as an evenly spaced sequence (with the "pretty" values created above
  # providing the limits for each facet). If there are many facets to
  # show, I recommend no more than 3 labels in each facet, to keep things
  # simple.
  scale_y_continuous(breaks = function(x) seq(from = x[1],
                                              to = x[2],
                                              length.out = 3),
                     expand = c(0, 0))
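To see what the pretty() step contributes on its own, here is a minimal base-R sketch. The vector y is made up for illustration and stands in for the y values of a single facet; it is not taken from the diamonds data.

```r
# Hypothetical values standing in for one facet's y data
y <- c(0.23, 0.31, 0.9, 1.52, 2.01)

# pretty() returns round breakpoints that cover the range of y
pretty_vals <- pretty(y)

# The first and last "pretty" values become the facet's limits
limits <- c(pretty_vals[1], pretty_vals[length(pretty_vals)])

# Three evenly spaced breaks between those limits, mirroring the
# breaks function passed to scale_y_continuous() above
breaks <- seq(from = limits[1], to = limits[2], length.out = 3)

print(limits)
print(breaks)
```

Because the limits are round numbers, the middle break tends to be readable as well, which is why the answer expands each facet to pretty() limits before spacing the breaks evenly.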
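Answer 1: (score: 0)
Just remove scales = "free_y" from the facet_wrap call and you will get what you need.
However, as MrFlick rightly pointed out in the comments, even spacing will definitely lead to some crazy, odd-looking numbers on the axes.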
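A small base-R sketch of the trade-off MrFlick describes, using a made-up y range (the numbers are an assumption for illustration only): evenly spaced breaks over a raw data range are equally spaced but rarely round, while pretty() gives round labels whose count and spacing depend on the range.

```r
# Hypothetical y range for one facet
y_range <- c(0.23, 2.01)

# Evenly spaced breaks over the raw range: equal spacing, awkward labels
even_breaks <- seq(from = y_range[1], to = y_range[2], length.out = 4)
print(even_breaks)

# Breaks chosen by pretty(): round labels, but their number and
# spacing depend on the range, so they differ between facets
pretty_breaks <- pretty(y_range)
print(pretty_breaks)
```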