geom_vline vertical line on an x-axis with categorical data: ggplot2

Asked: 2016-10-03 17:04:19

Tags: r ggplot2 categorical-data

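I have data sorted into classes, as described in this post: https://www.r-bloggers.com/from-continuous-to-categorical/ This makes it easier to see which values are common. After creating the classes, I want to draw a bar chart of the frequency of each class, which I do with the following example code: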

library(ggplot2)

set.seed(1)
# mean(4, sd=2) evaluates to 4, so rnorm() uses mean 4 and the default sd = 1
df.v <- data.frame(val = rnorm(1000, mean(4, sd=2)))
df.v$val.clss <- cut(df.v$val, seq(min(df.v$val), max(df.v$val), 1))
p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss))
plot(p1)

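What I cannot figure out is how to add a vertical line exactly between the two bars around 4, so that the line sits exactly at that x-axis value. I found this post, but it did not help me: "How to get a vertical geom_vline to an x-axis of class date?" Any help is appreciated. Perhaps I am too new to adapt that solution to my data.frame; if so, please forgive the question.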

2 Answers:

Answer 0: (score: 3)

Do you want something like this?

p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss)) + geom_vline(xintercept = 3.5, col='red', lwd=2)
plot(p1)
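
The value 3.5 works because ggplot2 places the levels of a factor at the integer positions 1, 2, 3, ... on a discrete x-axis, so the gap between bar i and bar i + 1 sits at i + 0.5. A minimal base-R sketch of that mapping, using a small made-up factor (the name `f` and its values are illustrative only, not the question's data):

```r
# A small illustrative factor created with cut(); not the question's data.
f <- cut(c(0.5, 1.5, 2.5, 3.5), breaks = 0:4)

levels(f)        # "(0,1]" "(1,2]" "(2,3]" "(3,4]"
as.integer(f)    # 1 2 3 4: the positions a discrete x-axis would use

# Position of the boundary between "(2,3]" and "(3,4]"
i <- which(levels(f) == "(2,3]")
boundary <- i + 0.5
boundary         # 3.5
```

Passing such a number as `xintercept` to geom_vline() puts the line in the gap to the right of bar i, provided no empty levels have been dropped from the axis.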

A more general solution might look like this:

df.v <- data.frame(val = rnorm(1000, mean=15, sd=4))
df.v$val.clss <- cut(df.v$val, seq(min(df.v$val), max(df.v$val), 1))

lvls <- levels(df.v$val.clss)
lvls
[1] "(2.97,3.97]" "(3.97,4.97]" "(4.97,5.97]" "(5.97,6.97]" "(6.97,7.97]" "(7.97,8.97]" "(8.97,9.97]" "(9.97,11]"   "(11,12]"     "(12,13]"    
[11] "(13,14]"     "(14,15]"     "(15,16]"     "(16,17]"     "(17,18]"     "(18,19]"     "(19,20]"     "(20,21]"     "(21,22]"     "(22,23]"    
[21] "(23,24]"     "(24,25]"     "(25,26]"     "(26,27]"     "(27,28]"     "(28,29]"     "(29,30]"    

vline.level <- '(18,19]' # you want to draw line here, right before 18

p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss)) + geom_vline(xintercept = which(lvls == vline.level) - 0.5, col='red', lwd=2) +
  theme(axis.text.x = element_text(angle=90, vjust = 0.5))
plot(p1)
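
If the goal is to place the line at an arbitrary value on the original numeric scale rather than at a named level, the same idea can be wrapped in a small function. The following is only a sketch: `value_to_x` is a hypothetical helper (not part of ggplot2), `d` and `brks` are names made up for this example, and it assumes equal handling of all bins, that every bin is drawn (no dropped levels), and that the value lies within the range of the breaks.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): convert a value on the original
# numeric scale to its position on the discrete axis built from `breaks`.
value_to_x <- function(value, breaks) {
  stopifnot(value >= min(breaks), value <= max(breaks))
  i <- findInterval(value, breaks, rightmost.closed = TRUE)  # index of the bin
  # Bar i is centred at x = i and covers i - 0.5 to i + 0.5
  i - 0.5 + (value - breaks[i]) / (breaks[i + 1] - breaks[i])
}

set.seed(1)
d <- data.frame(val = rnorm(1000, mean = 15, sd = 4))
brks <- seq(floor(min(d$val)), ceiling(max(d$val)), by = 1)
d$val.clss <- cut(d$val, brks, include.lowest = TRUE)

ggplot(d) +
  geom_bar(aes(val.clss)) +
  scale_x_discrete(drop = FALSE) +   # keep empty bins so positions stay valid
  geom_vline(xintercept = value_to_x(18, brks), col = "red") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
```

Rounding the breaks to whole numbers with floor() and ceiling() is a design choice of this sketch; it keeps every observation inside a bin and makes the bin edges coincide with values such as 18.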

If you want to pick the middlemost level instead:

length(lvls)
#[1] 27
# choose the middlemost level, since length(lvls) is odd in this case, the midpoint will be ceiling(length(lvls)/2)
vline.level <- lvls[ceiling(length(lvls)/2)] 

p1 <- ggplot(data = df.v)+
  geom_bar(aes(val.clss)) + geom_vline(xintercept = which(lvls == vline.level) - 0.5, col='red', lwd=2) +
  theme(axis.text.x = element_text(angle=90, vjust = 0.5))
plot(p1)

Answer 1: (score: 3)

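If you know the labels of the two bars that you want the line to fall between, you can convert their positions to numbers (the numbers they are mapped to on the axis) and pass the result along: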

myLoc <- 
  (which(levels(df.v$val.clss) == "(2.99,3.99]") +
     which(levels(df.v$val.clss) == "(3.99,4.99]")) / 
  2


p1 +
  geom_vline(aes(xintercept = myLoc))

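If some groups are being skipped, you should make sure that all levels of the factor are plotted. When you are binning continuous data, it is best not to drop intermediate levels anyway.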

p1 +
  geom_vline(aes(xintercept = myLoc)) +
  scale_x_discrete(drop = FALSE)

Alternatively, you can drop the missing levels from the data altogether (before plotting and before computing myLoc):

df.v <- droplevels(df.v)

The factor will then contain only the levels that are actually plotted.
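
To see why dropped levels matter: by default a discrete scale omits factor levels that have no observations, so an index computed from levels() can point one bar too far to the right. A small base-R sketch with made-up data (`f` and `f2` are illustrative names):

```r
# Illustrative factor in which the bin "(1,2]" has no observations.
f <- cut(c(0.5, 2.5, 3.5), breaks = 0:4)

levels(f)                       # "(0,1]" "(1,2]" "(2,3]" "(3,4]"
which(levels(f) == "(2,3]")     # 3: correct only if empty levels are drawn

# After dropping unused levels, the index matches what is actually plotted
f2 <- droplevels(f)
levels(f2)                      # "(0,1]" "(2,3]" "(3,4]"
which(levels(f2) == "(2,3]")    # 2
```

Either approach works as long as the positions are computed from the same set of levels that ends up on the axis.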

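As a last option, you can use geom_histogram, which does the binning automatically but keeps the data on its original scale, which makes it much easier to add a line.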

ggplot(df.v
       , aes(val)) +
  geom_histogram(binwidth = 1) +
  geom_vline(xintercept = 4)
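
One detail worth checking with this approach: with the default bin alignment and binwidth = 1, a line at 4 may run through the middle of a bar rather than between two bars. A sketch that instead aligns the bin edges to whole numbers with the `boundary` argument of geom_histogram(), on freshly simulated data (the object names `d` and `p_hist` are assumptions made for this example):

```r
library(ggplot2)

set.seed(1)
d <- data.frame(val = rnorm(1000, mean = 4, sd = 2))

p_hist <- ggplot(d, aes(val)) +
  # boundary = 4 asks for a bin edge at 4, so edges fall on whole numbers
  geom_histogram(binwidth = 1, boundary = 4) +
  geom_vline(xintercept = 4, col = "red")

p_hist
```

The computed bin edges can be inspected with `ggplot_build(p_hist)$data[[1]]`, whose `xmin` and `xmax` columns hold the left and right edge of each bar.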