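I have data sorted into classes, as described in this post: https://www.r-bloggers.com/from-continuous-to-categorical/ This makes it easier to see which values are common. After creating the classes, I want to draw a bar chart of the frequency of each class, which I do with the following example code: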
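library(ggplot2)

set.seed(1)
# mean(4, sd=2) evaluates to 4, so rnorm() draws with mean 4 and its default sd of 1
df.v <- data.frame(val = rnorm(1000, mean(4, sd=2)))
df.v$val.clss <- cut(df.v$val, seq(min(df.v$val), max(df.v$val), 1))
p1 <- ggplot(data = df.v) +
  geom_bar(aes(val.clss))
plot(p1)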
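What I cannot figure out is how to add a vertical line exactly between the two bars around 4, so that the line sits precisely at that x-axis value. I found this post, but it did not help me: How to get a vertical geom_vline to an x-axis of class date? Any help is appreciated. Maybe I am too new to adapt that solution to my data.frame; if so, please forgive the question.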
Answer 0: (score: 3)
Do you want something like this?
p1 <- ggplot(data = df.v)+
geom_bar(aes(val.clss)) + geom_vline(xintercept = 3.5, col='red', lwd=2)
plot(p1)
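The value 3.5 works because ggplot2 draws the levels of a factor at the integer positions 1, 2, 3, ... on a discrete axis, so the boundary between bar k and bar k + 1 sits at k + 0.5. Below is a minimal sketch that derives the position from the break nearest to 4 instead of hard-coding it. It assumes the same breaks as in the question and that no empty levels are dropped from the axis; `target`, `k` and `xpos` are names made up for this illustration.

```r
library(ggplot2)

set.seed(1)
df.v <- data.frame(val = rnorm(1000, 4))
breaks <- seq(min(df.v$val), max(df.v$val), 1)
df.v$val.clss <- cut(df.v$val, breaks)

target <- 4
# breaks[k] is the left edge of level k, and level k is drawn at x = k,
# so that edge sits at k - 0.5 on the discrete axis
k <- which.min(abs(breaks - target))
xpos <- k - 0.5

p1 <- ggplot(data = df.v) +
  geom_bar(aes(val.clss)) +
  geom_vline(xintercept = xpos, col = 'red', lwd = 2)
```

The line lands on the bin edge closest to `target`, which is only approximately 4 because the breaks start at the minimum of the data.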
A more general solution could look like this:
df.v <- data.frame(val = rnorm(1000, mean=15, sd=4))
df.v$val.clss <- cut(df.v$val, seq(min(df.v$val), max(df.v$val), 1))
lvls <- levels(df.v$val.clss)
lvls
[1] "(2.97,3.97]" "(3.97,4.97]" "(4.97,5.97]" "(5.97,6.97]" "(6.97,7.97]" "(7.97,8.97]" "(8.97,9.97]" "(9.97,11]" "(11,12]" "(12,13]"
[11] "(13,14]" "(14,15]" "(15,16]" "(16,17]" "(17,18]" "(18,19]" "(19,20]" "(20,21]" "(21,22]" "(22,23]"
[21] "(23,24]" "(24,25]" "(25,26]" "(26,27]" "(27,28]" "(28,29]" "(29,30]"
vline.level <- '(18,19]' # you want to draw line here, right before 18
p1 <- ggplot(data = df.v) +
  geom_bar(aes(val.clss)) +
  geom_vline(xintercept = which(lvls == vline.level) - 0.5, col='red', lwd=2) +
  theme(axis.text.x = element_text(angle=90, vjust = 0.5))
plot(p1)
If you want to pick the middlemost level instead:
length(lvls)
#[1] 27
# choose the middlemost level, since length(lvls) is odd in this case, the midpoint will be ceiling(length(lvls)/2)
vline.level <- lvls[ceiling(length(lvls)/2)]
p1 <- ggplot(data = df.v)+
geom_bar(aes(val.clss)) + geom_vline(xintercept = which(lvls == vline.level) - 0.5, col='red', lwd=2) +
theme(axis.text.x = element_text(angle=90, vjust = 0.5))
plot(p1)
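The lookup used in both variants can be wrapped in a small helper. This is only a sketch: `vline_before()` is a hypothetical name, not a ggplot2 function, the levels below are a shortened made-up example, and the result is only valid when every level of the factor is drawn on the axis.

```r
# Hypothetical helper: x position of the boundary just left of `level`
vline_before <- function(lvls, level) {
  idx <- match(level, lvls)  # integer position of the level, NA if it is absent
  if (is.na(idx)) stop("level not found: ", level)
  idx - 0.5
}

lvls <- c("(15,16]", "(16,17]", "(17,18]", "(18,19]", "(19,20]")

pos.named  <- vline_before(lvls, "(18,19]")                       # left edge of the 4th bar
pos.middle <- vline_before(lvls, lvls[ceiling(length(lvls) / 2)])  # left edge of the middle bar
```

Failing loudly on an unknown label avoids the silent case where `which()` returns an empty vector and the line is simply not drawn where intended.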
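Answer 1: (score: 3)
If you know the labels of the two bars you want the line to fall between, you can convert their positions to numbers (the integers the levels map to), average them, and pass the result: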
myLoc <-
(which(levels(df.v$val.clss) == "(2.99,3.99]") +
which(levels(df.v$val.clss) == "(3.99,4.99]")) /
2
p1 +
geom_vline(aes(xintercept = myLoc))
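Typing the labels by hand is fragile, because `cut()` rounds them for display (compare "(2.99,3.99]" here with "(2.97,3.97]" in the other answer's data), and a label that does not match makes `which()` return an empty vector. A sketch, assuming the vector of breaks passed to `cut()` is still available as `breaks` (a name introduced here for illustration), that finds the two bins from a value inside each one:

```r
set.seed(1)
val <- rnorm(1000, 4)
breaks <- seq(min(val), max(val), 1)
val.clss <- cut(val, breaks)

# Integer codes of the bins that contain 3.5 and 4.5, found with the same breaks
left  <- as.integer(cut(3.5, breaks))
right <- as.integer(cut(4.5, breaks))

# Midpoint between the two neighbouring bars on the discrete axis;
# it can be passed to geom_vline(xintercept = myLoc) as before
myLoc <- (left + right) / 2
```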
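Note that if some groups are skipped (empty), you should make sure that all levels of the factor are plotted. When you have binned continuous data, it is best not to drop the intermediate levels.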
p1 +
geom_vline(aes(xintercept = myLoc)) +
scale_x_discrete(drop = FALSE)
Alternatively, you can drop the unused levels from the data altogether (before plotting and before computing myLoc):
df.v <- droplevels(df.v)
The factor will then contain only the levels that are actually plotted.
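Both options guard against the same mismatch: by default a discrete ggplot2 axis leaves out unused levels, so a position computed from `levels()` can point at the wrong bar. A small base R sketch with made-up data (`x`, `f`, `pos.all` and `pos.used` are illustrative names) shows how the position of one level shifts:

```r
x <- c(1.5, 1.7, 3.2, 3.4)   # nothing falls in (0,1] or (2,3]
f <- cut(x, breaks = 0:4)     # levels: (0,1] (1,2] (2,3] (3,4]

# Position of "(3,4]" when all levels are drawn (scale_x_discrete(drop = FALSE))
pos.all  <- which(levels(f) == "(3,4]")

# Position of "(3,4]" after the empty levels are removed with droplevels()
pos.used <- which(levels(droplevels(f)) == "(3,4]")
```

Whichever option you choose, compute the line position from the same set of levels that ends up on the axis.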
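As a last option, you can use geom_histogram, which does the binning automatically but keeps the data on its original scale, which makes it much easier to add a line: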
ggplot(df.v, aes(val)) +
  geom_histogram(binwidth = 1) +
  geom_vline(xintercept = 4)
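One detail to watch with this option: when only `binwidth` is given, ggplot2 centres the bins on multiples of the bin width, so with `binwidth = 1` a line at 4 runs through the middle of a bar rather than between two bars. A sketch, assuming the line should sit on a bin edge, that uses the `boundary` argument to place the edges on whole numbers (`p2` and `bins` are names chosen for this example):

```r
library(ggplot2)

set.seed(1)
df.v <- data.frame(val = rnorm(1000, 4))

p2 <- ggplot(df.v, aes(val)) +
  geom_histogram(binwidth = 1, boundary = 0) +  # bin edges fall on whole numbers
  geom_vline(xintercept = 4, col = 'red')

# Inspect the computed bins; xmin and xmax hold the edges of each bar
bins <- layer_data(p2, 1)
```

Looking at `bins$xmin` is a quick way to confirm where the edges ended up before relying on the line position.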