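Separating lines with ggplot (geom_line) when using stat = 'count'

Date: 2017-09-18 18:51:08

Tags: r plot ggplot2

I currently have some data that is basically factors and dates. Here is a simplified version of it.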

date <- c(1901,1901,1901,1902,1902,1902,1901,1903,1902,1904,1902,1903,1903,1904,1905,
          1901,1903,1902,1904,1902,1902,1903,1904,1902,1902,1901,1903,1903,1904,1905,
          1905,1906,1907,1908,1901,1908,1907,1905,1906,1902,1903,1903,1903,1904,1905,
          1901,1901,1901,1902,1902,1902,1901,1903,1902,1904,1902,1903,1903,1904,1905,
          1901,1903,1902,1904,1902,1902,1903,1904,1902,1902,1901,1903,1903,1904,1905,
          1905,1906,1907,1908,1901,1908,1907,1905,1906,1902,1903,1903,1903,1904,1905,
          1905,1906,1907,1908,1901,1908,1907,1920,1920,1920,1921,1921,1921,1921,1921)

genre <- sample(c("fiction","nonfiction"),105,replace=TRUE)
data <- as.data.frame(cbind(date,genre))
# I know this is not an ideal way to coerce to a numeric 
data$date <- as.numeric(as.character(data$date))

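So far, so good. However, as you will notice if you plot it, there is a large gap in the data (nothing between 1908 and 1920) that the line obscures. This plot illustrates the problem.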

library(ggplot2)
ggplot(data,aes(x=date,color=genre)) + geom_line(stat='count')

Example Plot 1.
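
To see why the line runs straight across the gap, it can help to look at the tally itself. The following is a minimal sketch in base R, assuming a small hypothetical `date` vector (not the 105 values above) and using `table()` as a rough stand-in for what `stat = 'count'` computes: one count per distinct x value that actually occurs.

```r
# Hypothetical miniature of the data: years in 1901-1908 and 1920-1921,
# with nothing in between.
date <- c(1901, 1901, 1902, 1903, 1905, 1908, 1908, 1920, 1920, 1921)

# Tally rows per distinct year, roughly what stat = 'count' does per x value.
counts <- as.data.frame(table(date), stringsAsFactors = FALSE)
counts$date <- as.numeric(counts$date)
names(counts)[2] <- "n"

# Years without any rows never appear in the tally, so there is no zero
# count to pull the line down: the point for 1908 is followed by 1920.
missing_years <- setdiff(seq(min(date), max(date)), counts$date)

print(counts)
print(missing_years)
```

Because geom_line simply connects consecutive points along x, the absent years are bridged by a single segment rather than shown as a break.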

I saw this post suggesting that I add a group, which I can do.

data$group <- ifelse(data$date < 1910,1,2)
ggplot(data,aes(x=date,color=genre,group=group)) + geom_line(stat='count')

Example Plot 2

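However, there seems to be no way to keep the color aesthetic I want in the output when I specify group together with stat='count'. For example, this plot shows the gap in the data nicely, but it loses the color/distinction based on the genre factor: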

ggplot(data,aes(x=date,color=genre,group=group)) + geom_line(stat='count')

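So, is this not possible? Am I missing something? Is there a better way to do this, or do I need to summarize or otherwise transform my date data so that I don't rely on stat='count' at the plotting stage?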

1 Answer:

Answer 0: (score: 3)

You can combine "genre" and "group" and use the result as the group variable. Here I do this with the interaction function.

ggplot(data,aes(x = date, color = genre, group = interaction(genre, group))) + 
     geom_line(stat = 'count')
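
If you would rather not rely on stat = 'count' at the plotting stage, the same grouping idea should also work on counts computed beforehand. This is only a sketch, assuming a small hypothetical data frame rather than the original 105 rows; the column name `n` and the object names `counts` and `p` are made up for illustration.

```r
library(ggplot2)

# Hypothetical miniature of the question's data (not the original 105 rows).
data <- data.frame(
  date  = c(1901, 1901, 1902, 1902, 1903, 1903, 1904, 1905,
            1920, 1920, 1921, 1921, 1921),
  genre = c("fiction", "nonfiction", "fiction", "fiction", "nonfiction",
            "fiction", "nonfiction", "fiction",
            "fiction", "nonfiction", "fiction", "nonfiction", "nonfiction")
)

# Tally rows per (date, genre) pair up front instead of using stat = 'count'.
counts <- as.data.frame(table(date = data$date, genre = data$genre),
                        responseName = "n", stringsAsFactors = FALSE)
counts$date <- as.numeric(counts$date)

# table() also lists combinations that never occur; drop them so that,
# as with stat = 'count', only observed years get a point.
counts <- counts[counts$n > 0, ]

# Same period split as in the question, then one line per genre and period.
counts$group <- ifelse(counts$date < 1910, 1, 2)
p <- ggplot(counts, aes(x = date, y = n, color = genre,
                        group = interaction(genre, group))) +
  geom_line()
```

Printing `p` draws the plot. Each combination of genre and period becomes its own line, so the color still follows genre while the lines break at the gap.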
