Stop geom_density_ridges from showing non-existent tail values

Date: 2018-04-18 18:03:41

Tags: r ggplot2 ggridges

When I use geom_density_ridges, the plot often ends up showing long tails of values that don't exist in the data.

Here's an example:

library(tidyverse)
library(ggridges)

data("lincoln_weather")

# Remove all negative values for "Minimum Temperature"
d <- lincoln_weather[lincoln_weather$`Min Temperature [F]`>=0,]

ggplot(d, aes(`Min Temperature [F]`, Month)) +
  geom_density_ridges(rel_min_height=.01)

[Image: ridgeline plot]

As you can see, January, February, and December all show negative temperatures, but there are no negative values in the data at all.

Of course, I can add limits to the x-axis, but that doesn't solve the problem because it just truncates the existing erroneous density.

ggplot(d, aes(`Min Temperature [F]`, Month)) +
  geom_density_ridges(rel_min_height=.01) +
  xlim(0,80)

[Image: ridgeline plot with axis limits]

Now the plot makes it look like there are days at 0 degrees in January and February (there are none). It also makes it look like 0 degrees happened often in December, when in reality there was only 1 such day.

How can I fix this?

2 answers:

Answer 0 (score: 2)

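Well, it turns out I should have read the documentation more carefully. The key part is:

"The ggridges package provides two main geoms, geom_ridgeline and geom_density_ridges. The former takes height values directly to draw ridgelines, and the latter first estimates data densities and then draws those using ridgelines."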
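To illustrate why an estimated density can show values that are not in the data, here is a minimal sketch using base R's density() on simulated data. It is an assumption that this mirrors the behaviour seen in the plot; it does not reproduce the ggridges internals, and the variable names are hypothetical. By default, density() evaluates the estimate on a grid that extends cut = 3 bandwidths beyond the extremes of the data, so the curve has tails outside the observed range.

```r
# Illustration only: simulated data, not lincoln_weather.
set.seed(42)
x <- runif(200, min = 0, max = 30)  # hypothetical non-negative "temperatures"

# Default: the evaluation grid extends cut = 3 bandwidths past the data.
dens_default <- density(x)

# Restricting the grid to the observed range trims the tails.
dens_trimmed <- density(x, from = min(x), to = max(x))

range(x)               # observed data range
range(dens_default$x)  # wider than the data on both sides
range(dens_trimmed$x)  # same as the data range
```

The smoothing itself is unchanged in the second call; only the range over which the curve is evaluated is restricted.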

There are multiple ways to work around this. Here is one:

# Bin the data (1-degree bins) instead of estimating a smooth density
ggplot(d, aes(`Min Temperature [F]`, Month, height=..density..)) +
  geom_density_ridges(stat = "binline", binwidth=1,
                      draw_baseline = FALSE)


Answer 1 (score: 2)

One option is to use stat_density() instead of stat_density_ridges(). There are some things that stat_density() can't do that stat_density_ridges() can, such as drawing vertical lines or overlaying points, but on the other hand it can do some things that stat_density_ridges() can't, such as trimming the distributions to the data range.

# Remove all negative values for "Minimum Temperature"
d <- lincoln_weather[lincoln_weather$`Min Temperature [F]`>=0,]

ggplot(d, aes(`Min Temperature [F]`, Month, group = Month, height = ..density..)) +
  geom_density_ridges(stat = "density", trim = TRUE)

As an alternative, you could draw a point rug; maybe that serves your purpose as well:

ggplot(d, aes(`Min Temperature [F]`, Month)) +
  geom_density_ridges(rel_min_height = 0.01, jittered_points = TRUE,
                      position = position_points_jitter(width = 0.5, height = 0),
                      point_shape = "|", point_size = 2,
                      alpha = 0.7)

Note: These two approaches cannot currently be combined; that would require some modifications to the stat code.