Missing group labels in ggplot

Time: 2018-02-21 20:54:00

Tags: r ggplot2 label

I want to use stat_summaryh with ggplot to add one label per group rather than one label per point. My data looks like this:

dat <- read.table(text = "   id2 small_oe xinterceptm startidm endidm medium_oe medium_region
1    1       NA           1        1      1        NA          <NA>
                  2    2     1.66          NA        1      4      1.36        FL-M-4
                  3    3     1.21          NA        1      4      1.36        FL-M-4
                  4    4       NA           4        4      4        NA          <NA>
                  5    5     1.34          NA        4      7      1.17        FL-M-5
                  6    6     0.97          NA        4      7      1.17        FL-M-5
                  7    7       NA           7        7      7        NA          <NA>
                  8    8     1.21          NA        7     10      1.19       FL-M-14
                  9    9     0.91          NA        7     10      1.19       FL-M-14
                  10  10       NA          10       10     10        NA          <NA>
                  11  11     1.34          NA       10     13      1.17       FL-M-13
                  12  12     0.96          NA       10     13      1.17       FL-M-13
                  13  13       NA          13       13     13        NA          <NA>
                  14  14     1.30          NA       13     16      1.20        NY-M-4
                  15  15     1.18          NA       13     16      1.20        NY-M-4
                  16  16       NA          16       16     16        NA          <NA>
                  17  17     0.87          NA       16     18      0.87        NY-M-5
                  18  18       NA          18       18     18        NA          <NA>
                  19  19     1.09          NA       18     20      1.09        NE-M-5
                  20  20       NA          20       20     20        NA          <NA>
                  21  21     1.60          NA       20     22      1.60        FL-M-3
                  22  22       NA          22       22     22        NA          <NA>
                  23  23     1.14          NA       22     25      1.14        FL-M-1
                  24  24     1.12          NA       22     25      1.14        FL-M-1
                  25  25       NA          25       25     25        NA          <NA>
                  26  26     0.71          NA       25     27      0.71        FL-M-2
                  27  27       NA          27       27     27        NA          <NA>
                  28  28     1.16          NA       27     29      1.16       FL-M-12
                  29  29       NA          29       29     29        NA          <NA>
                  30  30     1.14          NA       29     31      1.15       FL-M-11",
                  header = T, stringsAsFactors = F)

When I try to add a label for each group in ggplot, the labels for some groups are missing. (The plot and my code are attached here.) How can I fix this?

library(ggplot2)
library(ggstance)

ggplot(dat, aes(x = id2, y = small_oe)) +
  theme(panel.background = element_rect(fill = "white", colour = "grey50")) +
  theme(panel.grid.major = element_blank(),   ## adjust the theme: clean background, remove x-axis labels/values
        panel.grid.minor = element_blank(),
        axis.text.x = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_blank()) +
  geom_vline(aes(xintercept = xinterceptm), linetype = "dotted", alpha = 0.3) +  ## add vertical lines separating median cluster groups
  geom_segment(aes(x = startidm, xend = endidm, y= medium_oe, yend = medium_oe), alpha = 0.4) + ## add line segments as median region o:e
  stat_summaryh(fun.x = mean, aes(label = medium_region, y = medium_oe+0.02), geom = "text", size = 3, alpha = 0.4, color = "blue")


Thanks!

3 Answers:

Answer 0 (score: 3)

dat$medium_region contains NA's, hence the blank labels. You should change them to something appropriate.

dat$medium_region <- gsub("<NA>", "Unknown", dat$medium_region)

If those rows have no name but you still want a label displayed, renaming all the NAs like this may be a suitable approach.
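A small sketch of why the gsub() pattern above is the string "<NA>" rather than a real missing value: as far as I can tell, read.table only treats the token NA as missing by default, so "<NA>" comes through as ordinary text. The two-row table `toy` below is made up for illustration; exact matching is used instead of gsub() so that only the placeholder is touched.

```r
# Hypothetical two-row table using the same "<NA>" token as the question's data.
toy <- read.table(text = "id region
1 <NA>
2 FL-M-4", header = TRUE, stringsAsFactors = FALSE)

# The token is kept as text, so is.na() does not flag it.
literal_na <- toy$region == "<NA>"
any(is.na(toy$region))

# Exact matching replaces only the placeholder rows.
toy$region[literal_na] <- "Unknown"
toy
```

If you would rather have real missing values, passing na.strings = c("NA", "<NA>") to read.table should read those cells as NA from the start.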

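Solution:

I think those two labels are not happy about sitting on the same y plane (the same medium_oe value), so you can add a vector that shifts one of them out of alignment. Create a new column called mod and change its value for the affected region: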

dat$mod <- 0
dat$mod[dat$medium_region == "FL-M-5"] <- 0.01

Then change the y aesthetic in stat_summaryh to:

y = medium_oe + 0.02 + mod

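This works for your example data, but mod would need to be adjusted for whichever samples overlap in other datasets. Not ideal, and I don't understand why they can't share the same y coordinate.

It may be related to this warning: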

4: Removed 1 rows containing missing values (geom_text).

Answer 1 (score: 2)

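The problem is only with FL-M-5 and FL-M-13, because they have exactly the same medium_oe (1.17). If you remove one of them, the other label shows up fine, e.g. if you substitute foo or bar for dat in your original plot: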

foo = dat[c(1:3,7:nrow(dat)),]
bar = dat[c(1:9,13:nrow(dat)),]

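This causes a problem because stat_summaryh is trying to place one label at the mean of id2 for each unique value of medium_oe, with the label taken from medium_region. But there are two distinct medium_region values for medium_oe == 1.17, so it does not put a label there.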
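To find such collisions without reading the table by eye, here is a minimal base R sketch. It assumes a data frame with the question's medium_oe and medium_region columns; `toy` is a hand-copied subset of the labelled rows, and `regions_per_y` and `shared_y` are names introduced only for this example.

```r
# Hypothetical subset of the question's data: labelled rows only.
toy <- data.frame(
  medium_oe     = c(1.36, 1.36, 1.17, 1.17, 1.19, 1.19, 1.17, 1.17),
  medium_region = c("FL-M-4", "FL-M-4", "FL-M-5", "FL-M-5",
                    "FL-M-14", "FL-M-14", "FL-M-13", "FL-M-13"),
  stringsAsFactors = FALSE
)

# Count the distinct regions found at each medium_oe value.
regions_per_y <- vapply(
  split(toy$medium_region, toy$medium_oe),
  function(r) length(unique(r)),
  integer(1)
)

# y values shared by more than one region, i.e. the ambiguous labels.
shared_y <- names(regions_per_y)[regions_per_y > 1]
toy[toy$medium_oe %in% as.numeric(shared_y), ]
```

On this subset the only shared value is 1.17, which matches the two regions whose labels go missing.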

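Another thing I noticed is that the y-axis of your plot is labelled small_oe. However, you are drawing the segments at the medium_oe values, which do not appear to be the mean of the two small_oe values. So you should make sure this plot shows what you intend.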
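A quick way to check that, sketched in base R on two regions copied from the question (`toy`, `check`, and the column names mean_small_oe and diff are made up for this example):

```r
# Hypothetical subset: two regions with their small_oe and medium_oe values.
toy <- data.frame(
  medium_region = c("FL-M-4", "FL-M-4", "FL-M-5", "FL-M-5"),
  small_oe      = c(1.66, 1.21, 1.34, 0.97),
  medium_oe     = c(1.36, 1.36, 1.17, 1.17),
  stringsAsFactors = FALSE
)

# Per-region mean of small_oe next to the (constant) medium_oe of that region.
check <- aggregate(cbind(small_oe, medium_oe) ~ medium_region, data = toy, FUN = mean)
names(check)[names(check) == "small_oe"] <- "mean_small_oe"

# A non-zero difference means medium_oe is not the mean of small_oe.
check$diff <- check$mean_small_oe - check$medium_oe
check
```

For FL-M-4 the mean of small_oe is (1.66 + 1.21) / 2 = 1.435, while medium_oe is 1.36, so the segments are drawn at a different statistic than the per-region mean.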

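Answer 2 (score: 0)

Following Tyr's suggestion, here is a detailed solution:

1. Replace duplicated values across groups

My original data contains many groups, so I needed code that finds the duplicated values automatically instead of typing the region names by hand: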

library(dplyr)

### group by medium region
dat2 <- dat %>%
  group_by(medium_region) %>%
  summarize(
    mean_mediumoe = mean(medium_oe, na.rm = TRUE)
  )

### find regions with duplicated oe values
duplicated(dat2$mean_mediumoe)

### create new var = slightly different value if the values are not unique
dat2$medium_oe2 <- ifelse(duplicated(dat2$mean_mediumoe), dat2$mean_mediumoe + 0.0001, dat2$mean_mediumoe)

### merge data and keep original ordering
dat3 <- merge(dat, dat2, by = "medium_region")

dat3 <- arrange(dat3, id2)
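One caveat with a fixed 0.0001 offset: if three or more regions share the same mean, every repeat gets the same nudged value and the labels collide again. A possible variation, sketched in base R with a made-up named vector (the region "XX-M-0" is hypothetical and not in the question's data), scales the offset by the position within each tie:

```r
# Hypothetical per-region means; three regions tie at 1.17.
region_means <- c("FL-M-5" = 1.17, "FL-M-13" = 1.17, "XX-M-0" = 1.17, "NY-M-4" = 1.20)

# Position of each region within its group of tied values, starting at 0.
tie_rank <- ave(region_means, region_means, FUN = seq_along) - 1

# The first occurrence stays put; later ones move up by 0.0001 per step.
nudged <- region_means + tie_rank * 1e-4
nudged
```

This still assumes that no nudged value lands exactly on another region's real mean, which seems unlikely with an offset this small but is not guaranteed.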

2. Draw the plot

library(ggplot2)
library(ggstance)

ggplot(dat3, aes(x = id2, y = small_oe)) +
  theme(panel.background = element_rect(fill = "white", colour = "grey50")) +
  theme(panel.grid.major = element_blank(),   ## adjust the theme: clean background, remove x-axis labels/values
        panel.grid.minor = element_blank(),
        axis.text.x = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_blank()) +
  geom_vline(aes(xintercept = xinterceptm), linetype = "dotted", alpha = 0.3) +  ## add vertical lines separating median cluster groups
  geom_segment(aes(x = startidm, xend = endidm, y= medium_oe, yend = medium_oe), alpha = 0.4) + ## add line segments as median region o:e
  stat_summaryh(fun.x = mean, aes(label = medium_region, y = medium_oe2+0.02), geom = "text", size = 3, alpha = 0.4, color = "blue")

Now every group has its label.
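A different route, not the one used above, is to compute one label row per region yourself and hand that to geom_text, so that equal y values never need nudging. This is only a sketch on a four-row subset of the question's data; `toy` and `label_df` are names made up for the example, and the plotting part is skipped if ggplot2 is not installed.

```r
# Hypothetical subset: the two regions that share medium_oe == 1.17.
toy <- data.frame(
  id2           = c(5, 6, 11, 12),
  medium_oe     = c(1.17, 1.17, 1.17, 1.17),
  medium_region = c("FL-M-5", "FL-M-5", "FL-M-13", "FL-M-13"),
  stringsAsFactors = FALSE
)

# One row per region: label x position = mean id2, y = the region's medium_oe.
label_df <- aggregate(cbind(id2, medium_oe) ~ medium_region, data = toy, FUN = mean)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Labels come from label_df, so both regions are labelled at the same height.
  p <- ggplot(toy, aes(x = id2, y = medium_oe)) +
    geom_point(alpha = 0.4) +
    geom_text(data = label_df,
              aes(x = id2, y = medium_oe + 0.02, label = medium_region),
              size = 3, colour = "blue")
  # print(p) draws the plot
}
```

Because the grouping is done by medium_region before plotting, the labels no longer depend on medium_oe being unique across regions.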