R ggplot2 ggrepel - label a subset of points while being aware of all points

Time: 2018-09-19 02:57:25

Tags: r ggplot2 plot ggrepel

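I have a fairly dense scatterplot that I am building with R 'ggplot2', and I want to label a subset of the points using 'ggrepel'. My problem is that I want to plot ALL points in the scatterplot but label only a subset with ggrepel. When I do this, ggrepel does not take the other points on the plot into account when calculating where to place the labels, so the labels end up overlapping other points on the plot (the ones I do not want to label).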

Here is an example plot that illustrates the problem.

# generate data:
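library(data.table)
library(stringi)
library(ggplot2)   # needed for ggplot() and geom_point() below
library(ggrepel)   # needed for geom_text_repel() below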
set.seed(20180918)
dt = data.table(
  name = stri_rand_strings(3000,length=6),
  one = rnorm(n = 3000,mean = 0,sd = 1),
  two = rnorm(n = 3000,mean = 0,sd = 1))
dt[, diff := one -two]
dt[, diff_cat := ifelse(one > 0 & two>0 & abs(diff)>1, "type_1",
                        ifelse(one<0 & two < 0 & abs(diff)>1, "type_2",
                               ifelse(two>0 & one<0 & abs(diff)>1, "type_3",
                                      ifelse(two<0 & one>0 & abs(diff)>1, "type_4", "other"))))]

# make plot
ggplot(dt, aes(x=one,y=two,color=diff_cat))+
  geom_point()

plot without labels

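If I plot only the subset of points that I want to label, ggrepel is able to place all of the labels so that they do not overlap the other points or each other.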

ggplot(dt[abs(diff)>2 & (!diff_cat %in% c("type_3","type_4","other"))], 
  aes(x=one,y=two,color=diff_cat))+
  geom_point()+
  geom_text_repel(data = dt[abs(diff)>2 & (!diff_cat %in% c("type_3","type_4","other"))], 
                  aes(x=one,y=two,label=name))

plot labelled points only

However, when I plot this subset of labels together with the original data, the labels overlap points:

# now add labels to a subset of points on the plot
ggplot(dt, aes(x=one,y=two,color=diff_cat))+
  geom_point()+
  geom_text_repel(data = dt[abs(diff)>2 & (!diff_cat %in% c("type_3","type_4","other"))], 
                  aes(x=one,y=two,label=name))

plot with labels

How can I get the labels for the subset of points to not overlap the points from the original data?

1 answer:

Answer 0: (score: 8)

You can try the following:

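  1. Assign a blank label ("") to all the other points in the original data, so that geom_text_repel takes them into account when repelling the labels from one another;
  2. Increase the box.padding parameter from its default of 0.25 to a larger value, for greater distance between labels;
  3. Increase the x and y axis limits, to give the labels more room on all four sides to be repelled into.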

Example code (with box.padding = 1):

library(dplyr)  # provides %>% and mutate() used in the data argument below

ggplot(dt, 
       aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  geom_text_repel(data = . %>% 
                    mutate(label = ifelse(diff_cat %in% c("type_1", "type_2") & abs(diff) > 2,
                                          name, "")),
                  aes(label = label), 
                  box.padding = 1,
                  show.legend = FALSE) + #this removes the 'a' from the legend
  coord_cartesian(xlim = c(-5, 5), ylim = c(-5, 5)) +
  theme_bw()

plot
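Since the question builds its data with data.table, the blank-label step can also be done there rather than with dplyr inside the layer. The following is only a sketch under assumptions: `dt2`, `same_sign`, and the `sprintf()`-generated names are hypothetical stand-ins (the values will not match the original `dt`), and the colour mapping is dropped for brevity.

```r
library(data.table)
library(ggplot2)
library(ggrepel)

set.seed(20180918)
n <- 3000

# Hypothetical stand-in for the question's data; names come from base R
# rather than stringi, so they differ from the original `dt`.
dt2 <- data.table(
  name = sprintf("pt%04d", seq_len(n)),
  one  = rnorm(n),
  two  = rnorm(n)
)
dt2[, diff := one - two]

# type_1 / type_2 in the question are the points whose coordinates share a sign.
dt2[, same_sign := (one > 0 & two > 0) | (one < 0 & two < 0)]

# Rows to be labelled keep their name; every other row gets "" so that it is
# still handed to geom_text_repel() and pushes labels away, but draws nothing.
dt2[, label := ifelse(same_sign & abs(diff) > 2, name, "")]

p <- ggplot(dt2, aes(x = one, y = two)) +
  geom_point(colour = "grey60") +
  geom_text_repel(aes(label = label), box.padding = 1) +
  coord_cartesian(xlim = c(-5, 5), ylim = c(-5, 5)) +
  theme_bw()
```

Printing `p` draws the plot. Because the label column lives in the plot's own data, the text layer needs no separate `data` argument.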

Here is another attempt, using box.padding = 2:

plot 2
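No code is shown for the second attempt; presumably it is the same call with only box.padding changed. One way to compare the two settings side by side is a small wrapper. This is a sketch: `plot_with_padding()`, `df`, and the generated names are hypothetical, not part of the original answer.

```r
library(ggplot2)
library(ggrepel)

set.seed(20180918)
n <- 3000

# Hypothetical stand-in data with the same shape as the question's `dt`.
df <- data.frame(
  name = sprintf("pt%04d", seq_len(n)),
  one  = rnorm(n),
  two  = rnorm(n),
  stringsAsFactors = FALSE
)
df$diff <- df$one - df$two
same_sign <- (df$one > 0 & df$two > 0) | (df$one < 0 & df$two < 0)

# Blank labels for points that should repel labels but not be labelled.
df$label <- ifelse(same_sign & abs(df$diff) > 2, df$name, "")

# Hypothetical helper: builds the same plot for a given box.padding value.
plot_with_padding <- function(data, padding) {
  ggplot(data, aes(x = one, y = two)) +
    geom_point(colour = "grey60") +
    geom_text_repel(aes(label = label),
                    box.padding = padding,
                    show.legend = FALSE) +
    coord_cartesian(xlim = c(-5, 5), ylim = c(-5, 5)) +
    theme_bw()
}

p_pad1 <- plot_with_padding(df, 1)  # setting used in the example code
p_pad2 <- plot_with_padding(df, 2)  # setting used for the second plot
```

Printing `p_pad1` and `p_pad2` in turn shows how much further the labels are pushed apart at the larger padding.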

(Note: I am using ggrepel 0.8.0. I am not sure whether all of this functionality is present in earlier versions of the package.)
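To see how an installed copy compares with the version mentioned in the note, a quick check along these lines may help. It is a sketch only; the 0.8.0 threshold is simply the version the answer reports using, not a documented minimum.

```r
# Compare the installed ggrepel version with the one the answer was written against.
has_ggrepel <- requireNamespace("ggrepel", quietly = TRUE)

if (has_ggrepel) {
  installed <- utils::packageVersion("ggrepel")
  message("Installed ggrepel version: ", as.character(installed))
  if (installed < "0.8.0") {
    # Assumption: behaviour on older versions was not verified in the answer.
    message("Older than 0.8.0; the examples here were not checked against it.")
  }
} else {
  message("ggrepel is not installed.")
}
```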