Label maximum/minimum values in a scatter plot with ggplot2 in R

Time: 2018-12-23 12:01:19

Tags: r ggplot2 dplyr

I have a large data frame and want to draw a scatter plot of it in which only the maximum/minimum values are labeled.

some_df <- data.frame(
   "Sport" = c(1:5), 
   "avg_height" = c(178, 142, 200, 135, 182), 
   "avg_weight" = c(66, 61, 44, 77, 100))

I tried:

library(dplyr)
library(ggplot2)
some_df %>% 
  ggplot(aes(avg_weight, avg_height, label = Sport)) + 
  geom_point(shape = 21) + 
  geom_text(data = subset(avg_height == max(avg_height)))     

But I get an error telling me that avg_height cannot be found.

I also tried using geom_text with ifelse:

geom_text(aes(label = ifelse(avg_height=max(avg_height), as.character(Sport), '')), 
          hjust=0, vjust=0)  

This fails with an error saying that Sport cannot be found.

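So I can either label every point or none, but with a large data.frame a fully labeled plot would be unreadable. If I could only color the max/min values, that would be fine too. I have also tried creating a new column and joining in a new variable as shown below, but this did not help.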

maxw <- some_df %>% summarise_each(Max = max(avg_weight))
maxh <- some_df %>% mutate(summarise(Max = max(avg_height)))

The scatter plot I want has labels only for the maximum and minimum values of avg_height and avg_weight.

1 Answer:

Answer 0: (score: 1)

If I understand correctly, all data points that are extreme values of avg_height or avg_weight should be labeled with the value of Sport:

library(dplyr)
library(ggplot2)
some_df %>% 
  ggplot(aes(avg_weight, avg_height, label = Sport)) + 
  geom_point(shape = 21) + 
  geom_label(data = some_df %>% 
               filter(avg_height %in% range(avg_height) | avg_weight %in% range(avg_weight)),
             nudge_x = 1)

This produces a scatter plot in which only the points for Sport 3, 4 and 5 are labeled.
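
To see which rows the filter in the answer keeps, the same condition can be evaluated on its own, outside of ggplot. The following is a minimal base R sketch using the sample data from the question; `is_extreme` is a name introduced here for illustration only.

```r
# Sample data from the question
some_df <- data.frame(
  Sport = 1:5,
  avg_height = c(178, 142, 200, 135, 182),
  avg_weight = c(66, 61, 44, 77, 100))

# range() returns c(min, max), so %in% range(x) is TRUE for rows
# that hold the minimum or the maximum of x
is_extreme <- with(some_df,
                   avg_height %in% range(avg_height) |
                     avg_weight %in% range(avg_weight))

# Rows that would receive a label
some_df[is_extreme, ]
```

With this data, Sport 3 has both the maximum height and the minimum weight, Sport 4 has the minimum height, and Sport 5 has the maximum weight, so three rows are selected.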

Edit

The OP has also asked to label the points with the highest and lowest BMI, avg_weight / (avg_height/100)^2:

library(dplyr)
library(ggplot2)
# append BMI column to dataset
some_df <- some_df %>% 
  mutate(bmi = avg_weight / (avg_height/100)^2) 
some_df %>% 
  ggplot(aes(avg_weight, avg_height, label = Sport)) + 
  geom_point(shape = 21) + 
  geom_label(data = some_df %>% 
               filter(
                 avg_height %in% range(avg_height) | 
                   avg_weight %in% range(avg_weight) |
                   bmi %in% range(bmi)
               ),
             nudge_x = 1)

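The resulting chart is the same as the one above, because the lowest and highest BMI belong to Sport 3 and Sport 4, which are already labeled.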
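
The question mentions that coloring the extreme points instead of labeling them would also be acceptable. One possible way to do that, sketched here under the assumption that a helper column is fine, is to flag the extreme rows first and map that flag to the point colour. The names `flagged_df` and `is_extreme` and the colours are illustrative choices, not part of the original post.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question
some_df <- data.frame(
  Sport = 1:5,
  avg_height = c(178, 142, 200, 135, 182),
  avg_weight = c(66, 61, 44, 77, 100))

# Hypothetical helper column: TRUE for rows holding a min or max
# of avg_height or avg_weight
flagged_df <- some_df %>%
  mutate(is_extreme = avg_height %in% range(avg_height) |
           avg_weight %in% range(avg_weight))

# Map the flag to colour so that extreme points stand out
p <- ggplot(flagged_df, aes(avg_weight, avg_height, colour = is_extreme)) +
  geom_point() +
  scale_colour_manual(values = c("FALSE" = "grey60", "TRUE" = "red"))
p
```

Because the flag lives in the data frame, the same column could also be reused in the `filter()` call of `geom_label()` if both colour and labels are wanted.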