I have a large data frame and want to draw a scatter plot in which only the maximum and minimum values are labeled.
some_df <- data.frame(
"Sport" = c(1:5),
"avg_height" = c(178, 142, 200, 135, 182),
"avg_weight" = c(66, 61, 44, 77, 100))
I tried:
library(dplyr)
library(ggplot2)
some_df %>%
ggplot(aes(avg_weight, avg_height, label = Sport)) +
geom_point(shape = 21) +
geom_text(data = subset(avg_height == max(avg_height)))
But I get an error telling me that avg_height cannot be found.
I also tried geom_text with ifelse:
geom_text(aes(label = ifelse(avg_height=max(avg_height), as.character(Sport), '')),
hjust=0, vjust=0)
This fails with an error saying that Sport cannot be found.
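So I can either label every point or none, but with a large data.frame a fully labeled plot would be unreadable. Coloring only the maximum/minimum values would also be fine. I have already tried creating a new column and joining in new variables as shown below, but that did not help.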
maxw <- some_df %>% summarise_each(Max = max(avg_weight))
maxh <- some_df %>% mutate(summarise(Max = max(avg_height)))
The scatter plot I want has labels only at the maximum and minimum values of avg_height and avg_weight.
Answer 0 (score: 1)
If I understand correctly, the data points with the extreme values of avg_height and avg_weight should each be labeled with the value of Sport:
library(dplyr)
library(ggplot2)
some_df %>%
ggplot(aes(avg_weight, avg_height, label = Sport)) +
geom_point(shape = 21) +
geom_label(data = some_df %>%
filter(avg_height %in% range(avg_height) | avg_weight %in% range(avg_weight)),
nudge_x = 1)
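To make the filter condition easier to follow, here is a minimal base R sketch of the same row selection, using the sample data from the question. range() returns c(min, max), so %in% range(...) is TRUE for rows that hold either extreme. The names is_extreme and extremes are introduced here for illustration only. As a side note, the subset() call in the question was given no data frame as its first argument, which is most likely why avg_height was reported as not found.

```r
# Sample data copied from the question
some_df <- data.frame(
  Sport = 1:5,
  avg_height = c(178, 142, 200, 135, 182),
  avg_weight = c(66, 61, 44, 77, 100)
)

# range() returns c(min, max); %in% flags rows that match either value.
# is_extreme is an illustrative name, not part of the original answer.
is_extreme <- with(
  some_df,
  avg_height %in% range(avg_height) | avg_weight %in% range(avg_weight)
)

# Rows that would receive a label in the plot
extremes <- some_df[is_extreme, ]
print(extremes)
```

With this data the selection keeps Sport 3, 4 and 5; Sport 3 holds both the maximum height and the minimum weight, so it is selected only once.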
The OP has also asked to label the points with the highest and lowest BMI, avg_weight / (avg_height/100)^2:
library(dplyr)
library(ggplot2)
# append BMI column to dataset
some_df <- some_df %>%
mutate(bmi = avg_weight / (avg_height/100)^2)
some_df %>%
ggplot(aes(avg_weight, avg_height, label = Sport)) +
geom_point(shape = 21) +
geom_label(data = some_df %>%
filter(
avg_height %in% range(avg_height) |
avg_weight %in% range(avg_weight) |
bmi %in% range(bmi)
),
nudge_x = 1)
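The resulting chart is the same as the one above, because in this sample data the points with the highest and lowest BMI (Sport 4 and Sport 3) are already labeled as height or weight extremes.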
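Since the question mentions that coloring the extreme points would also be acceptable, here is one possible sketch of that variant. It is an assumption about what the OP would want, not part of the original answer: the helper column is_extreme, the colors, and the legend title are all chosen for illustration.

```r
library(ggplot2)

# Sample data copied from the question
some_df <- data.frame(
  Sport = 1:5,
  avg_height = c(178, 142, 200, 135, 182),
  avg_weight = c(66, 61, 44, 77, 100)
)

# Illustrative helper column: TRUE where height or weight is a min or max
some_df$is_extreme <- with(
  some_df,
  avg_height %in% range(avg_height) | avg_weight %in% range(avg_weight)
)

# Map the logical column to colour instead of adding labels
p <- ggplot(some_df, aes(avg_weight, avg_height, colour = is_extreme)) +
  geom_point() +
  scale_colour_manual(
    values = c(`FALSE` = "grey60", `TRUE` = "red"),
    name = "Extreme value"
  )
print(p)
```

Coloring scales better than labeling when many rows tie for an extreme, because overlapping labels are avoided; the two approaches can also be combined by adding the geom_label() layer from the answer.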