How to add mean values to a ggplot + geom_point plot

Time: 2018-09-07 07:51:53

Tags: r ggplot2

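I have 10 groups of data points, and I am trying to add each group's mean to the plot, shown with a different symbol (for example a large triangle, a star, or something similar). Here is a reproducible example: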

library(ggplot2)
library(reshape2)
set.seed(1234)

x <- matrix(rnorm(100),10,10)
varnames <- paste("var", seq(1,10))

df <- data.frame(x)
colnames(df) <- varnames
melt(df)

ggplot(data = melt(df)) + geom_point(mapping = aes(x = variable, y = value))
mymeans <- colMeans(df)

Basically, I now want to plot the values in mymeans at their respective variable positions. Does anyone know a quick way to do this?

4 answers:

Answer 0: (score: 1)

Alternatively, we can use stat_summary:

ggplot(data = reshape2::melt(df), aes(x = variable, y = value)) + 
  geom_point() +
  stat_summary(
    geom = "point",
    fun.y = "mean",   # deprecated since ggplot2 3.3.0; use `fun = "mean"` there
    col = "black",
    size = 3,
    shape = 24,
    fill = "red"
  )


For an overview of the available point shapes, see: www.cookbook-r.com

Answer 1: (score: 1)

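Updated the code to reflect the tidyverse changes mentioned in earlier comments.

Since the tidyverse has changed its syntax, here is an updated version using dplyr and ggplot2. Thanks to @Vincent Bonhomme and @markus.

For reproducibility, I will reuse their example.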

library(tidyverse)

# Dataset Generation
set.seed(1234)
df <- replicate(10, rnorm(10)) %>%
  as.data.frame() %>%   # as_data_frame() is deprecated; columns are named V1..V10
  pivot_longer(cols = everything(), names_to = "variable", values_to = "value") %>% # ** Change here   
  mutate(group = as.factor(rep(1:5, 20)))

#Option 1: Use stat_summary() for a cleaner version (@Vincent Bonhomme)
ggplot(df,  aes(x = variable, y = value)) + 
  geom_point() +
  stat_summary(
    fun = "mean",        #argument updated in new version.
    geom = "point",
    col = "black",
    size = 3,
    shape = 24,
    fill = "red"
  ) + 
  ggtitle("Example")


#Option 2 -- Creating a means dataset (@ markus)
df_means <- df %>% group_by(variable) %>% summarise(value=mean(value))
ggplot(data = df) + 
  aes(x = variable, y = value) +
  geom_point() + 
  geom_point(
    data = df_means,
    col = "red",
    size = 3,
    shape = 24,
    fill = "red"
  ) +
  ggtitle("Example")

Both options produce the same plot.


Versions used:

dplyr       * 1.0.3 
ggplot2     * 3.3.3 

Answer 2: (score: 0)

You can pass another data.frame to another geom_point.

Try the following:
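A minimal sketch of this idea, assuming the `df` and `mymeans` objects from the question. The names `long_df` and `means_df` are hypothetical and introduced only for illustration; the second `geom_point` layer gets its own `data` argument holding one mean per variable.

```r
library(ggplot2)
library(reshape2)

# Same data as in the question
set.seed(1234)
df <- data.frame(matrix(rnorm(100), 10, 10))
colnames(df) <- paste("var", seq(1, 10))
long_df <- melt(df)  # long format with columns `variable` and `value`

# Hypothetical helper frame: one row per variable, holding its mean
mymeans <- colMeans(df)
means_df <- data.frame(
  variable = factor(names(mymeans), levels = levels(long_df$variable)),
  value = unname(mymeans)
)

# First layer draws the raw points; the second layer draws the means
# from the separate data.frame, inheriting the x/y aesthetics
p <- ggplot(long_df, aes(x = variable, y = value)) +
  geom_point() +
  geom_point(data = means_df, colour = "red", size = 3)
print(p)
```

Because both frames use the same column names, the second layer can reuse the aesthetics declared in `ggplot()` and only needs a different `data` argument.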


Does that meet your requirement?


A more compact, more modern, and more complete way is:

library(dplyr)  # provides summarise_all(); melt() comes from reshape2
df_means <- melt(summarise_all(df, mean))
ggplot(data = melt(df)) + 
    geom_point(mapping = aes(x = variable, y = value)) + 
    geom_point(data=df_means,  mapping=aes(x = variable, y = value), col="red")

Answer 3: (score: 0)

I find it cleaner to put all the different data into a single data frame instead of using two separate frames.

library(ggplot2)
library(tidyr)
library(dplyr)

set.seed(1234)

x <- matrix(rnorm(100),10,10)
varnames <- paste("var", seq(1,10))

df <- data.frame(x)
colnames(df) <- varnames

melt_data = df %>% gather
mymeans = melt_data %>% group_by(key) %>% summarize(value = mean(value))
mymeans$type = 'mean'
melt_data$type = 'points'

ggplot(data = bind_rows(melt_data, mymeans)) +
  geom_point(mapping = aes(x = key, y = value, color=type))
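Since the question asks for the means to stand out through a different symbol, the single-frame approach can also map `type` to shape and size. The following is only a sketch under assumptions: it swaps `gather` for `tidyr::pivot_longer`, and the name `combined` as well as the particular shape and size values are illustrative choices, not part of the answer above.

```r
library(ggplot2)
library(tidyr)
library(dplyr)

set.seed(1234)
df <- data.frame(matrix(rnorm(100), 10, 10))
colnames(df) <- paste("var", seq(1, 10))

# Long format: one row per observation, with columns `key` and `value`
melt_data <- pivot_longer(df, cols = everything(),
                          names_to = "key", values_to = "value")
mymeans <- melt_data %>%
  group_by(key) %>%
  summarise(value = mean(value))

# Hypothetical combined frame: raw points first, means appended last
# so that they are drawn on top
combined <- bind_rows(
  mutate(melt_data, type = "points"),
  mutate(mymeans, type = "mean")
)

# Map `type` to colour, shape, and size so the means are easy to spot
p <- ggplot(combined, aes(x = key, y = value,
                          colour = type, shape = type, size = type)) +
  geom_point() +
  scale_shape_manual(values = c(mean = 17, points = 16)) +  # triangle vs. circle
  scale_size_manual(values = c(mean = 4, points = 1.5))
print(p)
```

Keeping everything in one frame means a single `geom_point` layer and one legend describe both the raw points and the means.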