Question

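I am new to R and am trying to find a way to plot the values of individual samples together with the group means using ggplot. I am following this R-bloggers article (last paragraph):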

https://www.r-bloggers.com/plotting-individual-observations-and-group-means-with-ggplot2/

Here is my code:

library(dplyr)
library(ggplot2)

# Mean measurement per treatment at each time point (value)
gd <- meanplot1 %>%
     group_by(treatment, value) %>%
     summarise(measurement = mean(measurement))

ggplot(meanplot1, aes(x=value, y=measurement, color=treatment)) + 
     geom_line(aes(group=sample), alpha=0.3) + 
     geom_line(data=gd, size=3, alpha=0.9) + 
     theme_bw()

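The lines for the individual samples are displayed, but the group means are not. I get the error "geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?" After adding group = 1, I get a strange line that mixes both treatments, which is not what I want...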

I have scrolled through a lot of posts but could not find an answer. I would be very glad if someone could help me! :)

My data (meanplot1) is formatted as follows:

    treatment  sample     value    measurement
1   control    control 1  initial  20
2   control    control 1  26       NA
3   control    control 1  26'      28
12  control    control 2  initial  22
13  control    control 2  26       NA
14  control    control 2  26'      36
15  control    control 2  28       45
67  stressed   stress 1   initial  37
68  stressed   stress 1   26       NA
69  stressed   stress 1   26'      17
78  stressed   stress 2   initial  36
79  stressed   stress 2   26       NA
80  stressed   stress 2   26'      25

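I would like to see 6 lines: one each for stress 1, stress 2, control 1 and control 2, one for the mean of all samples with treatment = control, and one for the mean of all samples with treatment = stressed.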

Output of dput(gd):

structure(list(treatment = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("control", "stressed"), class = "factor"), value = structure(c(1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 8L, 9L, 10L, 11L), .Label = c("26", "26'", "28", "28'", 
"30", "30'", "32", "32'", "34", "34'", "initial"), class = "factor"), 
measurement = c(NA, 32.3333333333333, 39.5, 30.3333333333333, 
31.8333333333333, 31.8333333333333, NA, 36, 34.6666666666667, 
36, 24.6666666666667, NA, 25.3333333333333, 33.3333333333333, 
32, 50.1666666666667, 39.1666666666667, NA, 33.5, 24.3333333333333, 
27.3333333333333, 36)), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -22L), vars = list(treatment), drop = TRUE, .Names = c("treatment", 
"value", "measurement"))

Output of dput(meanplot1):

structure(list(treatment = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("control", 
"stressed"), class = "factor"), sample = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L), .Label = c("control 1", 
"control 2", "control 3", "control 4", "control 5", "control 6", 
"stress 1", "stress 2", "stress 3", "stress 4", "stress 5", "stress 6"
), class = "factor"), value = structure(c(11L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), .Label = c("26", "26'", 
"28", "28'", "30", "30'", "32", "32'", "34", "34'", "initial"
), class = "factor"), measurement = c(20L, NA, 28L, 18L, 17L, 
19L, 34L, NA, 23L, 29L, 27L, 22L, NA, 36L, 45L, 31L, 40L, 44L, 
NA, 49L, 40L, 39L, 32L, NA, 35L, 57L, 30L, 37L, 29L, NA, 44L, 
37L, 46L, 20L, NA, 39L, 27L, 30L, 40L, 25L, NA, 29L, 50L, 30L, 
26L, NA, 28L, 45L, 47L, 27L, 35L, NA, 24L, 22L, 35L, 28L, NA, 
28L, 45L, 27L, 28L, 24L, NA, 47L, 30L, 39L, 37L, NA, 17L, 29L, 
29L, 31L, 29L, NA, 37L, 21L, 27L, 36L, NA, 25L, 41L, 51L, 66L, 
50L, NA, 33L, 25L, 22L, 36L, NA, 33L, 45L, 26L, 72L, 59L, NA, 
33L, 26L, 25L, 33L, NA, 21L, 33L, 25L, 29L, 21L, NA, 26L, 20L, 
16L, 22L, NA, 30L, 27L, 28L, 57L, 41L, NA, 28L, 23L, 17L, 52L, 
NA, 26L, 25L, 33L, 46L, 35L, NA, 44L, 31L, 57L)), .Names = c("treatment", 
"sample", "value", "measurement"), class = "data.frame", row.names = c(NA, 
-132L))

Answer 1

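I assume your goal is to plot the treatment means.

By default, because you are using a categorical x axis, the grouping is set to the interaction between x and color, so every treatment/value combination becomes its own group containing a single point. However, you only want to group by treatment, so we add the correct grouping to the call.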

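ggplot(meanplot1, aes(x = value, y = measurement, color = treatment)) + 
  # one thin line per individual sample
  geom_line(aes(group = sample), alpha = 0.3) + 
  # one thick line per treatment, drawn from the pre-computed means in gd
  geom_line(aes(group = treatment), data = gd, size = 3, alpha = 0.9) + 
  theme_bw()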
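
To see why the default grouping cannot produce a line, here is a minimal sketch in base R. The data frame toy and its values are made up for illustration (they only mimic the shape of meanplot1), and gd_toy stands in for gd.

```r
# Toy data (hypothetical values), shaped like meanplot1: two samples per treatment.
toy <- data.frame(
  treatment   = rep(c("control", "stressed"), each = 6),
  sample      = rep(c("control 1", "control 2", "stress 1", "stress 2"), each = 3),
  value       = factor(rep(c("initial", "26'", "28"), times = 4),
                       levels = c("initial", "26'", "28")),
  measurement = c(20, 28, 18, 22, 36, 45, 37, 17, 29, 36, 25, 41)
)

# One mean per treatment/value combination, like gd in the question.
gd_toy <- aggregate(measurement ~ treatment + value, data = toy, FUN = mean)

# Default grouping with a discrete x: each treatment/value pair is its own group.
default_groups <- table(interaction(gd_toy$treatment, gd_toy$value))

# Grouping by treatment only: each group spans all x positions.
treatment_groups <- table(gd_toy$treatment)

print(default_groups)    # one row per group, so there is nothing to connect
print(treatment_groups)  # several rows per treatment, enough to draw a line
```

With one row per group there are no two points to join, which is what the geom_path message is complaining about. Setting group = treatment gives each line one point per x position.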

Also note that

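ggplot(meanplot1, aes(x = value, y = measurement, color = treatment)) + 
  geom_line(aes(group = sample), alpha = 0.3) + 
  # compute the treatment means inside ggplot instead of using gd
  # (in ggplot2 >= 3.3.0, fun.y is deprecated; use fun = mean instead)
  stat_summary(aes(group = treatment), fun.y = mean, geom = 'line', size = 3, alpha = 0.9) +
  theme_bw()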

gives the same plot, without the breaks.
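
The breaks in the first version appear to come from the NA means in gd (the time points where every sample is NA). If you prefer to keep the pre-computed means, one possible way to get an unbroken line is to drop those rows before plotting. The sketch below uses made-up toy data, hypothetical names (toy, gd_toy, gd_complete), and assumes ggplot2 >= 3.4.0, where linewidth replaces size for lines.

```r
library(ggplot2)

# Toy data (hypothetical values); "26" is NA for every sample, as in the question.
toy <- data.frame(
  treatment   = rep(c("control", "stressed"), each = 8),
  sample      = rep(c("control 1", "control 2", "stress 1", "stress 2"), each = 4),
  value       = factor(rep(c("initial", "26", "26'", "28"), times = 4),
                       levels = c("initial", "26", "26'", "28")),
  measurement = c(20, NA, 28, 18,  22, NA, 36, 45,
                  37, NA, 17, 29,  36, NA, 25, 41)
)

# Pre-computed means, keeping NA rows: the all-NA time point yields an NA mean.
gd_toy <- aggregate(measurement ~ treatment + value, data = toy,
                    FUN = mean, na.action = na.pass)

# Drop the NA means so the mean line is drawn across the gap.
gd_complete <- gd_toy[!is.na(gd_toy$measurement), ]

p <- ggplot(toy, aes(x = value, y = measurement, color = treatment)) +
  geom_line(aes(group = sample), alpha = 0.3) +
  geom_line(aes(group = treatment), data = gd_complete,
            linewidth = 3, alpha = 0.9) +
  theme_bw()

# print(p)  # uncomment to draw the plot
```

The thin per-sample lines are still interrupted at the missing time point; only the mean lines are joined across it.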

