Plotting group and category means with group_by

Time: 2017-04-17 07:39:14

Tags: r ggplot2 group-by

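I am new to R and am trying to find a way to plot the values of individual samples together with the group means using ggplot. I am following this R-bloggers article (last paragraph):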

https://www.r-bloggers.com/plotting-individual-observations-and-group-means-with-ggplot2/

Here is my code:

gd <- meanplot1 %>%
     group_by(treatment, value) %>%
     summarise(measurement = mean(measurement))

ggplot(meanplot1, aes(x=value, y=measurement, color=treatment)) + 
     geom_line(aes(group=sample), alpha=0.3) + 
     geom_line(data=gd, size=3, alpha=0.9) + 
     theme_bw()

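The individual sample lines are drawn, but the group means are not. I get the message "geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?" After adding group = 1, I get a strange mix of the category means, which is not what I want...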

I have scrolled through a lot of posts but could not find an answer. I would be very glad if someone could help me! :)

My data (meanplot1) has the following format:

   treatment    sample   value measurement
1    control control 1 initial          20
2    control control 1      26          NA
3    control control 1     26'          28
12   control control 2 initial          22
13   control control 2      26          NA
14   control control 2     26'          36
15   control control 2      28          45
67  stressed  stress 1 initial          37
68  stressed  stress 1      26          NA
69  stressed  stress 1     26'          17
78  stressed  stress 2 initial          36
79  stressed  stress 2      26          NA
80  stressed  stress 2     26'          25

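I would like to see 6 lines: one each for stress 1, stress 2, control 1 and control 2, one for the mean of all samples with treatment = control, and one for the mean of all samples with treatment = stressed.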

Output of dput(gd):

structure(list(treatment = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("control", "stressed"), class = "factor"), value = structure(c(1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 8L, 9L, 10L, 11L), .Label = c("26", "26'", "28", "28'", 
"30", "30'", "32", "32'", "34", "34'", "initial"), class = "factor"), 
measurement = c(NA, 32.3333333333333, 39.5, 30.3333333333333, 
31.8333333333333, 31.8333333333333, NA, 36, 34.6666666666667, 
36, 24.6666666666667, NA, 25.3333333333333, 33.3333333333333, 
32, 50.1666666666667, 39.1666666666667, NA, 33.5, 24.3333333333333, 
27.3333333333333, 36)), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -22L), vars = list(treatment), drop = TRUE, .Names = c("treatment", 
"value", "measurement"))

Output of dput(meanplot1):

structure(list(treatment = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("control", 
"stressed"), class = "factor"), sample = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L), .Label = c("control 1", 
"control 2", "control 3", "control 4", "control 5", "control 6", 
"stress 1", "stress 2", "stress 3", "stress 4", "stress 5", "stress 6"
), class = "factor"), value = structure(c(11L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), .Label = c("26", "26'", 
"28", "28'", "30", "30'", "32", "32'", "34", "34'", "initial"
), class = "factor"), measurement = c(20L, NA, 28L, 18L, 17L, 
19L, 34L, NA, 23L, 29L, 27L, 22L, NA, 36L, 45L, 31L, 40L, 44L, 
NA, 49L, 40L, 39L, 32L, NA, 35L, 57L, 30L, 37L, 29L, NA, 44L, 
37L, 46L, 20L, NA, 39L, 27L, 30L, 40L, 25L, NA, 29L, 50L, 30L, 
26L, NA, 28L, 45L, 47L, 27L, 35L, NA, 24L, 22L, 35L, 28L, NA, 
28L, 45L, 27L, 28L, 24L, NA, 47L, 30L, 39L, 37L, NA, 17L, 29L, 
29L, 31L, 29L, NA, 37L, 21L, 27L, 36L, NA, 25L, 41L, 51L, 66L, 
50L, NA, 33L, 25L, 22L, 36L, NA, 33L, 45L, 26L, 72L, 59L, NA, 
33L, 26L, 25L, 33L, NA, 21L, 33L, 25L, 29L, 21L, NA, 26L, 20L, 
16L, 22L, NA, 30L, 27L, 28L, 57L, 41L, NA, 28L, 23L, 17L, 52L, 
NA, 26L, 25L, 33L, 46L, 35L, NA, 44L, 31L, 57L)), .Names = c("treatment", 
"sample", "value", "measurement"), class = "data.frame", row.names = c(NA, 
-132L))

1 Answer:

Answer 0 (score: 0)

I think your goal is to plot the treatment means.

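By default, because you are using a categorical x axis, the grouping is set to the interaction between x and color. However, you only want to group by treatment, so we add the correct grouping to the call.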

ggplot(meanplot1, aes(x = value, y = measurement, color=treatment)) + 
  geom_line(aes(group=sample), alpha=0.3) + 
  geom_line(aes(group = treatment), gd, size=3, alpha=0.9) + 
  theme_bw()
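
To see why the default grouping produces the "only one observation" message, here is a minimal sketch in base R. The data frame gd_toy and its values are hypothetical stand-ins for gd, not the real data. It counts the rows per group under the default grouping (the interaction of the discrete variables in the plot) and under grouping by treatment alone.

```r
# Toy stand-in for gd (hypothetical values): one mean per treatment/value pair
gd_toy <- data.frame(
  treatment   = factor(rep(c("control", "stressed"), each = 3)),
  value       = factor(rep(c("initial", "26'", "28"), times = 2)),
  measurement = c(24, 32, 39, 36, 25, 33)
)

# Default grouping: interaction of the discrete variables in the plot,
# here value (x axis) and treatment (color)
default_group <- interaction(gd_toy$value, gd_toy$treatment)
print(table(default_group))   # every group holds a single row

# Grouping by treatment alone: each group holds several rows,
# so geom_line has points to connect
treatment_group <- table(gd_toy$treatment)
print(treatment_group)
```

With a single row per group there is nothing for geom_line to connect, which is why setting group = treatment on the mean layer is needed.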


Also note that

ggplot(meanplot1, aes(x=value, y=measurement, color=treatment)) + 
  geom_line(aes(group=sample), alpha=0.3) + 
  stat_summary(aes(group = treatment), fun.y = mean, geom = 'line', size=3, alpha=0.9) +
  theme_bw()

gives the same plot, without the breaks.
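
The breaks most likely come from the NA means in gd (the dput output shows NA for the values 26 and 32, where every sample is missing), whereas stat_summary drops missing values before summarising. If you prefer to keep a precomputed summary, a similar result can be sketched by removing the missing rows first. The data frame toy below is a small excerpt modelled on the rows shown in the question, and the names toy, gd_toy and p are assumptions for illustration only.

```r
library(dplyr)
library(ggplot2)

# Small excerpt modelled on the question's data; every sample is NA at "26"
toy <- data.frame(
  treatment   = rep(c("control", "stressed"), each = 6),
  sample      = rep(c("control 1", "control 2", "stress 1", "stress 2"), each = 3),
  value       = factor(rep(c("initial", "26", "26'"), times = 4),
                       levels = c("initial", "26", "26'")),
  measurement = c(20, NA, 28, 22, NA, 36, 37, NA, 17, 36, NA, 25)
)

# Drop missing measurements before averaging, so no NA mean is created
gd_toy <- toy %>%
  filter(!is.na(measurement)) %>%
  group_by(treatment, value) %>%
  summarise(measurement = mean(measurement)) %>%
  ungroup()

# Same layers as in the answer; size is kept to match the thread
# (recent ggplot2 versions prefer linewidth for lines)
p <- ggplot(toy, aes(x = value, y = measurement, color = treatment)) +
  geom_line(aes(group = sample), alpha = 0.3) +
  geom_line(aes(group = treatment), data = gd_toy, size = 3, alpha = 0.9) +
  theme_bw()
```

Printing p draws the plot. The mean lines then run straight from "initial" to "26'" because gd_toy has no row for "26".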