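I'm new to R and am trying to find a way to plot the lines for individual samples together with the group means using ggplot. I'm following this R-bloggers article (last section):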
https://www.r-bloggers.com/plotting-individual-observations-and-group-means-with-ggplot2/
Here is my code:
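library(dplyr)
library(ggplot2)

# Group means per treatment and time point
gd <- meanplot1 %>%
  group_by(treatment, value) %>%
  summarise(measurement = mean(measurement))

ggplot(meanplot1, aes(x = value, y = measurement, color = treatment)) +
  geom_line(aes(group = sample), alpha = 0.3) +  # one thin line per sample
  geom_line(data = gd, size = 3, alpha = 0.9) +  # thick lines for the group means
  theme_bw()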
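The lines for the individual samples are displayed, but the group means are not. I get the message "geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?" After adding group = 1, I get a strange mix of the category means, which is not what I want...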
I have scrolled through a lot of posts but could not find an answer. I'd be glad if someone could help me! :)
My data (meanplot1) is formatted like this:
    treatment  sample     value    measurement
1   control    control 1  initial  20
2   control    control 1  26       NA
3   control    control 1  26'      28
12  control    control 2  initial  22
13  control    control 2  26       NA
14  control    control 2  26'      36
15  control    control 2  28       45
67  stressed   stress 1   initial  37
68  stressed   stress 1   26       NA
69  stressed   stress 1   26'      17
78  stressed   stress 2   initial  36
79  stressed   stress 2   26       NA
80  stressed   stress 2   26'      25
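I expect to see 6 lines: one each for stress 1, stress 2, control 1 and control 2, one for the mean of all samples with treatment = control, and one for the mean of all samples with treatment = stressed.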
Output of dput(gd):
structure(list(treatment = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("control", "stressed"), class = "factor"), value = structure(c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L), .Label = c("26", "26'", "28", "28'",
"30", "30'", "32", "32'", "34", "34'", "initial"), class = "factor"),
measurement = c(NA, 32.3333333333333, 39.5, 30.3333333333333,
31.8333333333333, 31.8333333333333, NA, 36, 34.6666666666667,
36, 24.6666666666667, NA, 25.3333333333333, 33.3333333333333,
32, 50.1666666666667, 39.1666666666667, NA, 33.5, 24.3333333333333,
27.3333333333333, 36)), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -22L), vars = list(treatment), drop = TRUE, .Names = c("treatment",
"value", "measurement"))
Output of dput(meanplot1):
structure(list(treatment = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("control",
"stressed"), class = "factor"), sample = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L), .Label = c("control 1",
"control 2", "control 3", "control 4", "control 5", "control 6",
"stress 1", "stress 2", "stress 3", "stress 4", "stress 5", "stress 6"
), class = "factor"), value = structure(c(11L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), .Label = c("26", "26'",
"28", "28'", "30", "30'", "32", "32'", "34", "34'", "initial"
), class = "factor"), measurement = c(20L, NA, 28L, 18L, 17L,
19L, 34L, NA, 23L, 29L, 27L, 22L, NA, 36L, 45L, 31L, 40L, 44L,
NA, 49L, 40L, 39L, 32L, NA, 35L, 57L, 30L, 37L, 29L, NA, 44L,
37L, 46L, 20L, NA, 39L, 27L, 30L, 40L, 25L, NA, 29L, 50L, 30L,
26L, NA, 28L, 45L, 47L, 27L, 35L, NA, 24L, 22L, 35L, 28L, NA,
28L, 45L, 27L, 28L, 24L, NA, 47L, 30L, 39L, 37L, NA, 17L, 29L,
29L, 31L, 29L, NA, 37L, 21L, 27L, 36L, NA, 25L, 41L, 51L, 66L,
50L, NA, 33L, 25L, 22L, 36L, NA, 33L, 45L, 26L, 72L, 59L, NA,
33L, 26L, 25L, 33L, NA, 21L, 33L, 25L, 29L, 21L, NA, 26L, 20L,
16L, 22L, NA, 30L, 27L, 28L, 57L, 41L, NA, 28L, 23L, 17L, 52L,
NA, 26L, 25L, 33L, 46L, 35L, NA, 44L, 31L, 57L)), .Names = c("treatment",
"sample", "value", "measurement"), class = "data.frame", row.names = c(NA,
-132L))
Answer 0 (score: 0)
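I assume your goal is to plot the treatment means.
By default, because you are using a categorical x-axis, the grouping is set to the interaction of x and color, so every group in gd contains a single point and no line can be drawn. You only want to group by treatment, so we add the correct grouping to the call.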
ggplot(meanplot1, aes(x = value, y = measurement, color = treatment)) +
  geom_line(aes(group = sample), alpha = 0.3) +
  # group the means by treatment so each treatment gets one connected line
  geom_line(aes(group = treatment), data = gd, size = 3, alpha = 0.9) +
  theme_bw()
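To see the effect of the grouping directly, the group assignments can be inspected with layer_data(). The following is a minimal sketch on a made-up data frame (toy and its values are hypothetical, not the asker's data); it only counts the groups that ggplot2 forms with and without the explicit group aesthetic.

```r
library(ggplot2)

# Hypothetical toy data: two treatments measured at three time points
toy <- data.frame(
  treatment   = factor(rep(c("control", "stressed"), each = 3)),
  value       = factor(rep(c("initial", "26'", "28"), times = 2),
                       levels = c("initial", "26'", "28")),
  measurement = c(21, 32, 40, 36, 25, 33)
)

# Default grouping: interaction of the discrete variables (x and color)
p_default <- ggplot(toy, aes(x = value, y = measurement, color = treatment)) +
  geom_line()

# Explicit grouping by treatment only
p_grouped <- ggplot(toy, aes(x = value, y = measurement, color = treatment)) +
  geom_line(aes(group = treatment))

# Count the distinct groups that each plot builds for its line layer
n_default <- length(unique(layer_data(p_default)$group))
n_grouped <- length(unique(layer_data(p_grouped)$group))
```

With the default grouping every point ends up in its own group, which is why geom_line has nothing to connect; with group = treatment there is one group per treatment.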
Also note that
# ggplot2 >= 3.3.0 uses `fun`; older versions call this argument `fun.y`
ggplot(meanplot1, aes(x = value, y = measurement, color = treatment)) +
  geom_line(aes(group = sample), alpha = 0.3) +
  stat_summary(aes(group = treatment), fun = mean, geom = 'line', size = 3, alpha = 0.9) +
  theme_bw()
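gives the same plot, but without the breaks in the mean lines, because stat_summary drops the rows with missing measurements before computing the means.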
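If you would rather keep a precomputed summary table like gd, one possible way to avoid the breaks is to drop the missing measurements before averaging. This is a minimal sketch under the assumption that the breaks come from time points where every measurement is NA; toy is a hypothetical excerpt built from the first rows shown in the question, and .groups = "drop" assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical excerpt of the question's data (first three time points, four samples)
toy <- data.frame(
  treatment   = rep(c("control", "stressed"), each = 6),
  sample      = rep(c("control 1", "control 2", "stress 1", "stress 2"), each = 3),
  value       = rep(c("initial", "26", "26'"), times = 4),
  measurement = c(20, NA, 28, 22, NA, 36, 37, NA, 17, 36, NA, 25)
)

# Drop missing measurements first, then average per treatment and time point.
# Time points with no data at all disappear instead of producing an NA mean.
gd_toy <- toy %>%
  filter(!is.na(measurement)) %>%
  group_by(treatment, value) %>%
  summarise(measurement = mean(measurement), .groups = "drop")
```

The resulting table can be passed to geom_line(aes(group = treatment), data = gd_toy, ...) in the same way as gd above; the line then skips the time points that have no data rather than breaking there.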