Question

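I would like to make a chart of monthly data showing the mean, the standard deviation, and the maximum and minimum values. The x-axis is the month. I want to mark the means with symbols such as dots, squares, and crosses, and show the standard deviation as a vertical bar through each mean symbol. The maximum and minimum should be shown as a shaded area.

I would like to plot three different periods: 1961-1990, 1990-2010, and 1961-2010.

Is this possible?

Some data: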

Mês;Mean1;Std1;Min1;Max1;Mean2;Std2;Min2;Max2;Mean3;Std3;Min3;Max3
Jan;25.45;2.04;13.05;27.50;25.83;1.94;14.01;27.85;25.54;2.03;13.24;27.58
Feb;25.74;2.09;13.02;27.85;26.16;2.01;13.95;28.16;25.92;2.04;13.58;27.99
Mar;25.01;2.13;12.12;27.27;25.35;2.14;12.41;27.67;25.16;2.07;12.68;27.45
Apr;23.16;2.19;9.89;25.48;23.81;2.35;9.62;26.35;23.51;2.17;10.46;25.90
May;21.17;2.21;7.99;23.59;21.31;2.29;7.54;23.88;21.18;2.23;7.84;23.67
Jun;19.88;2.26;6.37;22.34;20.15;2.25;6.65;22.65;20.00;2.26;6.42;22.47
Jul;19.41;2.27;5.78;21.79;19.96;2.10;7.34;22.25;19.60;2.22;6.24;22.02
Aug;20.39;2.10;7.73;22.64;20.75;2.03;8.56;23.00;20.53;2.09;7.93;22.80
Sep;21.08;1.96;9.26;23.29;21.66;1.58;12.21;23.53;21.33;1.91;9.84;23.53
Oct;22.19;1.81;11.33;24.32;23.17;1.62;13.40;25.00;22.60;1.79;11.92;24.76
Nov;23.42;1.90;11.94;25.52;23.89;1.64;13.96;25.68;23.60;1.82;12.63;25.67
Dec;24.39;1.98;12.39;26.39;25.17;1.99;13.07;27.54;24.67;1.94;12.93;26.73

Answer 1

Short answer:

Yes, this is certainly possible.

Long answer:

Here is how you can do it.

Assume the data you posted is in a data.frame named df:

head(df)

  Mês Mean1 Std1  Min1  Max1 Mean2 Std2  Min2  Max2 Mean3 Std3  Min3  Max3
1 Jan 25.45 2.04 13.05 27.50 25.83 1.94 14.01 27.85 25.54 2.03 13.24 27.58
2 Feb 25.74 2.09 13.02 27.85 26.16 2.01 13.95 28.16 25.92 2.04 13.58 27.99
3 Mar 25.01 2.13 12.12 27.27 25.35 2.14 12.41 27.67 25.16 2.07 12.68 27.45
4 Apr 23.16 2.19  9.89 25.48 23.81 2.35  9.62 26.35 23.51 2.17 10.46 25.90
5 May 21.17 2.21  7.99 23.59 21.31 2.29  7.54 23.88 21.18 2.23  7.84 23.67
6 Jun 19.88 2.26  6.37 22.34 20.15 2.25  6.65 22.65 20.00 2.26  6.42 22.47
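In case it helps, here is a minimal sketch of one way to get from the semicolon-separated text in the question to such a `df`. The variable name `txt` is my own, and the conversion of the month column to a factor is an assumption on my part: since R 4.0.0 strings are no longer turned into factors by default, and a plain character column would make ggplot2 order the months alphabetically.

```r
# Sketch: read the semicolon-separated text from the question into `df`.
# `txt` is a hypothetical name; the values are copied from the question.
txt <- "Mês;Mean1;Std1;Min1;Max1;Mean2;Std2;Min2;Max2;Mean3;Std3;Min3;Max3
Jan;25.45;2.04;13.05;27.50;25.83;1.94;14.01;27.85;25.54;2.03;13.24;27.58
Feb;25.74;2.09;13.02;27.85;26.16;2.01;13.95;28.16;25.92;2.04;13.58;27.99
Mar;25.01;2.13;12.12;27.27;25.35;2.14;12.41;27.67;25.16;2.07;12.68;27.45
Apr;23.16;2.19;9.89;25.48;23.81;2.35;9.62;26.35;23.51;2.17;10.46;25.90
May;21.17;2.21;7.99;23.59;21.31;2.29;7.54;23.88;21.18;2.23;7.84;23.67
Jun;19.88;2.26;6.37;22.34;20.15;2.25;6.65;22.65;20.00;2.26;6.42;22.47
Jul;19.41;2.27;5.78;21.79;19.96;2.10;7.34;22.25;19.60;2.22;6.24;22.02
Aug;20.39;2.10;7.73;22.64;20.75;2.03;8.56;23.00;20.53;2.09;7.93;22.80
Sep;21.08;1.96;9.26;23.29;21.66;1.58;12.21;23.53;21.33;1.91;9.84;23.53
Oct;22.19;1.81;11.33;24.32;23.17;1.62;13.40;25.00;22.60;1.79;11.92;24.76
Nov;23.42;1.90;11.94;25.52;23.89;1.64;13.96;25.68;23.60;1.82;12.63;25.67
Dec;24.39;1.98;12.39;26.39;25.17;1.99;13.07;27.54;24.67;1.94;12.93;26.73"

# The separator is ';' and the decimal mark is '.'.
# check.names = FALSE keeps the column names exactly as written.
df <- read.csv(text = txt, sep = ";", dec = ".",
               check.names = FALSE, stringsAsFactors = FALSE)

# Store the months (first column) as a factor in calendar order,
# so later steps do not sort them alphabetically.
df[[1]] <- factor(df[[1]], levels = df[[1]])

head(df)
```

If your data lives in a file, the same call with `file = ` instead of `text = ` should work; the file name would be yours to fill in.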

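You first want to convert it from wide to long format, which means that every observation gets its own row. Someone with more tidyverse experience could probably do this more elegantly, but this is how I did it: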

# First we melt the dataframe
df2 <- reshape2::melt(df, id.vars = "Mês")

# Then we get a grouping variable from the column "variable"
df2$variable <- as.character(df2$variable)
df2$group <- substr(df2$variable, nchar(df2$variable), nchar(df2$variable))

# And we remove the trailing number from the variable
df2$variable <- substr(df2$variable, 1, nchar(df2$variable) - 1)

This is what the data looks like at this point:

head(df2)

  Mês variable value group
1 Jan     Mean 25.45     1
2 Feb     Mean 25.74     1
3 Mar     Mean 25.01     1
4 Apr     Mean 23.16     1
5 May     Mean 21.17     1
6 Jun     Mean 19.88     1
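One caveat about the `substr()` calls above: they take exactly the last character as the group, which is fine for the suffixes 1 to 3 here but would break with a two-digit suffix. A possible alternative, sketched on a made-up vector of column names (`"Mean10"` is hypothetical and not in your data), is to split on the trailing digits with `sub()`:

```r
# Sketch: split names like "Mean1" into statistic and group with regular
# expressions, so multi-digit suffixes are handled too.
# `vars` is an illustrative vector; "Mean10" does not occur in the question.
vars <- c("Mean1", "Std1", "Min2", "Max3", "Mean10")

# Drop the trailing digits to get the statistic name
stat <- sub("[0-9]+$", "", vars)

# Drop the leading non-digits to get the group suffix
group <- sub("^[^0-9]+", "", vars)

data.frame(vars, stat, group)
```

In the melt step you would apply the same two `sub()` calls to `df2$variable`.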

We still need the mean, standard deviation, minimum, and maximum on the same row, so we will un-melt (cast) the data per group:

# First we split by group
df2 <- split(df2, df2$group)

# Then, we loop over the data and cast the data
df2 <- lapply(seq(df2), function(i){
  dat <- df2[[i]]
  cbind(reshape2::dcast(dat, Mês ~ variable), group = i)
})

# And finally combine the data.frame back together
df2 <- do.call(rbind, df2)

The data should now look like this:

head(df2)

  Mês   Max  Mean   Min  Std group
1 Jan 27.50 25.45 13.05 2.04     1
2 Feb 27.85 25.74 13.02 2.09     1
3 Mar 27.27 25.01 12.12 2.13     1
4 Apr 25.48 23.16  9.89 2.19     1
5 May 23.59 21.17  7.99 2.21     1
6 Jun 22.34 19.88  6.37 2.26     1
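For what it is worth, the tidyverse route mentioned earlier might look roughly like the sketch below, which goes from the wide layout to this final layout in one step with `tidyr::pivot_longer()`. It assumes tidyr 1.0.0 or later. To keep it short and ASCII-only I use a two-month, two-period excerpt of your data and call the month column `Month` instead of `Mês`; the names `wide` and `long` are also mine.

```r
library(tidyr)

# Excerpt of the question's data (two months, periods 1 and 2 only).
# `Month` stands in for the original `Mês` column.
wide <- data.frame(
  Month = factor(c("Jan", "Feb"), levels = c("Jan", "Feb")),
  Mean1 = c(25.45, 25.74), Std1 = c(2.04, 2.09),
  Min1  = c(13.05, 13.02), Max1 = c(27.50, 27.85),
  Mean2 = c(25.83, 26.16), Std2 = c(1.94, 2.01),
  Min2  = c(14.01, 13.95), Max2 = c(27.85, 28.16)
)

# ".value" tells pivot_longer that the first captured part of each column
# name (Mean, Std, Min, Max) becomes an output column, while the second
# captured part (the trailing digits) goes into a new `group` column.
long <- pivot_longer(
  wide,
  cols = -Month,
  names_to = c(".value", "group"),
  names_pattern = "([A-Za-z]+)([0-9]+)"
)

# Optional: convert the tibble back to a plain data.frame
long <- as.data.frame(long)
long
```

The result has one row per month and group with the columns `Mean`, `Std`, `Min`, and `Max`, which is the same shape as the `df2` shown above, except that `group` is a character column here.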

Data in this format is the easiest to plot. We will proceed as follows:

library(ggplot2)

# First we define all shared aesthetics in the main 'ggplot'-call:
ggplot(df2, aes(x = Mês, 
                group = as.factor(group), 
                colour = as.factor(group))) +
  # Then as lowest layer, we want that area spanning 'Min' to 'Max'
  geom_ribbon(aes(ymin = Min, 
                  ymax = Max, 
                  fill = as.factor(group)), alpha = 0.1) +
  # Then we want our means displayed as points
  geom_point(aes(y = Mean, shape = as.factor(group))) +
  # The standard deviation as line segments with flat end caps (90-degree arrowheads)
  geom_segment(aes(xend = Mês, 
                   y = Mean - Std, 
                   yend = Mean + Std),
               arrow = arrow(angle = 90, ends = "both", length = unit(2, "mm"))) +
  # Finally we tell that our point shapes should be dots, squares and crosses
  scale_shape_manual(values = c(16, 15, 4))

For me, this produced the following plot:

[plot image]
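As a possible variation on the plotting code above, the sketch below swaps `geom_segment()` for `geom_errorbar()`, nudges the groups apart with `position_dodge()` so the bars do not sit on top of each other, and replaces the group numbers with period labels in the legend. These are my own substitutions, not a requirement. In particular, the mapping of suffix 1 to 1961-1990 and suffix 2 to 1990-2010 is an assumption based on the order in which you listed the periods, and `Month` again stands in for `Mês` on a small excerpt of the data.

```r
library(ggplot2)

# Excerpt of the reshaped data: three months, periods 1 and 2.
df2 <- data.frame(
  Month = factor(rep(c("Jan", "Feb", "Mar"), times = 2),
                 levels = c("Jan", "Feb", "Mar")),
  Mean  = c(25.45, 25.74, 25.01, 25.83, 26.16, 25.35),
  Std   = c(2.04, 2.09, 2.13, 1.94, 2.01, 2.14),
  Min   = c(13.05, 13.02, 12.12, 14.01, 13.95, 12.41),
  Max   = c(27.50, 27.85, 27.27, 27.85, 28.16, 27.67),
  group = rep(1:2, each = 3)
)

# ASSUMPTION: suffix 1 = 1961-1990, suffix 2 = 1990-2010.
df2$period <- factor(df2$group, levels = 1:2,
                     labels = c("1961-1990", "1990-2010"))

# Shift points and error bars of the two periods slightly apart
dodge <- position_dodge(width = 0.3)

p <- ggplot(df2, aes(x = Month, group = period, colour = period)) +
  # Area spanning Min to Max
  geom_ribbon(aes(ymin = Min, ymax = Max, fill = period), alpha = 0.1) +
  # Mean plus/minus one standard deviation as error bars
  geom_errorbar(aes(ymin = Mean - Std, ymax = Mean + Std),
                width = 0.2, position = dodge) +
  # Means as points, one shape per period
  geom_point(aes(y = Mean, shape = period), position = dodge) +
  scale_shape_manual(values = c(16, 15)) +
  # Same legend title everywhere so the legends merge into one
  labs(colour = "Period", fill = "Period", shape = "Period")

p
```

With all three periods you would add a third label and a third shape value, for example `4` for the cross.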

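Now, as a final tip: if you want more people to help you, or want to get help faster, the easiest way is to provide the data in a form that they can copy and paste directly into R: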

dput(df)

structure(list(Mês = structure(1:12, .Label = c("Jan", "Feb", 
"Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", 
"Dec"), class = "factor"), Mean1 = c(25.45, 25.74, 25.01, 23.16, 
21.17, 19.88, 19.41, 20.39, 21.08, 22.19, 23.42, 24.39), Std1 = c(2.04, 
2.09, 2.13, 2.19, 2.21, 2.26, 2.27, 2.1, 1.96, 1.81, 1.9, 1.98
), Min1 = c(13.05, 13.02, 12.12, 9.89, 7.99, 6.37, 5.78, 7.73, 
9.26, 11.33, 11.94, 12.39), Max1 = c(27.5, 27.85, 27.27, 25.48, 
23.59, 22.34, 21.79, 22.64, 23.29, 24.32, 25.52, 26.39), Mean2 = c(25.83, 
26.16, 25.35, 23.81, 21.31, 20.15, 19.96, 20.75, 21.66, 23.17, 
23.89, 25.17), Std2 = c(1.94, 2.01, 2.14, 2.35, 2.29, 2.25, 2.1, 
2.03, 1.58, 1.62, 1.64, 1.99), Min2 = c(14.01, 13.95, 12.41, 
9.62, 7.54, 6.65, 7.34, 8.56, 12.21, 13.4, 13.96, 13.07), Max2 = c(27.85, 
28.16, 27.67, 26.35, 23.88, 22.65, 22.25, 23, 23.53, 25, 25.68, 
27.54), Mean3 = c(25.54, 25.92, 25.16, 23.51, 21.18, 20, 19.6, 
20.53, 21.33, 22.6, 23.6, 24.67), Std3 = c(2.03, 2.04, 2.07, 
2.17, 2.23, 2.26, 2.22, 2.09, 1.91, 1.79, 1.82, 1.94), Min3 = c(13.24, 
13.58, 12.68, 10.46, 7.84, 6.42, 6.24, 7.93, 9.84, 11.92, 12.63, 
12.93), Max3 = c(27.58, 27.99, 27.45, 25.9, 23.67, 22.47, 22.02, 
22.8, 23.53, 24.76, 25.67, 26.73)), row.names = c(NA, -12L), class = "data.frame")

How can I make a chart that shows the mean, standard deviation, maximum, and minimum?
