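I'd like to make a chart of monthly data: mean, standard deviation, and maximum and minimum. The x-axis is the month. I'd like to show the means with, for example, dots, squares and crosses, and the standard deviations as vertical bars through the mean symbols. The maximum and minimum should be shown as a shaded area.
I want to plot three different periods: 1961-1990, 1990-2010 and 1961-2010.
Is this possible?
Some data: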
Mês;Mean1;Std1;Min1;Max1;Mean2;Std2;Min2;Max2;Mean3;Std3;Min3;Max3
Jan;25.45;2.04;13.05;27.50;25.83;1.94;14.01;27.85;25.54;2.03;13.24;27.58
Feb;25.74;2.09;13.02;27.85;26.16;2.01;13.95;28.16;25.92;2.04;13.58;27.99
Mar;25.01;2.13;12.12;27.27;25.35;2.14;12.41;27.67;25.16;2.07;12.68;27.45
Apr;23.16;2.19;9.89;25.48;23.81;2.35;9.62;26.35;23.51;2.17;10.46;25.90
May;21.17;2.21;7.99;23.59;21.31;2.29;7.54;23.88;21.18;2.23;7.84;23.67
Jun;19.88;2.26;6.37;22.34;20.15;2.25;6.65;22.65;20.00;2.26;6.42;22.47
Jul;19.41;2.27;5.78;21.79;19.96;2.10;7.34;22.25;19.60;2.22;6.24;22.02
Aug;20.39;2.10;7.73;22.64;20.75;2.03;8.56;23.00;20.53;2.09;7.93;22.80
Sep;21.08;1.96;9.26;23.29;21.66;1.58;12.21;23.53;21.33;1.91;9.84;23.53
Oct;22.19;1.81;11.33;24.32;23.17;1.62;13.40;25.00;22.60;1.79;11.92;24.76
Nov;23.42;1.90;11.94;25.52;23.89;1.64;13.96;25.68;23.60;1.82;12.63;25.67
Dec;24.39;1.98;12.39;26.39;25.17;1.99;13.07;27.54;24.67;1.94;12.93;26.73
Answer 0 (score: 1)
Short answer:
Yes, it is possible.
Long answer:
Here is how you could do it:
Let's assume the data you posted is in a data.frame called df:
head(df)
Mês Mean1 Std1 Min1 Max1 Mean2 Std2 Min2 Max2 Mean3 Std3 Min3 Max3
1 Jan 25.45 2.04 13.05 27.50 25.83 1.94 14.01 27.85 25.54 2.03 13.24 27.58
2 Feb 25.74 2.09 13.02 27.85 26.16 2.01 13.95 28.16 25.92 2.04 13.58 27.99
3 Mar 25.01 2.13 12.12 27.27 25.35 2.14 12.41 27.67 25.16 2.07 12.68 27.45
4 Apr 23.16 2.19 9.89 25.48 23.81 2.35 9.62 26.35 23.51 2.17 10.46 25.90
5 May 21.17 2.21 7.99 23.59 21.31 2.29 7.54 23.88 21.18 2.23 7.84 23.67
6 Jun 19.88 2.26 6.37 22.34 20.15 2.25 6.65 22.65 20.00 2.26 6.42 22.47
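If you only have the semicolon-separated text from the question, one way to get it into df is sketched below. This is a minimal sketch and not necessarily how the data was originally loaded; the name txt is just a placeholder, and the first column is addressed by position to sidestep encoding issues with the "ê" in its name.

```r
# Sketch: read the semicolon-separated text from the question into `df`.
txt <- "Mês;Mean1;Std1;Min1;Max1;Mean2;Std2;Min2;Max2;Mean3;Std3;Min3;Max3
Jan;25.45;2.04;13.05;27.50;25.83;1.94;14.01;27.85;25.54;2.03;13.24;27.58
Feb;25.74;2.09;13.02;27.85;26.16;2.01;13.95;28.16;25.92;2.04;13.58;27.99
Mar;25.01;2.13;12.12;27.27;25.35;2.14;12.41;27.67;25.16;2.07;12.68;27.45
Apr;23.16;2.19;9.89;25.48;23.81;2.35;9.62;26.35;23.51;2.17;10.46;25.90
May;21.17;2.21;7.99;23.59;21.31;2.29;7.54;23.88;21.18;2.23;7.84;23.67
Jun;19.88;2.26;6.37;22.34;20.15;2.25;6.65;22.65;20.00;2.26;6.42;22.47
Jul;19.41;2.27;5.78;21.79;19.96;2.10;7.34;22.25;19.60;2.22;6.24;22.02
Aug;20.39;2.10;7.73;22.64;20.75;2.03;8.56;23.00;20.53;2.09;7.93;22.80
Sep;21.08;1.96;9.26;23.29;21.66;1.58;12.21;23.53;21.33;1.91;9.84;23.53
Oct;22.19;1.81;11.33;24.32;23.17;1.62;13.40;25.00;22.60;1.79;11.92;24.76
Nov;23.42;1.90;11.94;25.52;23.89;1.64;13.96;25.68;23.60;1.82;12.63;25.67
Dec;24.39;1.98;12.39;26.39;25.17;1.99;13.07;27.54;24.67;1.94;12.93;26.73"

df <- read.table(text = txt, sep = ";", header = TRUE,
                 check.names = FALSE, stringsAsFactors = FALSE)

# Make the month column a factor in calendar order, so that plots and
# reshaping do not fall back to alphabetical order (Apr, Aug, Dec, ...)
df[[1]] <- factor(df[[1]], levels = df[[1]])

str(df)
```

Turning the months into a factor matters later on: a discrete x-axis follows the factor levels, so without this step the months would be sorted alphabetically.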
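First you'll want to convert this from wide format to long format, meaning that each observation gets its own row. Someone with more tidyverse experience could probably do this more elegantly, but this is how I'd do it: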
# First we melt the dataframe
df2 <- reshape2::melt(df, id.vars = "Mês")
# Then we get a grouping variable from the column "variable"
df2$variable <- as.character(df2$variable)
df2$group <- substr(df2$variable, nchar(df2$variable), nchar(df2$variable))
# And we remove the trailing number from the variable
df2$variable <- substr(df2$variable, 1, nchar(df2$variable) - 1)
This is what the data looks like at this point:
head(df2)
Mês variable value group
1 Jan Mean 25.45 1
2 Feb Mean 25.74 1
3 Mar Mean 25.01 1
4 Apr Mean 23.16 1
5 May Mean 21.17 1
6 Jun Mean 19.88 1
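We still need the mean, standard deviation, minimum and maximum on the same row, so we'll un-melt (cast) the data per group:
# First we split by group
df2 <- split(df2, df2$group)
# Then, we loop over the groups and cast each one back to wide format
df2 <- lapply(seq_along(df2), function(i) {
  dat <- df2[[i]]
  # value.var is given explicitly so dcast does not have to guess it
  cbind(reshape2::dcast(dat, Mês ~ variable, value.var = "value"), group = i)
})
# And finally combine the data.frames back together
df2 <- do.call(rbind, df2)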
Now the data should look like this:
head(df2)
Mês Max Mean Min Std group
1 Jan 27.50 25.45 13.05 2.04 1
2 Feb 27.85 25.74 13.02 2.09 1
3 Mar 27.27 25.01 12.12 2.13 1
4 Apr 25.48 23.16 9.89 2.19 1
5 May 23.59 21.17 7.99 2.21 1
6 Jun 22.34 19.88 6.37 2.26 1
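For readers who prefer the tidyverse, the same reshaping can probably be done in one step with tidyr::pivot_longer(). The following is only a sketch on a two-month, two-period subset of the question's data; the names df_small, df_long and the column name Month (standing in for Mês) are placeholders for illustration, and it assumes tidyr 1.0.0 or later.

```r
library(tidyr)

# Two months and two periods of the question's data, to keep the sketch short
df_small <- data.frame(
  Month = c("Jan", "Feb"),
  Mean1 = c(25.45, 25.74), Std1 = c(2.04, 2.09),
  Min1  = c(13.05, 13.02), Max1 = c(27.50, 27.85),
  Mean2 = c(25.83, 26.16), Std2 = c(1.94, 2.01),
  Min2  = c(14.01, 13.95), Max2 = c(27.85, 28.16)
)

# Split each column name into a statistic ("Mean", "Std", ...) and a group
# number. The special name ".value" tells pivot_longer() to keep one output
# column per statistic instead of stacking them all into a single column.
df_long <- pivot_longer(
  df_small,
  cols          = -Month,
  names_to      = c(".value", "group"),
  names_pattern = "([A-Za-z]+)([0-9]+)"
)

df_long
```

The result has one row per month and group with the columns Mean, Std, Min and Max side by side, which is the same layout as the melt/split/cast route produces. Note that group comes out as a character column here.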
Data in this format is the easiest to plot. We'll proceed as follows:
library(ggplot2)

# First we define all shared aesthetics in the main 'ggplot'-call:
ggplot(df2, aes(x = Mês,
                group = as.factor(group),
                colour = as.factor(group))) +
  # Then as lowest layer, we want that area spanning 'Min' to 'Max'
  geom_ribbon(aes(ymin = Min,
                  ymax = Max,
                  fill = as.factor(group)), alpha = 0.1) +
  # Then we want our means displayed as points
  geom_point(aes(y = Mean, shape = as.factor(group))) +
  # The standard deviation as line segments with flat (90 degree) arrowheads,
  # which makes them look like error bars
  geom_segment(aes(xend = Mês,
                   y = Mean - Std,
                   yend = Mean + Std),
               arrow = arrow(angle = 90, ends = "both", length = unit(2, "mm"))) +
  # Finally we tell that our point shapes should be dots, squares and crosses
  scale_shape_manual(values = c(16, 15, 4))
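If you want the legend to show the periods instead of 1, 2 and 3, one option is to turn the group number into a labelled factor first. The sketch below assumes that the suffixes 1, 2 and 3 correspond to 1961-1990, 1990-2010 and 1961-2010 in that order, which is a guess based on the order in the question; the names d, period_labels and Month are placeholders, and only two months are included to keep it short.

```r
library(ggplot2)

# ASSUMPTION: suffix 1, 2, 3 map to these periods in this order
period_labels <- c("1961-1990", "1990-2010", "1961-2010")

# A few rows in the one-row-per-month-and-group layout (values from the question)
d <- data.frame(
  Month = factor(rep(c("Jan", "Feb"), times = 3), levels = c("Jan", "Feb")),
  Mean  = c(25.45, 25.74, 25.83, 26.16, 25.54, 25.92),
  Std   = c(2.04, 2.09, 1.94, 2.01, 2.03, 2.04),
  group = rep(1:3, each = 2)
)

# A labelled factor gives readable legend entries
d$period <- factor(d$group, levels = 1:3, labels = period_labels)

p <- ggplot(d, aes(x = Month, y = Mean, colour = period, shape = period)) +
  geom_point() +
  geom_linerange(aes(ymin = Mean - Std, ymax = Mean + Std)) +
  scale_shape_manual(values = c(16, 15, 4)) +
  # Giving both scales the same title lets ggplot2 merge them into one legend
  labs(colour = "Period", shape = "Period")

p
```

Mapping the same factor to colour and shape, with the same legend title, is what keeps everything in a single legend rather than one legend per aesthetic.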
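On my end, this produced the following plot:
Now, as a final tip: if you'd like more people to help you, or to get help faster, the easiest way is to give them data they can copy and paste straight into R: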
dput(df)
structure(list(Mês = structure(1:12, .Label = c("Jan", "Feb",
"Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov",
"Dec"), class = "factor"), Mean1 = c(25.45, 25.74, 25.01, 23.16,
21.17, 19.88, 19.41, 20.39, 21.08, 22.19, 23.42, 24.39), Std1 = c(2.04,
2.09, 2.13, 2.19, 2.21, 2.26, 2.27, 2.1, 1.96, 1.81, 1.9, 1.98
), Min1 = c(13.05, 13.02, 12.12, 9.89, 7.99, 6.37, 5.78, 7.73,
9.26, 11.33, 11.94, 12.39), Max1 = c(27.5, 27.85, 27.27, 25.48,
23.59, 22.34, 21.79, 22.64, 23.29, 24.32, 25.52, 26.39), Mean2 = c(25.83,
26.16, 25.35, 23.81, 21.31, 20.15, 19.96, 20.75, 21.66, 23.17,
23.89, 25.17), Std2 = c(1.94, 2.01, 2.14, 2.35, 2.29, 2.25, 2.1,
2.03, 1.58, 1.62, 1.64, 1.99), Min2 = c(14.01, 13.95, 12.41,
9.62, 7.54, 6.65, 7.34, 8.56, 12.21, 13.4, 13.96, 13.07), Max2 = c(27.85,
28.16, 27.67, 26.35, 23.88, 22.65, 22.25, 23, 23.53, 25, 25.68,
27.54), Mean3 = c(25.54, 25.92, 25.16, 23.51, 21.18, 20, 19.6,
20.53, 21.33, 22.6, 23.6, 24.67), Std3 = c(2.03, 2.04, 2.07,
2.17, 2.23, 2.26, 2.22, 2.09, 1.91, 1.79, 1.82, 1.94), Min3 = c(13.24,
13.58, 12.68, 10.46, 7.84, 6.42, 6.24, 7.93, 9.84, 11.92, 12.63,
12.93), Max3 = c(27.58, 27.99, 27.45, 25.9, 23.67, 22.47, 22.02,
22.8, 23.53, 24.76, 25.67, 26.73)), row.names = c(NA, -12L), class = "data.frame")