How to apply stat_sum to specific groupings

Time: 2013-03-19 14:56:33

Tags: r ggplot2

I have the following code, which produces the plot below.

colvec <-c("white", "white","gray85", "gray85", "gray58", "gray58", "gray33", "gray33","black", "black") 

ggplot(nut, aes(Date, Nitrate, group=Wetland, shape=Hydrology)) +
  geom_point(aes(fill=Wetland), colour="black", size=4)+
  scale_fill_manual(values=colvec) +
  scale_shape_manual(values=c(21,22))+
  facet_grid(. ~ Hydrology) +
  ylab ("Nitrate (mg/L) ") +
  theme(legend.position="none",
        panel.background = element_rect(fill='white', colour='white'), 
        panel.grid = element_line(color = NA),
        panel.grid.minor = element_line(color = NA),
        panel.border = element_rect(fill = NA, color = "black"),
        axis.text.x  = element_text(size=10, colour="black"),  
        axis.title.x = element_text(vjust=0.1),
        axis.text.y = element_text(size=12, colour="black"),
        axis.title.y = element_text(vjust=0.3))

[Image: plot produced by the code above]

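I can't figure out how to use stat_sum so that the 5 points for each date are averaged together. ggplot treats each differently shaded point (corresponding to a wetland) as its own mean. I want to keep the grouping by Wetland and show the wetlands in different shades, but also show the mean of all y values for each date.

Data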

     Date  Wetland Hydrology  Nitrate 
1  17-Jun     One    Pulsed 0.2647287              
2  18-Jul     One    Pulsed 0.1807388             
3   1-Aug     One    Pulsed 0.9895910      
4  15-Aug     One    Pulsed 0.6566667       
5   7-Nov     One    Pulsed 0.2150000      
6  17-Jun     Two    Static 0.2134027      
7  18-Jul     Two    Static 0.1971669      
8   1-Aug     Two    Static 0.4774424       
9  15-Aug     Two    Static 0.3110000      
10  7-Nov     Two    Static 0.3333333       
11 17-Jun   Three    Pulsed 0.3369253       
12 18-Jul   Three    Pulsed 0.2056284       
13  1-Aug   Three    Pulsed 0.6731924       
14 15-Aug   Three    Pulsed 0.5516667       
15  7-Nov   Three    Pulsed 0.1853333      
16 17-Jun    Four    Static 0.3293668      
17 18-Jul    Four    Static 0.4664748       
18  1-Aug    Four    Static 0.4555003       
19 15-Aug    Four    Static 0.3993333       
20  7-Nov    Four    Static 0.1133333       
21 17-Jun    Five    Static 0.3497963     
22 18-Jul    Five    Static 0.3618659      
23  1-Aug    Five    Static 0.3721719     
24 15-Aug    Five    Static 0.2916667      
25  7-Nov    Five    Static 0.2526667      
26 17-Jun     Six    Pulsed 0.2779667       
27 18-Jul     Six    Pulsed 0.7609531      
28  1-Aug     Six    Pulsed 0.7177083       
29 15-Aug     Six    Pulsed 0.6610000       
30  7-Nov     Six    Pulsed 0.2083333       
31 17-Jun   Seven    Pulsed 0.2232040      
32 18-Jul   Seven    Pulsed 0.3655621       
33  1-Aug   Seven    Pulsed 0.7006131       
34 15-Aug   Seven    Pulsed 0.4753333      
35  7-Nov   Seven    Pulsed 0.3206667    
36 17-Jun   Eight    Static 0.3339319       
37 18-Jul   Eight    Static 0.3286641       
38  1-Aug   Eight    Static 0.4390918      
39 15-Aug   Eight    Static 0.3276667       
40  7-Nov   Eight    Static 0.2446667       
41 17-Jun    Nine    Static 0.3456627      
42 18-Jul    Nine    Static 0.2519814      
43  1-Aug    Nine    Static 0.3807550      
44 15-Aug    Nine    Static 0.3873333      
45  7-Nov    Nine    Static 0.1663333       
46 17-Jun     Ten    Pulsed 0.4135023      
47 18-Jul     Ten    Pulsed 0.1921382       
48  1-Aug     Ten    Pulsed 0.3898374       
49 15-Aug     Ten    Pulsed 0.2700000       
50  7-Nov     Ten    Pulsed 0.1216667       

dput(nut)

structure(list(Date = structure(c(3L, 4L, 1L, 2L, 5L, 3L, 4L, 
1L, 2L, 5L, 3L, 4L, 1L, 2L, 5L, 3L, 4L, 1L, 2L, 5L, 3L, 4L, 1L, 
2L, 5L, 3L, 4L, 1L, 2L, 5L, 3L, 4L, 1L, 2L, 5L, 3L, 4L, 1L, 2L, 
5L, 3L, 4L, 1L, 2L, 5L, 3L, 4L, 1L, 2L, 5L), .Label = c("1-Aug", 
"15-Aug", "17-Jun", "18-Jul", "7-Nov"), class = "factor"), Wetland = structure(c(5L, 
5L, 5L, 5L, 5L, 10L, 10L, 10L, 10L, 10L, 9L, 9L, 9L, 9L, 9L, 
3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 7L, 7L, 7L, 7L, 7L, 6L, 
6L, 6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 8L, 8L, 
8L, 8L, 8L), .Label = c("Eight", "Five", "Four", "Nine", "One", 
"Seven", "Six", "Ten", "Three", "Two"), class = "factor"), Hydrology = structure(c(1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L), .Label = c("Pulsed", "Static"), class = "factor"), Nitrate = c(0.264728748, 
0.180738787, 0.989591021, 0.656666667, 0.215, 0.213402705, 0.197166881, 
0.477442378, 0.311, 0.333333333, 0.33692531, 0.205628403, 0.67319236, 
0.551666667, 0.185333333, 0.329366831, 0.466474791, 0.455500298, 
0.399333333, 0.113333333, 0.349796312, 0.361865927, 0.372171941, 
0.291666667, 0.252666667, 0.277966745, 0.760953065, 0.717708344, 
0.661, 0.208333333, 0.223203974, 0.365562124, 0.700613059, 0.475333333, 
0.320666667, 0.333931889, 0.328664129, 0.439091764, 0.327666667, 
0.244666667, 0.345662714, 0.251981433, 0.380755049, 0.387333333, 
0.166333333, 0.413502261, 0.192138209, 0.389837374, 0.27, 0.121666667
)), .Names = c("Date", "Wetland", "Hydrology", "Nitrate"), class = "data.frame", row.names = c(NA, 
-50L))

1 Answer:

Answer 0 (score: 4)

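It's hard to answer without your data (because I usually test things first!), but I think you could aggregate your Nitrate data by Date and Hydrology to calculate the means, and then plot those means with an additional geom_point that uses a different data.frame.

Does this work...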

nutm <- aggregate( Nitrate ~ Date + Hydrology , data = nut , FUN = mean )

p <-ggplot(nut, aes(Date, Nitrate, shape = Hydrology)) +
  geom_point( data = nut , aes( fill = Wetland , group=Wetland ), colour="black", size=4)+
  scale_fill_manual(values=colvec) +
  scale_shape_manual(values=c(21,22))+
  facet_grid(. ~ Hydrology) +
  ylab ("Nitrate (mg/L) ") +
  geom_point( data = nutm , aes( x = Date , y = Nitrate) , color = "red" , fill = "red" , size = 4 ) +
  theme(legend.position="none",
        panel.background = element_rect(fill='white', colour='white'), 
        panel.grid = element_line(color = NA),
        panel.grid.minor = element_line(color = NA),
        panel.border = element_rect(fill = NA, color = "black"),
        axis.text.x  = element_text(size=10, colour="black"),  
        axis.title.x = element_text(vjust=0.1),
        axis.text.y = element_text(size=12, colour="black"),
        axis.title.y = element_text(vjust=0.3))

p

This gives me a plot with a red mean point for each date.
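
As a possible alternative, not part of the original answer: stat_sum counts the observations at each location rather than averaging them, so the per-date mean can instead be computed inside ggplot with stat_summary. The sketch below assumes ggplot2 3.3.0 or later (older versions use the `fun.y` argument instead of `fun`) and runs on a small subset of rows copied from the question's data; the names `nut_small`, `nutm_small` and `p_summary` are made up for this illustration. Setting `group = 1` in the summary layer overrides the per-wetland grouping, so each mean is taken over all wetlands at a given Date within a facet panel.

```r
library(ggplot2)

# Eight rows copied from the question's data: two wetlands per
# hydrology type, two dates each (hypothetical subset for illustration)
nut_small <- data.frame(
  Date      = factor(c("17-Jun", "18-Jul", "17-Jun", "18-Jul",
                       "17-Jun", "18-Jul", "17-Jun", "18-Jul")),
  Wetland   = factor(c("One", "One", "Three", "Three",
                       "Two", "Two", "Four", "Four")),
  Hydrology = factor(c("Pulsed", "Pulsed", "Pulsed", "Pulsed",
                       "Static", "Static", "Static", "Static")),
  Nitrate   = c(0.2647287, 0.1807388, 0.3369253, 0.2056284,
                0.2134027, 0.1971669, 0.3293668, 0.4664748)
)

p_summary <- ggplot(nut_small,
                    aes(Date, Nitrate, group = Wetland, shape = Hydrology)) +
  # Raw points, shaded by wetland as in the question
  geom_point(aes(fill = Wetland), colour = "black", size = 4) +
  scale_shape_manual(values = c(21, 22)) +
  # group = 1 replaces group = Wetland for this layer only, so the mean
  # is computed per Date within each facet panel
  stat_summary(aes(group = 1), fun = mean, geom = "point",
               colour = "red", fill = "red", size = 4) +
  facet_grid(. ~ Hydrology)

# The same means computed explicitly, as in the answer above
nutm_small <- aggregate(Nitrate ~ Date + Hydrology, data = nut_small, FUN = mean)
print(nutm_small)

p_summary
```

The aggregate approach keeps the means in a data frame that can be inspected or reused, while stat_summary avoids the extra object; which is preferable depends on whether the means are needed outside the plot.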