Using geom_bar with a data frame that has NA values in a factor variable

Time: 2018-04-01 19:53:13

Tags: r dataframe ggplot2 geom-bar

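I am trying to plot date-wise multivariate data together with one independent variable on the top axis. To do this, I merged the multivariate (response variable) data frame with the single input (independent variable) into one data frame. The resulting data frame now has NA values in several rows and columns (for both datasets).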

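My questions:

  • Why do I lose the bar width/dodge with the current code?
  • Is this related to the NA values in the factor variable in my data?
  • How can I work with NA values in a factor variable? Half of my dataset is a completely different variable that needs only one column. The only reason I merged the two is that I want all the data on the same plot (the plan is to use grobs after this, but I am stuck here).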

Before this, I used the same code to plot geom_bar from the data frame that contained only the response variables, and it worked.

Previously plotted geom_bar (this is how I expect it to look)

Geom_bar plot with the merged dataframe and same code

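The data frame is named final and the factor variable is TYPE, with Open, Shrub and Lowland as its categories. NA marks dates that have only the independent variable (Rain in this case).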

    library(ggplot2)
    library(scales)  # provides date_format()

    final$TYPE <- factor(final$TYPE, levels = c("Open", "Shrub", "Lowland"))

    # Error-bar limits; column names are looked up in the plot data
    limits <- aes(ymax = Max, ymin = Min)
    rhg_cols <- c("brown", "forestgreen", "cyan4")

    p <- ggplot(final, aes(Date, MeanTWC, fill = TYPE), na.rm = FALSE) +
      geom_bar(stat = "identity", position = "dodge") +
      scale_fill_manual(values = rhg_cols) +
      scale_x_date(breaks = seq(as.Date("2016-08-15"), as.Date("2017-10-15"), by = "30 days"),
                   labels = date_format("%b-%Y"))

    p <- p + labs(x = "DATE", y = "Total Water in mm")

    p <- p +
      geom_errorbar(limits, position = "dodge", size = 0.2) +
      ggtitle("Total Water Storage-60cm") +
      scale_y_continuous(limits = c(0, 100))

    p <- p + theme_bw() +
      theme(axis.text.x = element_text(angle = 270, vjust = 1, size = 15),
            axis.text.y = element_text(vjust = 1, hjust = 1, size = 20),
            panel.grid.major.x = element_blank(),
            panel.grid.minor.x = element_line(linetype = "longdash"),
            panel.grid.major.y = element_line(linetype = "longdash"))
    print(p)
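One direction that might be worth trying, sketched here under assumptions and not checked against the full data: draw the bars only from the rows that have a TYPE, and give the bars and the dodge an explicit width in days instead of relying on the default. The names `final_demo`, `bars` and `bar_width` are hypothetical, the values are rounded from the sample data, and the 4-day width is an arbitrary choice.

```r
library(ggplot2)

# Small stand-in for `final` (hypothetical; values rounded from the sample data).
final_demo <- data.frame(
  Date = as.Date(c("2016-09-11", "2016-09-19", "2016-09-19", "2016-09-19",
                   "2016-09-24", "2016-09-28", "2016-09-28", "2016-09-28")),
  TYPE = factor(c(NA, "Open", "Shrub", "Lowland", NA, "Open", "Shrub", "Lowland"),
                levels = c("Open", "Shrub", "Lowland")),
  MeanTWC = c(NA, 82.4, 75.3, 79.8, NA, 107.0, 102.3, 100.6),
  Max = c(NA, 94.1, 81.1, 91.5, NA, 128.8, 114.6, 108.7),
  Min = c(NA, 72.0, 66.3, 71.4, NA, 87.8, 93.0, 93.1),
  Rain = c(0.2, NA, NA, NA, 1.2, NA, NA, NA)
)

# Keep only the rows that carry a TYPE for the bar and error-bar layers.
bars <- subset(final_demo, !is.na(TYPE))

# Width in days; an assumed value, to be tuned to the sampling interval.
bar_width <- 4

p_demo <- ggplot(bars, aes(Date, MeanTWC, fill = TYPE)) +
  geom_col(position = position_dodge(width = bar_width), width = bar_width) +
  # Same dodge width as the bars so each error bar sits on its own bar.
  geom_errorbar(aes(ymin = Min, ymax = Max),
                position = position_dodge(width = bar_width), width = 1) +
  scale_fill_manual(values = c("brown", "forestgreen", "cyan4"))
```

The Rain rows could then be drawn in a separate layer or panel that is given its own `data` argument, so the two variables would not need to share the factor.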

Sample data:

          Date    TYPE   MeanTWC       Max       Min   Rain
1   2016-08-13    <NA>        NA        NA        NA 27.686
2   2016-08-14    <NA>        NA        NA        NA 79.248
3   2016-08-15    <NA>        NA        NA        NA  9.398
4   2016-08-16    <NA>        NA        NA        NA  9.906
5   2016-08-17    <NA>        NA        NA        NA 26.670
6   2016-08-21    <NA>        NA        NA        NA 52.324
7   2016-08-27    <NA>        NA        NA        NA 13.200
8   2016-08-28    <NA>        NA        NA        NA  0.200
9   2016-08-29    <NA>        NA        NA        NA  3.000
10  2016-08-30    <NA>        NA        NA        NA  0.400
11  2016-09-02    <NA>        NA        NA        NA  5.400
12  2016-09-04    <NA>        NA        NA        NA 22.200
13  2016-09-05    <NA>        NA        NA        NA  0.400
14  2016-09-06    <NA>        NA        NA        NA  0.400
15  2016-09-11    <NA>        NA        NA        NA  0.200
16  2016-09-19    Open  82.40583  94.13074  71.95022     NA
17  2016-09-19   Shrub  75.25720  81.09062  66.31633     NA
18  2016-09-19 Lowland  79.78265  91.46637  71.42791     NA
19  2016-09-24    <NA>        NA        NA        NA  1.200
20  2016-09-28    Open 107.00762 128.82301  87.78908     NA
21  2016-09-28   Shrub 102.29717 114.59530  93.02085     NA
22  2016-09-28 Lowland 100.62097 108.65464  93.06479     NA
23  2016-10-04    Open  94.35146 119.11809  80.80844     NA
24  2016-10-04   Shrub  89.78960 106.59891  77.91514     NA
25  2016-10-04 Lowland  87.66499  98.93036  77.44905     NA
26  2016-10-07    <NA>        NA        NA        NA 15.200
27  2016-10-24    Open  77.75282  90.99799  60.89542     NA
28  2016-10-24   Shrub  73.13549  84.68082  64.38086     NA
29  2016-10-24 Lowland  77.54505  89.20983  68.77503     NA
30  2016-11-04    Open  75.79262  84.63392  61.17391     NA
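A note on the bar width: ggplot2 documents the default bar width as 90% of the resolution of the data, that is, of the smallest gap between distinct x values. The rain-only dates may therefore be what shrinks the bars. Below is a quick base-R check on the 30 sample rows above; the helper `min_gap` is a hypothetical name, and the full data set may give different numbers.

```r
# Date column of the 30 sample rows, as days since 1970-01-01.
all_days <- c(17026, 17027, 17028, 17029, 17030, 17034, 17040, 17041, 17042,
              17043, 17046, 17048, 17049, 17050, 17055, 17063, 17063, 17063,
              17068, 17072, 17072, 17072, 17078, 17078, 17078, 17081, 17098,
              17098, 17098, 17109)

# Row numbers whose TYPE is not NA in the sample.
has_type <- c(16:18, 20:22, 23:25, 27:29, 30)

# Smallest gap, in days, between distinct dates.
min_gap <- function(days) min(diff(sort(unique(days))))

gap_merged <- min_gap(all_days)            # rain-only rows included: 1 day
gap_bars   <- min_gap(all_days[has_type])  # response rows only: 6 days
```

In this sample the merged data would give a default width of about 0.9 days, against about 5.4 days for the response rows alone. That is consistent with thin bars in the merged plot, though it is not confirmed as the only cause.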

 structure(list(Date = structure(c(17026, 17027, 17028, 17029, 
    17030, 17034, 17040, 17041, 17042, 17043, 17046, 17048, 17049, 
    17050, 17055, 17063, 17063, 17063, 17068, 17072, 17072, 17072, 
    17078, 17078, 17078, 17081, 17098, 17098, 17098, 17109), class = "Date"), 
        TYPE = structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, 1L, 2L, 3L, NA, 1L, 2L, 3L, 1L, 2L, 3L, 
        NA, 1L, 2L, 3L, 1L), .Label = c("Open", "Shrub", "Lowland"
        ), class = "factor"), MeanTWC = c(NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, 82.4058263935714, 75.2571964744444, 
        79.782649985, NA, 107.0076241875, 102.297170442857, 100.620970785, 
        94.3514631776471, 89.7895999577778, 87.664985085, NA, 77.75281636125, 
        73.135492118, 77.54505326, 75.792624628125), Max = c(NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 94.13073642, 
        81.09062269, 91.46637475, NA, 128.8230145, 114.5952995, 108.6546353, 
        119.1180866, 106.5989092, 98.93036216, NA, 90.99798892, 84.68081807, 
        89.20983383, 84.63391564), Min = c(NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, 71.95021894, 66.31632641, 
        71.42791015, NA, 87.78907749, 93.02084587, 93.06478569, 80.8084363, 
        77.91514274, 77.44904985, NA, 60.89542395, 64.38086067, 68.77503196, 
        61.17390712), SD = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, 6.52668534466645, 5.31370742998586, 
        8.40565594980702, NA, 10.3287869191442, 8.45785409063748, 
        6.49446280465913, 9.73718805734734, 10.5575933779477, 9.35169762923353, 
        NA, 8.27219492616507, 6.75450870627616, 8.51146778459709, 
        6.75447037137946), N = c(NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, 14, 9, 4, NA, 12, 7, 4, 17, 9, 
        4, NA, 16, 10, 4, 16), SE = c(NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, 1.74433003078718, 1.77123580999529, 
        4.20282797490351, NA, 2.9816639540851, 3.19676836415673, 
        3.24723140232956, 2.36161499158505, 3.51919779264923, 4.67584881461676, 
        NA, 2.06804873154127, 2.13596319872699, 4.25573389229854, 
        1.68861759284486), Rain = c(27.686, 79.248, 9.398, 9.906, 
        26.67, 52.324, 13.2, 0.2, 3, 0.4, 5.4, 22.2, 0.4, 0.4, 0.2, 
        NA, NA, NA, 1.2, NA, NA, NA, NA, NA, NA, 15.2, NA, NA, NA, 
        NA)), .Names = c("Date", "TYPE", "MeanTWC", "Max", "Min", 
    "SD", "N", "SE", "Rain"), row.names = c(NA, 30L), class = "data.frame")

0 answers:

No answers yet