Plotting POSIXct time series data as boxplots with ggplot2

Time: 2015-10-19 17:36:24

Tags: r ggplot2 boxplot

I have three months of time series data:

structure(list(timestamp = structure(c(1438367400, 1438453800, 
1438540200, 1438626600, 1438713000, 1438799400, 1438885800, 1438972200, 
1439058600, 1439145000, 1439231400, 1439317800, 1439404200, 1439490600, 
1439577000, 1439663400, 1439749800, 1439836200, 1439922600, 1440009000, 
1440095400, 1440181800, 1440268200, 1440354600, 1440441000, 1440527400, 
1440613800, 1440700200, 1440786600, 1440873000, 1440959400, 1441045800, 
1441132200, 1441218600, 1441305000, 1441391400, 1441477800, 1441564200, 
1441650600, 1441737000, 1441823400, 1441909800, 1441996200, 1442082600, 
1442169000, 1442255400, 1442341800, 1442428200, 1442514600, 1442601000, 
1442687400, 1442773800, 1442860200, 1442946600, 1443033000, 1443119400, 
1443205800, 1443292200, 1443378600, 1443465000, 1443551400, 1443637800, 
1443724200, 1443810600, 1443897000, 1443983400, 1444069800, 1444156200, 
1444242600, 1444329000, 1444415400, 1444501800, 1444588200, 1444674600, 
1444761000, 1444847400, 1444933800, 1445020200, 1445106600, 1445193000, 
1445279400, 1445365800, 1445452200, 1445538600, 1445625000, 1445711400, 
1445797800, 1445884200, 1445970600, 1446057000, 1446143400, 1446229800
), class = c("POSIXct", "POSIXt"), tzone = "Asia/Kolkata"), power = c(0.0247230720457733, 
0.0108825766825672, 0.0288152903005685, -0.0234123772463031, 
-0.00919754217224866, -0.0374403275019726, -0.0078777961550861, 
0.0079837844138577, 0.00575982953501377, 0.0202857581215497, 
0.0258757511850728, -0.0337682832455853, 0.0341912141620086, 
-0.0042798060711745, 0.00675858841374106, -0.00246125938541154, 
0.00912344869950909, 0.0206439842553702, -0.01940932272705, -0.0100830115181366, 
-0.00734857221326485, -0.0170222915573808, 0.0285240366773044, 
0.00764357383266427, -0.0238453527011212, -0.00672944465931788, 
0.00818204423502334, 0.00460125531472914, 0.0230428947908782, 
0.00154030144193071, 0.0189660255911821, 0.0155756949832998, 
0.0267515034089843, 0.0214963580786819, 0.0029481279523321, 0.00986682946599846, 
0.00832138782007834, -0.00676599971272203, -0.00324283490587677, 
0.0204023688477303, 0.0229717200642572, 0.0251381446922004, 0.00584010113018711, 
-0.00465215879879816, -0.0308844467014504, 0.000780093550060347, 
0.00369764046959574, -0.0160883189684658, 0.0218083597791737, 
-0.021605117962637, 0.000445192082904827, -0.00372433762871899, 
0.0106455591452724, 0.024611532476291, 0.0132632680432167, 0.0149559186037772, 
-0.0453599092776512, -0.0202099060937128, 0.0169712680315599, 
0.0148844950106621, -0.0391221225281138, 0.00461340547288957, 
0.0118982098114901, -0.00571305945781934, 0.0143190640584365, 
-0.0117202800880833, 0.0394635775820876, -0.00393737330560111, 
0.00633578511802405, 0.0402779799675971, -0.00146576620678839, 
-0.00974562394885507, 0.0179401329290733, -0.0157766103469759, 
0.00534190906349559, 0.0021055773020779, -0.018236876969857, 
-0.00392926841238278, -0.0065097462318096, -0.0249870099671041, 
-0.0139735459289521, 0.000625022444854668, -0.00278827413472659, 
0.0179048032598685, -0.0268735489312987, -0.00760559474288195, 
0.0179536496832603, -0.0126341858034209, -0.0338553507687797, 
-0.0045297254037134, -0.0106755306681489, -0.0154881466662607
)), .Names = c("timestamp", "power"), row.names = c(NA, -92L), class = "data.frame")

The structure of the data frame is:

> str(h2)
'data.frame':   92 obs. of  2 variables:
 $ timestamp: POSIXct, format: "2015-08-01" "2015-08-02" "2015-08-03" "2015-08-04" ...
 $ power    : num  0.0247 0.0109 0.0288 -0.0234 -0.0092 ...

Now I want to plot this data as a boxplot using ggplot2. I expect three boxes with whiskers, one for each month. So far I have been using the following code:

library(ggplot2); library(scales)  # scales provides date_format()
ggplot(dats) +
  geom_boxplot(aes(x = timestamp, y = power, group = timestamp), width = 0.5) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_x_datetime(labels = date_format("%b "), breaks = "1 month")

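I think I am not grouping the data by month correctly. If that is the case, how should I group the data by month so that I can draw the boxplot?

Note: I have already posted an answer to this question, but in that answer I convert the timestamp from POSIXct to a factor. I would prefer to plot with the given POSIXct timestamps directly, because the conversion takes some extra time.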
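The grouping problem can be seen without drawing anything. The sketch below uses only base R and rebuilds the timestamps as a hypothetical stand-in (`ts` and `month_key` are illustrative names, not objects from the question). It shows that `group = timestamp` produces one group per day, while a year-month key produces the three groups the plot is supposed to have.

```r
# Hypothetical stand-in for the question's timestamps: 92 daily values
# starting at 2015-08-01 00:00 in Asia/Kolkata.
ts <- seq(as.POSIXct("2015-08-01", tz = "Asia/Kolkata"),
          by = "day", length.out = 92)

# group = timestamp: every distinct timestamp becomes its own group,
# so each "box" would contain a single observation.
n_daily_groups <- length(unique(ts))

# A year-month key collapses the timestamps into one group per month.
month_key <- format(ts, "%Y-%m")
n_month_groups <- length(unique(month_key))

print(n_daily_groups)
print(n_month_groups)
print(table(month_key))  # number of observations per month
```

Note also that `width = 0.5` in `geom_boxplot()` is interpreted in the units of the x scale, which for POSIXct values are seconds, so the boxes would be extremely narrow on a three-month axis.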

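1 Answer:

Answer 0 (score: 0)

Note: the dataset used in this answer is entirely different from the one given in the question.

If the POSIXct timestamps are converted to a factor, the following script can be used: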

    # Convert the timestamps to abbreviated month names; passing levels = month.abb
    # keeps the boxes in chronological rather than alphabetical order.
    # Note: "%b" is locale-dependent, so this assumes an English locale (month.abb is English).
    dats$timestamp <- factor(strftime(dats$timestamp, "%b"), levels = month.abb)
    ggplot(dats) +
      geom_boxplot(aes(x = timestamp, y = power)) +
      theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
      xlab("Time Period") + ylab("Power")

The resulting plot is: [image not included]

Update

Without converting the timestamps, i.e. working directly on the POSIXct values, I tried:

ggplot(dats) +geom_boxplot(aes(x=timestamp,y=power, group=months(timestamp)))+
    theme(axis.text.x = element_text(angle=90,hjust=1)) + xlab("Time Period") +ylab("Power") +
    scale_x_datetime(breaks="1 month", labels = date_format("%b"))

I would like to get the same kind of plot as above while keeping the timestamps in POSIXct format.
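One possible way to get one box per month on a datetime axis is to map both `x` and `group` to the first instant of each month, which is still a POSIXct value. The following is a minimal sketch, not a definitive solution. It assumes ggplot2 2.0.0 or later (for the `date_breaks` and `date_labels` arguments of `scale_x_datetime()`), uses randomly generated stand-in data rather than the question's values, and introduces a hypothetical helper column `month_start`.

```r
library(ggplot2)

# Stand-in data (assumption): 92 daily timestamps with random "power" values.
set.seed(1)
dats <- data.frame(
  timestamp = seq(as.POSIXct("2015-08-01", tz = "Asia/Kolkata"),
                  by = "day", length.out = 92),
  power = rnorm(92, sd = 0.02)
)

# Hypothetical helper column: the first day of each timestamp's month,
# kept as POSIXct in the same time zone as the data.
dats$month_start <- as.POSIXct(format(dats$timestamp, "%Y-%m-01"),
                               tz = "Asia/Kolkata")

# x and group both use month_start, so there is one box per month and the
# default box width is derived from the spacing between months, not days.
p <- ggplot(dats, aes(x = month_start, y = power, group = month_start)) +
  geom_boxplot() +
  scale_x_datetime(date_breaks = "1 month", date_labels = "%b") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Time Period") + ylab("Power")

print(p)
```

With this mapping each box is centred on the first day of its month. Computing `month_start` still involves a `format()` call, so it does not avoid conversion entirely; it only keeps the plotted variable in POSIXct form.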