I have a dataset (in CSV format) that looks like this:
Date Auto_Index Realty_Index
29-Dec-02 1742.2 1000
2-Jan-03 1748.85 1009.67
3-Jan-03 1758.66 1041.45
4-Jan-03 1802.9 1062.11
5-Jan-03 1797.45 1047.56
...
...
...
26-Nov-12 1665.5 248.75
27-Nov-12 1676.3 257.6
29-Nov-12 1696.7 266.9
30-Nov-12 1682.8 266.55
3-Dec-12 1702.6 270.4
I want to analyze this data over different periods in R. Is there a way to split the data into periods such as 2002-2005, 2006-2009, and 2009-2012?
Answer 0: (score: 2)
As @user1317221_G suggested, you should use the function cut.POSIXt. Here is how:
d
Date Auto_Index Realty_Index
1 29-Dec-02 1742.20 1000.00
2 2-Jan-03 1748.85 1009.67
3 3-Jan-03 1758.66 1041.45
4 4-Jan-03 1802.90 1062.11
5 5-Jan-03 1797.45 1047.56
6 26-Nov-12 1665.50 248.75
7 27-Nov-12 1676.30 257.60
8 29-Nov-12 1696.70 266.90
9 30-Nov-12 1682.80 266.55
10 3-Dec-12 1702.60 270.40
# First step: convert the date column to POSIXct (strptime alone returns POSIXlt).
# Note that %b parses month abbreviations in the current locale.
d$Date <- as.POSIXct(strptime(d$Date, format = "%d-%b-%y"))
# Then define your break points for your periods:
breaks <- as.POSIXct(c("2002-01-01","2006-01-01","2010-01-01","2013-01-01"))
# Then cut
d$Period <- cut(d$Date, breaks=breaks,
labels=c("2002-2005","2006-2009","2010-2012"))
d
Date Auto_Index Realty_Index Period
1 2002-12-29 1742.20 1000.00 2002-2005
2 2003-01-02 1748.85 1009.67 2002-2005
3 2003-01-03 1758.66 1041.45 2002-2005
4 2003-01-04 1802.90 1062.11 2002-2005
5 2003-01-05 1797.45 1047.56 2002-2005
6 2012-11-26 1665.50 248.75 2010-2012
7 2012-11-27 1676.30 257.60 2010-2012
8 2012-11-29 1696.70 266.90 2010-2012
9 2012-11-30 1682.80 266.55 2010-2012
10 2012-12-03 1702.60 270.40 2010-2012
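Once the Period column exists, the data still has to be broken apart for per-period analysis. The following is a minimal sketch of one way to do that, not part of the original answer: it uses the Date class and cut.Date rather than POSIXct (an assumption that the data has no time-of-day component), copies four rows from the question, and introduces the names by_period and auto_means purely for illustration.

```r
# Four sample rows copied from the question; a real run would load the CSV file instead.
d <- data.frame(
  Date = c("29-Dec-02", "2-Jan-03", "26-Nov-12", "3-Dec-12"),
  Auto_Index = c(1742.20, 1748.85, 1665.50, 1702.60),
  Realty_Index = c(1000.00, 1009.67, 248.75, 270.40),
  stringsAsFactors = FALSE
)

# %b parses abbreviated month names in the current locale (English assumed here).
d$Date <- as.Date(d$Date, format = "%d-%b-%y")

# Each interval is closed on the left: [2002-01-01, 2006-01-01), and so on.
breaks <- as.Date(c("2002-01-01", "2006-01-01", "2010-01-01", "2013-01-01"))
d$Period <- cut(d$Date, breaks = breaks,
                labels = c("2002-2005", "2006-2009", "2010-2012"))

# split() returns a named list with one data frame per period
# (by_period is a hypothetical name).
by_period <- split(d, d$Period)

# Per-period summary, here the mean of Auto_Index; periods with no rows give NA.
auto_means <- tapply(d$Auto_Index, d$Period, mean)
```

Each element such as by_period[["2002-2005"]] is an ordinary data frame, so it can be passed to any analysis function on its own.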
Answer 1: (score: 2)
If you want to work with the periods as numbers (rather than text), this may help:
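# Numeric break years; intervals 1, 2, 3 correspond to 2002-2005, 2006-2009, 2010-2012
br <- c(2002, 2006, 2010, 2013)
yr <- as.numeric(format(as.Date(df$Date, format = "%d-%b-%y"), "%Y"))
df$Int <- findInterval(yr, br)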
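To see how findInterval treats years that sit on a boundary, here is a small sketch on plain year values. The break years below are an assumption chosen so that the intervals line up with the periods 2002-2005, 2006-2009 and 2010-2012; the names years and period_id are hypothetical.

```r
# Assumed break years: each interval is closed on the left, [2002, 2006), [2006, 2010), [2010, 2013).
br <- c(2002, 2006, 2010, 2013)

# Years at the start and end of each intended period.
years <- c(2002, 2005, 2006, 2009, 2010, 2012)

# findInterval() returns, for each year, the number of breaks that are <= that year.
period_id <- findInterval(years, br)
period_id
# 1 1 2 2 3 3
```

Because the result is an integer vector, it can be used directly for indexing or as a grouping variable, for example in split() or tapply().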