Subtracting the year from a date in R

Time: 2013-08-16 07:51:02

Tags: r date

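I want to plot more than three years of data.

However, I want to plot each year side by side.

To do this, I want to turn a date such as 03/17/2010 into 03/17, so that it lines up with 03/17/2011.

How can I do this in R?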

Here is an image of what I would like it to look like: [image]

4 answers:

Answer 0: (score: 4)

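R has its own Date representation, and you should use it. Once your data is converted to Date, you can easily change how it is displayed with the format function.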

http://www.statmethods.net/input/dates.html

For example:

> d <- as.Date( "2010-03-17" )
> d
[1] "2010-03-17"
> format( d, format="%m/%d")
[1] "03/17"

Or, using the date style from your data:

> format( as.Date("03/17/2010", "%m/%d/%Y"), format="%m/%d")
[1] "03/17"
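
Note that format() returns character strings, not Date objects. If a real Date axis is needed, one option is to move every date onto a single reference year while keeping the original year as a label. The following is only a sketch: the sample dates, the variable names, and the reference year 2000 (chosen because it is a leap year, so 02/29 stays valid) are all assumptions, not part of the original answer.

```r
# Hypothetical sample data: the same calendar days in two different years
dates <- as.Date(c("03/17/2010", "06/01/2010", "03/17/2011", "06/01/2011"),
                 format = "%m/%d/%Y")

# Keep the real year as a grouping label
year <- format(dates, "%Y")

# Re-anchor every date to the arbitrary reference year 2000
aligned <- as.Date(paste0("2000-", format(dates, "%m-%d")))

# One row per observation: original date, its year, and the aligned date
result <- data.frame(date = dates, year = year, aligned = aligned)
print(result)
```

Because aligned is still of class Date, it can go straight onto the x axis of plot() or lines(), with year used to pick the color of each series.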

Answer 1: (score: 1)

You can use R's built-in date class: convert with as.Date(), then use format() to keep only the month and day:

> dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
> format(as.Date(dates, "%m/%d/%y"), "%m/%d")
[1] "02/27" "02/27" "01/14" "02/28" "02/01"

For your example, just use your own dates.

I found this through R's help pages, which the example above is based on:

> ?as.Date
> ?format 

Answer 2: (score: 0)

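Here is my solution:

It formats the dates as strings (without the year) and then converts them back to dates, which defaults every date to the same year (the current one).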
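
A minimal sketch of that round trip, using made-up month/day strings. When the format has no year field, as.Date() fills in the current year, so a value such as 02/29 would presumably come back as NA whenever the current year is not a leap year.

```r
# Hypothetical month/day strings with the year already stripped
md <- c("03/17", "12/31")

# Parsing without a year field: R substitutes the current year
as_dates <- as.Date(md, format = "%m/%d")

# Compare the filled-in year with today's year
filled_year <- format(as_dates, "%Y")
print(filled_year == format(Sys.Date(), "%Y"))
```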

The code and a sample input file are below:

Code

# Clear all
rm(list = ls())

# gdata is only needed to read .xls files; read.csv() below is base R,
# so this line is optional
library(gdata)

# Get the data in
data = read.csv('Readings.csv')

# Extract each Column
readings = data[,"Reading"]
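# The input dates look like 1/1/10 (month/day/2-digit year), so the format
# must be given explicitly; otherwise as.Date() misreads them
dates = as.Date(data[,"Reading.Date"], format = "%m/%d/%y")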

# Order the data correctly
readings = readings[order(dates)]
dates = dates[order(dates)]

# Calculate the difference between each date (in days) and readings
diff.readings = diff(readings)
diff.dates = as.numeric(diff(dates)) # Convert from days to an integer

# Calculate the usage per reading period
usage.per.period = diff.readings/diff.dates

# Get Every single day between the very first reading and the very last
# seq will create a sequence: first argument is min, second is max, and 3rd is the step size (which in this case is 1 day)
days = seq(min(dates),max(dates), 1)
# This creates an empty vector to get data from the for loop below
usage.per.day = numeric()

# The length of the diff.dates is the number of periods that exist.
for (period in 1:(length(diff.dates))){
    # to convert usage.per.period to usage.per.day, we need to replicate the 
    # value for the number of days in that period. the function rep will 
    # replicate a number: first argument is the number to replicate, and the 
    # second number is the number of times to replicate it. the function c will 
    # concatenate the current vector and the new period, sort of 
    # like value = value + 6, but with vectors. 
    usage.per.day = c(usage.per.day, rep(usage.per.period[period], diff.dates[period]))
}
# The for loop above misses out on the last day, so I add that single value manually
usage.per.day[length(usage.per.day)+1] = usage.per.period[period]

# Get the number of readings for each year
years = names(table(format(dates, "%Y")))

# Now break down the usages and the days by year
# list() creates an empty list
usage.per.day.grouped.by.year = list()
year.day = list()
# This defines some colors for plotting, rainbow(n) will give you n colors
colors = rainbow(length(years))
for (year.index in 1:length(years)){
    # This is a vector of trues and falses, to say whether a day is in a particular
    # year or not
    this.year = (days >= as.Date(paste(years[year.index],'/01/01',sep="")) &
                 days <= as.Date(paste(years[year.index],'/12/31',sep="")))
    usage.per.day.grouped.by.year[[year.index]] = usage.per.day[this.year]
    # We only care about the month and day, so drop the year
    year.day[[year.index]] = as.Date(format(days[this.year], format="%m/%d"),"%m/%d")
    # In the first year, we need to set up the whole plot
    if (year.index == 1){
        # create a png file with file name image.png
        png('image.png')
        plot(year.day[[year.index]], # x coords
             usage.per.day.grouped.by.year[[year.index]], # y coords
             "l", # as a line
             col=colors[year.index], # with this color
             ylim = c(min(usage.per.day),max(usage.per.day)), # this y max and y min
             ylab='Usage', # with this label for y axis
             xlab='Date', # with this label for x axis
             main='Usage Over Time') # and this title
    }
    else {
        # After the plot is set up, we just need to add each year
        lines(year.day[[year.index]], # x coords
            usage.per.day.grouped.by.year[[year.index]], # y coords
            col=colors[year.index]) # color
    }
}
# add a legend to the whole thing
legend("topright" , # where to put the legend
    legend = years, # what the legend names are
    lty=c(1,1), # what the symbol should look like
    lwd=c(2.5,2.5), # what the symbol should look like
    col=colors) # the colors to use for the symbols
dev.off() # save the png to file
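
As a side note, the for loop that builds usage.per.day can probably be replaced by a single rep() call, since rep() accepts a vector for its times argument. The sketch below uses small made-up values in place of the real usage.per.period and diff.dates; it is an alternative under that assumption, not the answer author's code.

```r
# Hypothetical values standing in for usage.per.period and diff.dates
usage.per.period <- c(0.5, 0.25, 1)
diff.dates       <- c(2, 4, 3)

# With a vector 'times', rep() repeats each element the matching number of times
usage.per.day <- rep(usage.per.period, times = diff.dates)

# Append the value for the final day, as the original code does after its loop
usage.per.day <- c(usage.per.day, usage.per.period[length(usage.per.period)])
print(usage.per.day)
```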

Input file

Reading Date,Reading
1/1/10,10
2/1/10,20
3/6/10,30
4/1/10,40
5/7/10,50
6/1/10,60
7/1/10,70
8/1/10,75
9/22/10,80
10/1/10,85
11/1/10,90
12/1/10,95
1/1/11,100
2/1/11,112.9545455
3/1/11,120.1398601
4/1/11,127.3251748
5/1/11,134.5104895
6/1/11,141.6958042
7/1/11,148.8811189
8/1/11,156.0664336
9/17/11,190
10/1/11,223.9335664
11/1/11,257.8671329
12/1/11,291.8006993
1/1/12,325.7342657
2/1/12,359.6678322
3/5/12,375
4/1/12,380
5/1/12,385
6/1/12,390
7/1/12,400
8/1/12,410
9/1/12,420

Answer 3: (score: 0)

seasonplot() (from the forecast package) does this very well!
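
A rough sketch of how that might look, assuming the data can be turned into a regularly spaced monthly series (seasonplot() works on ts objects, so irregular reading dates would need to be aggregated first). The series below is randomly generated purely for illustration.

```r
# Hypothetical monthly series covering three years, starting January 2010
set.seed(1)
x <- ts(cumsum(rnorm(36)), start = c(2010, 1), frequency = 12)

# seasonplot() lives in the forecast package, which must be installed;
# it draws one line per year on a shared January-to-December axis
if (requireNamespace("forecast", quietly = TRUE)) {
  forecast::seasonplot(x, year.labels = TRUE, col = rainbow(3))
}
```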