Scatterplot with time series of different lengths in R

Time: 2017-06-15 10:25:07

Tags: r datetime

I am new to this forum, so excuse me if I don't get my question right on the first try. I have searched the forum for an answer to my problem but haven't found a proper solution yet.
I am trying to compare two time series using linear regression and scatterplots. Each series should have a measurement every 2 minutes, but in real life the data logger sometimes writes no value at all and sometimes writes one only after 3 minutes. So I am trying to keep all pairs (x, y) that share the same timestamp and drop the rest.

             Time           x
1    2016-08-15 09:58:00  2.7421  
2    2016-08-15 10:02:00  2.7731  
3    2016-08-15 10:04:00  2.7603  
4    2016-08-15 10:06:00  2.7426  
5    2016-08-15 10:08:00  2.7481  
6    2016-08-15 10:10:00  2.7294  
7    2016-08-15 10:12:00  2.7428  
8    2016-08-15 10:15:00  2.7371  
9    2016-08-15 10:16:00  2.7677  
10   2016-08-15 10:18:00  2.7449 



           Time            y
1    2016-08-15 10:00:00  1.3656  
2    2016-08-15 10:02:00  1.3680  
3    2016-08-15 10:04:00  1.3785  
4    2016-08-15 10:06:00  1.3819  
5    2016-08-15 10:08:00  1.3720  
6    2016-08-15 10:10:00  1.3702  
7    2016-08-15 10:12:00  1.3550  
8    2016-08-15 10:14:00  1.3696  
9    2016-08-15 10:16:00  1.3603  
10   2016-08-15 10:18:00  1.3813  

In this example, rows 1 and 8 of both series should be dropped, because their timestamps have no match in the other series.

2 answers:

Answer 0 (score: 0)

> library(lubridate)

Attaching package: ‘lubridate’

The following object is masked from ‘package:base’:

    date


> df1$Time <- ymd_hms(df1$Time)  # timestamps are year-month-day hour:minute:second
> df1
                  Time      x
1  2016-08-15 09:58:00 2.7421
2  2016-08-15 10:02:00 2.7731
3  2016-08-15 10:04:00 2.7603
4  2016-08-15 10:06:00 2.7426
5  2016-08-15 10:08:00 2.7481
6  2016-08-15 10:10:00 2.7294
7  2016-08-15 10:12:00 2.7428
8  2016-08-15 10:15:00 2.7371
9  2016-08-15 10:16:00 2.7677
10 2016-08-15 10:18:00 2.7449

> df2$Time <- ymd_hms(df2$Time)  # timestamps are year-month-day hour:minute:second
> df2
                  Time      y
1  2016-08-15 10:00:00 1.3656
2  2016-08-15 10:02:00 1.3680
3  2016-08-15 10:04:00 1.3785
4  2016-08-15 10:06:00 1.3819
5  2016-08-15 10:08:00 1.3720
6  2016-08-15 10:10:00 1.3702
7  2016-08-15 10:12:00 1.3550
8  2016-08-15 10:14:00 1.3696
9  2016-08-15 10:16:00 1.3603
10 2016-08-15 10:18:00 1.3813


> merge(df1, df2, by = "Time")
                 Time      x      y
1 2016-08-15 10:02:00 2.7731 1.3680
2 2016-08-15 10:04:00 2.7603 1.3785
3 2016-08-15 10:06:00 2.7426 1.3819
4 2016-08-15 10:08:00 2.7481 1.3720
5 2016-08-15 10:10:00 2.7294 1.3702
6 2016-08-15 10:12:00 2.7428 1.3550
7 2016-08-15 10:16:00 2.7677 1.3603
8 2016-08-15 10:18:00 2.7449 1.3813
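
Since the question also mentions linear regression, here is a rough, self-contained sketch of the whole path from raw timestamps to a fitted model. It uses base R's `as.POSIXct` instead of lubridate, purely to avoid the extra dependency. The names `merged` and `fit` are made up for illustration, and `tz = "UTC"` is an assumption because the question does not state a time zone.

```r
# Sample data copied from the question
df1 <- data.frame(
  Time = c("2016-08-15 09:58:00", "2016-08-15 10:02:00", "2016-08-15 10:04:00",
           "2016-08-15 10:06:00", "2016-08-15 10:08:00", "2016-08-15 10:10:00",
           "2016-08-15 10:12:00", "2016-08-15 10:15:00", "2016-08-15 10:16:00",
           "2016-08-15 10:18:00"),
  x = c(2.7421, 2.7731, 2.7603, 2.7426, 2.7481,
        2.7294, 2.7428, 2.7371, 2.7677, 2.7449)
)
df2 <- data.frame(
  Time = c("2016-08-15 10:00:00", "2016-08-15 10:02:00", "2016-08-15 10:04:00",
           "2016-08-15 10:06:00", "2016-08-15 10:08:00", "2016-08-15 10:10:00",
           "2016-08-15 10:12:00", "2016-08-15 10:14:00", "2016-08-15 10:16:00",
           "2016-08-15 10:18:00"),
  y = c(1.3656, 1.3680, 1.3785, 1.3819, 1.3720,
        1.3702, 1.3550, 1.3696, 1.3603, 1.3813)
)

# Parse the timestamps; the time zone is assumed, not given in the question
df1$Time <- as.POSIXct(df1$Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
df2$Time <- as.POSIXct(df2$Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Inner join: only timestamps present in both series survive
merged <- merge(df1, df2, by = "Time")

# Simple linear regression of y on x over the matched pairs
fit <- lm(y ~ x, data = merged)
print(coef(fit))
```

With the sample data, `merged` should contain the 8 matched rows. `summary(fit)` gives the usual regression table if more detail is needed.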

Answer 1 (score: 0)

Using a scatterplot:

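library(ggplot2)

# Keep only the rows whose timestamp appears in both data frames
ndf <- merge(df1, df2, by = "Time", all = FALSE)

# Scatterplot of the paired measurements
p <- ggplot(ndf, aes(x, y)) +
  geom_point(colour = "red", size = 2)

p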
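
If you also want the regression line drawn on top of the points, one possible extension is to add `geom_smooth(method = "lm")`. This is only a sketch: the data frames are rebuilt from the question's sample so the snippet runs on its own, `tz = "UTC"` is an assumption, and switching off the confidence band with `se = FALSE` is just a styling choice.

```r
library(ggplot2)

# Sample data copied from the question
df1 <- data.frame(
  Time = c("2016-08-15 09:58:00", "2016-08-15 10:02:00", "2016-08-15 10:04:00",
           "2016-08-15 10:06:00", "2016-08-15 10:08:00", "2016-08-15 10:10:00",
           "2016-08-15 10:12:00", "2016-08-15 10:15:00", "2016-08-15 10:16:00",
           "2016-08-15 10:18:00"),
  x = c(2.7421, 2.7731, 2.7603, 2.7426, 2.7481,
        2.7294, 2.7428, 2.7371, 2.7677, 2.7449)
)
df2 <- data.frame(
  Time = c("2016-08-15 10:00:00", "2016-08-15 10:02:00", "2016-08-15 10:04:00",
           "2016-08-15 10:06:00", "2016-08-15 10:08:00", "2016-08-15 10:10:00",
           "2016-08-15 10:12:00", "2016-08-15 10:14:00", "2016-08-15 10:16:00",
           "2016-08-15 10:18:00"),
  y = c(1.3656, 1.3680, 1.3785, 1.3819, 1.3720,
        1.3702, 1.3550, 1.3696, 1.3603, 1.3813)
)

# Parse the timestamps; the time zone is assumed, not given in the question
df1$Time <- as.POSIXct(df1$Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
df2$Time <- as.POSIXct(df2$Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Keep only the rows whose timestamp appears in both data frames
ndf <- merge(df1, df2, by = "Time", all = FALSE)

# Scatterplot plus a least-squares line fitted to the matched pairs
p <- ggplot(ndf, aes(x, y)) +
  geom_point(colour = "red", size = 2) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p)
```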