I am new to this forum, so please excuse me if my question isn't phrased quite right. I have searched the forum for an answer to my problem but haven't found a proper solution yet.
I am trying to compare two time series using linear regression and scatterplots. The time series should have a measurement every 2 minutes, but in real life the datalogger sometimes writes no value at all and sometimes writes one only after 3 minutes. So I am trying to find all pairs (x, y) that share the same timestamp and eliminate the rest.
Time x
1 2016-08-15 09:58:00 2.7421
2 2016-08-15 10:02:00 2.7731
3 2016-08-15 10:04:00 2.7603
4 2016-08-15 10:06:00 2.7426
5 2016-08-15 10:08:00 2.7481
6 2016-08-15 10:10:00 2.7294
7 2016-08-15 10:12:00 2.7428
8 2016-08-15 10:15:00 2.7371
9 2016-08-15 10:16:00 2.7677
10 2016-08-15 10:18:00 2.7449
Time y
1 2016-08-15 10:00:00 1.3656
2 2016-08-15 10:02:00 1.3680
3 2016-08-15 10:04:00 1.3785
4 2016-08-15 10:06:00 1.3819
5 2016-08-15 10:08:00 1.3720
6 2016-08-15 10:10:00 1.3702
7 2016-08-15 10:12:00 1.3550
8 2016-08-15 10:14:00 1.3696
9 2016-08-15 10:16:00 1.3603
10 2016-08-15 10:18:00 1.3813
In this example, rows 1 and 8 of each series should be eliminated, because their timestamps have no match in the other series.
Answer 0 (score: 0)
> library(lubridate)
Attaching package: ‘lubridate’
The following object is masked from ‘package:base’:
date
> df1$Time=ymd_hms(as.character(df1$Time))
> df1
Time x
1 2016-08-15 09:58:00 2.7421
2 2016-08-15 10:02:00 2.7731
3 2016-08-15 10:04:00 2.7603
4 2016-08-15 10:06:00 2.7426
5 2016-08-15 10:08:00 2.7481
6 2016-08-15 10:10:00 2.7294
7 2016-08-15 10:12:00 2.7428
8 2016-08-15 10:15:00 2.7371
9 2016-08-15 10:16:00 2.7677
10 2016-08-15 10:18:00 2.7449
> df2$Time=ymd_hms(as.character(df2$Time))
> df2
Time y
1 2016-08-15 10:00:00 1.3656
2 2016-08-15 10:02:00 1.3680
3 2016-08-15 10:04:00 1.3785
4 2016-08-15 10:06:00 1.3819
5 2016-08-15 10:08:00 1.3720
6 2016-08-15 10:10:00 1.3702
7 2016-08-15 10:12:00 1.3550
8 2016-08-15 10:14:00 1.3696
9 2016-08-15 10:16:00 1.3603
10 2016-08-15 10:18:00 1.3813
> merge(df1,df2,by="Time")
Time x y
1 2016-08-15 10:02:00 2.7731 1.3680
2 2016-08-15 10:04:00 2.7603 1.3785
3 2016-08-15 10:06:00 2.7426 1.3819
4 2016-08-15 10:08:00 2.7481 1.3720
5 2016-08-15 10:10:00 2.7294 1.3702
6 2016-08-15 10:12:00 2.7428 1.3550
7 2016-08-15 10:16:00 2.7677 1.3603
8 2016-08-15 10:18:00 2.7449 1.3813
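The same inner join can also be sketched in base R without lubridate. This is only an illustration: the data frames are rebuilt from the sample in the question, the `tz = "UTC"` setting is an assumption (use whatever time zone the datalogger records in), and `both`, `dropped_x` and `dropped_y` are names made up for this example.

```r
# Rebuild the question's sample data with timestamps as character strings.
df1 <- data.frame(
  Time = c("2016-08-15 09:58:00", "2016-08-15 10:02:00", "2016-08-15 10:04:00",
           "2016-08-15 10:06:00", "2016-08-15 10:08:00", "2016-08-15 10:10:00",
           "2016-08-15 10:12:00", "2016-08-15 10:15:00", "2016-08-15 10:16:00",
           "2016-08-15 10:18:00"),
  x = c(2.7421, 2.7731, 2.7603, 2.7426, 2.7481,
        2.7294, 2.7428, 2.7371, 2.7677, 2.7449),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Time = c("2016-08-15 10:00:00", "2016-08-15 10:02:00", "2016-08-15 10:04:00",
           "2016-08-15 10:06:00", "2016-08-15 10:08:00", "2016-08-15 10:10:00",
           "2016-08-15 10:12:00", "2016-08-15 10:14:00", "2016-08-15 10:16:00",
           "2016-08-15 10:18:00"),
  y = c(1.3656, 1.3680, 1.3785, 1.3819, 1.3720,
        1.3702, 1.3550, 1.3696, 1.3603, 1.3813),
  stringsAsFactors = FALSE
)

# Parse the strings to POSIXct (time zone is an assumption).
fmt <- "%Y-%m-%d %H:%M:%S"
df1$Time <- as.POSIXct(df1$Time, format = fmt, tz = "UTC")
df2$Time <- as.POSIXct(df2$Time, format = fmt, tz = "UTC")

# merge() defaults to an inner join: only timestamps present in both remain.
both <- merge(df1, df2, by = "Time")

# Rows that found no partner in the other series, useful as a sanity check.
dropped_x <- df1[!(df1$Time %in% df2$Time), ]
dropped_y <- df2[!(df2$Time %in% df1$Time), ]
```

Inspecting `dropped_x` and `dropped_y` shows which measurements were discarded, which helps confirm that only the off-schedule readings were removed.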
Answer 1 (score: 0)
Using a scatter plot:
library(ggplot2)
# Inner join: keep only the timestamps present in both series
ndf = merge(df1, df2, by = "Time", all = FALSE)
p = ggplot(ndf, aes(x, y)) +
  geom_point(colour = "red", size = 2)
p
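Since the question also mentions linear regression, here is a minimal sketch of fitting a line to the matched pairs. It uses base R only; `ndf` is rebuilt here from the eight merged rows shown in the first answer so that the snippet runs on its own, and `fit` is a name chosen for this example.

```r
# The eight matched (x, y) pairs from the merged output above.
ndf <- data.frame(
  x = c(2.7731, 2.7603, 2.7426, 2.7481, 2.7294, 2.7428, 2.7677, 2.7449),
  y = c(1.3680, 1.3785, 1.3819, 1.3720, 1.3702, 1.3550, 1.3603, 1.3813)
)

# Ordinary least-squares fit of y on x.
fit <- lm(y ~ x, data = ndf)
print(summary(fit))  # coefficients, R-squared, residuals

# Scatter plot with the fitted regression line drawn on top.
plot(ndf$x, ndf$y, xlab = "x", ylab = "y", pch = 19, col = "red")
abline(fit)
```

With ggplot2, adding `geom_smooth(method = "lm")` to the plot above should give a comparable fitted line.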