Question

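I have a temperature dataset named "dlr_rms" that contains 1208783 entries, each with a date-time. I need to find the absolute error of this data. I also have another table, named "absolute values", that holds 21072 temperature values. I want to subtract those values from the first dataset, matching rows on their shared (grouped) date-time.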

For example (df1):

temp  date-time                     
2     2015-07-14 16:44:01      
3     2015-07-14 16:44:01  
4     2016-08-14 16:44:02
8     2016-08-14 16:44:02
5     2017-09-14 16:44:03    
6     2017-09-14 16:44:03

df2:

absolute table    date-time
2                 2015-07-14 16:44:01
5                 2016-08-14 16:44:02 
9                 2017-09-14 16:44:03

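For the values that share a date-time, e.g. (2,3), (4,8), (5,6), I want to subtract the single number assigned to that date-time in the absolute table. I also need to join the two tables in order to compute the error.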

Desired result table:

2 - 2 = 0
3 - 2 = 1

4 - 5 = -1
8 - 5 = 3

5 - 9 = -4
6 - 9 = -3

Output of the dput command, df1:

1515434400, 1515438000, 1515452400, 1515456000, 1515459600, 1515463200, 1515466800, 1515470400, 1515474000, 1515477600, 1515481200), class = c("POSIXct", "POSIXt"), tzone = "UTC")), class = "data.frame", row.names = c(NA, -21072L))

Answer 1

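We can do this with dplyr, using a left join followed by mutate (tidyr is not needed for this step):

library(dplyr)

df1 %>%
  # attach the matching absolute value to every row of df1
  left_join(df2, by = "date_time") %>%
  # error = measured temperature minus the absolute value
  mutate(absolute_error = temp - absolute)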

Result:

  temp           date_time absolute absolute_error
1    2 2015-07-14 16:44:01        2              0
2    3 2015-07-14 16:44:01        2              1
3    4 2016-08-14 16:44:02        5             -1
4    8 2016-08-14 16:44:02        5              3
5    5 2017-09-14 16:44:03        9             -4
6    6 2017-09-14 16:44:03        9             -3
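
Since the real df1 has far more rows than df2, some timestamps may have no counterpart in the absolute table. With left_join, those rows get NA in absolute and therefore NA in absolute_error. The following is only a sketch on hypothetical toy data (the names df1_toy, df2_toy and the unmatched 2018 timestamp are made up for illustration), assuming the same column names as above:

```r
library(dplyr)

# Hypothetical toy data: the last row of df1_toy has no match in df2_toy
df1_toy <- data.frame(
  temp = c(2, 3, 4, 7),
  date_time = c("2015-07-14 16:44:01", "2015-07-14 16:44:01",
                "2016-08-14 16:44:02", "2018-01-01 00:00:00"),
  stringsAsFactors = FALSE
)
df2_toy <- data.frame(
  absolute = c(2, 5),
  date_time = c("2015-07-14 16:44:01", "2016-08-14 16:44:02"),
  stringsAsFactors = FALSE
)

# Same join as in the answer; unmatched rows end up with NA
joined <- df1_toy %>%
  left_join(df2_toy, by = "date_time") %>%
  mutate(absolute_error = temp - absolute)

# Rows of df1_toy whose timestamp does not occur in df2_toy
unmatched <- anti_join(df1_toy, df2_toy, by = "date_time")

# Number of rows for which no error could be computed
n_missing <- sum(is.na(joined$absolute_error))
```

If dropping the unmatched rows is acceptable, inner_join in place of left_join would keep only the timestamps present in both tables.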

Data:

df1 = structure(list(temp = c(2L, 3L, 4L, 8L, 5L, 6L), date_time = structure(c(1L, 
1L, 2L, 2L, 3L, 3L), .Label = c("2015-07-14 16:44:01", "2016-08-14 16:44:02", 
"2017-09-14 16:44:03"), class = "factor")), .Names = c("temp", 
"date_time"), class = "data.frame", row.names = c(NA, -6L))

df2 = structure(list(absolute = c(2L, 5L, 9L), date_time = structure(1:3, .Label = c("2015-07-14 16:44:01", 
"2016-08-14 16:44:02", "2017-09-14 16:44:03"), class = "factor")), .Names = c("absolute", 
"date_time"), class = "data.frame", row.names = c(NA, -3L))
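
The sample data above stores date_time as a factor, while the dput output in the question suggests the real column is POSIXct in UTC. As a rough sketch, assuming both tables should be keyed on POSIXct timestamps, the same values can be rebuilt with character timestamps, parsed, and joined. It uses base R merge as an alternative to the dplyr join; the name res is chosen here for illustration only:

```r
# Same sample values as above, with timestamps as plain character strings
stamps <- c("2015-07-14 16:44:01", "2016-08-14 16:44:02",
            "2017-09-14 16:44:03")
df1 <- data.frame(temp = c(2L, 3L, 4L, 8L, 5L, 6L),
                  date_time = rep(stamps, each = 2),
                  stringsAsFactors = FALSE)
df2 <- data.frame(absolute = c(2L, 5L, 9L),
                  date_time = stamps,
                  stringsAsFactors = FALSE)

# Parse both key columns to POSIXct in the same time zone
df1$date_time <- as.POSIXct(df1$date_time, tz = "UTC")
df2$date_time <- as.POSIXct(df2$date_time, tz = "UTC")

# all.x = TRUE keeps every row of df1 (a left join);
# note that merge sorts the result by the key column
res <- merge(df1, df2, by = "date_time", all.x = TRUE)
res$absolute_error <- res$temp - res$absolute
```

If the real columns are factors, converting with as.character() before as.POSIXct() avoids parsing the underlying integer codes by mistake.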
