Calculating based on dates

Asked: 2018-07-20 12:36:33

Tags: r date

I have a dataset of actual observations, df1:

Date        Month     Year  Actual
02/12/2017  December    17  4623
12/12/2017  December    17  5111
22/12/2017  December    17  4800
22/12/2017  December    17  4769
02/01/2018  January     18  4711
03/01/2018  January     18  4503
04/01/2018  January     18  4650
05/01/2018  January     18  4598
06/02/2018  February    18  4612
07/02/2018  February    18  4493
08/02/2018  February    18  4515
09/02/2018  February    18  4469

And then my monthly predictions, df2:

Month     Year  Prediction
December  17    4874
January   18    4626
February  18    4576

How can I subtract the prediction from the actual value, matching on month and year? The result should be the following errors:

Error
-251
237
-74
-105
85
-123
24
-28
36
-83
-61
-107

2 answers:

Answer 0 (score: 1)

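Note that in the input used in this answer (see the Note at the end), the predictions for January and February are listed under Year 17, which does not match the 2018 actual values. With that input the result shown in the question cannot be obtained, and those rows come out as NA.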

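1) Base R. Left join the data frames and perform the subtraction (Diff is computed here as Prediction - Actual, the opposite sign of the Error column in the question):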

transform(merge(act, pred, all.x = TRUE, sort = FALSE), Diff = Prediction - Actual)

giving:

      Month Year       Date Actual Prediction Diff
1  December   17 02/12/2017   4623       4874  251
2  December   17 12/12/2017   5111       4874 -237
3  December   17 22/12/2017   4800       4874   74
4  December   17 22/12/2017   4769       4874  105
5   January   18 02/01/2018   4711         NA   NA
6   January   18 03/01/2018   4503         NA   NA
7   January   18 04/01/2018   4650         NA   NA
8   January   18 05/01/2018   4598         NA   NA
9  February   18 06/02/2018   4612         NA   NA
10 February   18 07/02/2018   4493         NA   NA
11 February   18 08/02/2018   4515         NA   NA
12 February   18 09/02/2018   4469         NA   NA

2) sqldf. The same left join and subtraction written in SQL:

library(sqldf)
sqldf("select *, Prediction - Actual as Diff  
       from act left join pred using(Year, Month)")

giving:

         Date    Month Year Actual Prediction Diff
1  02/12/2017 December   17   4623       4874  251
2  12/12/2017 December   17   5111       4874 -237
3  22/12/2017 December   17   4800       4874   74
4  22/12/2017 December   17   4769       4874  105
5  02/01/2018  January   18   4711         NA   NA
6  03/01/2018  January   18   4503         NA   NA
7  04/01/2018  January   18   4650         NA   NA
8  05/01/2018  January   18   4598         NA   NA
9  06/02/2018 February   18   4612         NA   NA
10 07/02/2018 February   18   4493         NA   NA
11 08/02/2018 February   18   4515         NA   NA
12 09/02/2018 February   18   4469         NA   NA
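
The NA rows in both outputs are Month/Year pairs that occur in act but have no counterpart in pred. A minimal base R sketch for listing such pairs before joining is given here. It uses miniature copies of act and pred (one row per month, values taken from this answer's input); the names keys, chk and unmatched are only illustrative.

```r
# Miniature versions of act and pred (values copied from this answer's input).
act <- data.frame(
  Month  = c("December", "January", "February"),
  Year   = c(17L, 18L, 18L),
  Actual = c(4623L, 4711L, 4612L),
  stringsAsFactors = FALSE
)
pred <- data.frame(
  Month      = c("December", "January", "February"),
  Year       = c(17L, 17L, 17L),
  Prediction = c(4874L, 4626L, 4576L),
  stringsAsFactors = FALSE
)

# Distinct Month/Year pairs that occur in the actuals.
keys <- unique(act[c("Month", "Year")])

# Left join onto pred; pairs without a prediction get NA.
chk <- merge(keys, pred, by = c("Month", "Year"), all.x = TRUE)

# Keep only the pairs that found no prediction.
unmatched <- chk[is.na(chk$Prediction), c("Month", "Year")]
unmatched
```

With this input, unmatched contains the January and February pairs for Year 18, which is exactly where the NA values appear above.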

Note

The input in reproducible form is:

Lines1 <- "
Date        Month     Year  Actual
02/12/2017  December    17  4623
12/12/2017  December    17  5111
22/12/2017  December    17  4800
22/12/2017  December    17  4769
02/01/2018  January     18  4711
03/01/2018  January     18  4503
04/01/2018  January     18  4650
05/01/2018  January     18  4598
06/02/2018  February    18  4612
07/02/2018  February    18  4493
08/02/2018  February    18  4515
09/02/2018  February    18  4469"
act <- read.table(text = Lines1, header = TRUE, as.is = TRUE)

Lines2 <- "
Month     Year  Prediction
December  17    4874
January   17    4626
February  17    4576"
pred <- read.table(text = Lines2, header = TRUE, as.is = TRUE)
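
Since the mismatch comes from hand-typed Year values, one possible safeguard is to derive Month and Year from the Date column instead of relying on the typed columns. The sketch below assumes the dates are in day/month/year format, as in the question, and uses only base R; the variable name d is illustrative.

```r
# Dates in the question's day/month/year format.
d <- as.Date(c("02/12/2017", "02/01/2018", "06/02/2018"), format = "%d/%m/%Y")

# month.name is a built-in vector of English month names, so this lookup
# does not depend on the session locale.
Month <- month.name[as.integer(format(d, "%m"))]

# Two-digit year, matching the Year column used in the question.
Year <- as.integer(format(d, "%y"))

data.frame(Date = d, Month = Month, Year = Year)
```

Keys built this way always agree with the dates themselves, so only the prediction table has to be checked by hand.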

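Answer 1 (score: 0)

Using G. Grothendieck's data again, and taking into account his comment about the years for January and February (so the predictions for those months use Year 18), you can do this with dplyr: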

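library(dplyr)
act %>% 
  full_join(pred, by = c("Month", "Year")) %>%  # join on the shared key columns
  mutate(Error = Actual - Prediction) %>%       # error = actual minus prediction
  select(-Prediction)                           # drop the helper column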



         Date    Month Year Actual Error
1  02/12/2017 December   17   4623  -251
2  12/12/2017 December   17   5111   237
3  22/12/2017 December   17   4800   -74
4  22/12/2017 December   17   4769  -105
5  02/01/2018  January   18   4711    85
6  03/01/2018  January   18   4503  -123
7  04/01/2018  January   18   4650    24
8  05/01/2018  January   18   4598   -28
9  06/02/2018 February   18   4612    36
10 07/02/2018 February   18   4493   -83
11 08/02/2018 February   18   4515   -61
12 09/02/2018 February   18   4469  -107
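
For comparison, the same Error column can be sketched in base R without a join, by looking up each row's prediction with match() on a combined Month/Year key. This is only an illustration on a subset of the question's rows; the names key_act, key_pred and idx are made up for the example.

```r
# A subset of the question's rows, with the corrected years in pred.
act <- data.frame(
  Date   = c("02/12/2017", "12/12/2017", "02/01/2018", "06/02/2018"),
  Month  = c("December", "December", "January", "February"),
  Year   = c(17, 17, 18, 18),
  Actual = c(4623, 5111, 4711, 4612),
  stringsAsFactors = FALSE
)
pred <- data.frame(
  Month      = c("December", "January", "February"),
  Year       = c(17, 18, 18),
  Prediction = c(4874, 4626, 4576),
  stringsAsFactors = FALSE
)

# Build a composite key per row and find the matching row of pred.
key_act  <- paste(act$Month, act$Year)
key_pred <- paste(pred$Month, pred$Year)
idx <- match(key_act, key_pred)  # NA where no prediction exists

# Error = Actual - Prediction, in the original row order of act.
act$Error <- act$Actual - pred$Prediction[idx]
act
```

Because match() only looks values up, the rows of act keep their original order and no Prediction column has to be dropped afterwards.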

Data:

Lines1 <- "
Date        Month     Year  Actual
02/12/2017  December    17  4623
12/12/2017  December    17  5111
22/12/2017  December    17  4800
22/12/2017  December    17  4769
02/01/2018  January     18  4711
03/01/2018  January     18  4503
04/01/2018  January     18  4650
05/01/2018  January     18  4598
06/02/2018  February    18  4612
07/02/2018  February    18  4493
08/02/2018  February    18  4515
09/02/2018  February    18  4469"
act <- read.table(text = Lines1, header = TRUE, as.is = TRUE)

Lines2 <- "
Month     Year  Prediction
December  17    4874
January   18    4626
February  18    4576"
pred <- read.table(text = Lines2, header = TRUE, as.is = TRUE)