I have a data frame like this in R:
library(tibble)  # tibble() replaces the deprecated data_frame()
set.seed(10)
sample <- tibble(Date = c('2000-05-01', '2000-05-02', '2000-05-03', '2000-05-04', '2000-05-05', '2000-05-06'),
                 T1 = rnorm(6),
                 T2 = rnorm(6),
                 T3 = rnorm(6),
                 T1_a = rnorm(6),
                 T1_b = rnorm(6),
                 T1_c = rnorm(6),
                 T2_a = rnorm(6),
                 T2_b = rnorm(6),
                 T2_c = rnorm(6),
                 T3_a = rnorm(6),
                 T3_b = rnorm(6),
                 T3_c = rnorm(6))
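I want to use the MLmetrics package to compute the root mean squared error (RMSE) of each T column against its _a, _b and _c variants: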
library(MLmetrics)
RMSE_T1_a = RMSE(sample$T1, sample$T1_a)
RMSE_T1_b = RMSE(sample$T1, sample$T1_b)
RMSE_T1_c = RMSE(sample$T1, sample$T1_c)
RMSE_T2_a = RMSE(sample$T2, sample$T2_a)
RMSE_T2_b = RMSE(sample$T2, sample$T2_b)
RMSE_T2_c = RMSE(sample$T2, sample$T2_c)
RMSE_T3_a = RMSE(sample$T3, sample$T3_a)
RMSE_T3_b = RMSE(sample$T3, sample$T3_b)
RMSE_T3_c = RMSE(sample$T3, sample$T3_c)
Finally, I want to collect all of these RMSE values in a single data frame.
Is there a faster way to do all of this in one go?
Answer 0 (score: 1)
One way is to use the dplyr package, like this:
library(dplyr)
rmsedata <- sample %>%
  summarise_at(vars(matches("T1_")), ~RMSE(T1, .x)) %>%
  bind_cols(sample %>% summarise_at(vars(matches("T2_")), ~RMSE(T2, .x))) %>%
  bind_cols(sample %>% summarise_at(vars(matches("T3_")), ~RMSE(T3, .x)))
      T1_a      T1_b    T1_c     T2_a      T2_b      T2_c      T3_a     T3_b     T3_c
1 1.391521 0.6828504 1.61983 1.195112 0.8101942 0.8953161 0.7983381 1.396028 1.171313
A small tip: don't name your data sample. R already has a function called sample, so reusing that name for your data invites confusion :)
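If more T columns are added later, the pairing between each target column and its variants can be derived from the column names instead of being written out once per prefix. Below is a minimal base R sketch of that idea. It assumes the naming pattern from the question (a target such as T1 and predictions such as T1_a). The names df, rmse, pred_cols, target_cols and rmse_long are hypothetical and introduced only for illustration, and rmse is a small stand-in that computes sqrt(mean((actual - predicted)^2)) so the example runs without MLmetrics.

```r
# Hypothetical stand-in for MLmetrics::RMSE so the sketch needs no packages.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

set.seed(10)
# `df` is an illustrative name; the columns mirror the question's layout.
df <- data.frame(
  T1 = rnorm(6), T2 = rnorm(6), T3 = rnorm(6),
  T1_a = rnorm(6), T1_b = rnorm(6), T1_c = rnorm(6),
  T2_a = rnorm(6), T2_b = rnorm(6), T2_c = rnorm(6),
  T3_a = rnorm(6), T3_b = rnorm(6), T3_c = rnorm(6)
)

# Columns named like "T1_a" hold predictions; the part before "_" names the target.
pred_cols <- grep("^T[0-9]+_", names(df), value = TRUE)
target_cols <- sub("_.*$", "", pred_cols)

# Pair each prediction column with its target column and compute the RMSE.
rmse_values <- mapply(function(target, pred) rmse(df[[target]], df[[pred]]),
                      target_cols, pred_cols)

# One row per comparison, all collected in a single data frame.
rmse_long <- data.frame(target = target_cols,
                        prediction = pred_cols,
                        rmse = unname(rmse_values))
print(rmse_long)
```

This gives a long table with one row per target/prediction pair rather than the one-row wide layout above. Which shape is more convenient depends on what you do with the values next.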