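I have a data frame and need to compute scaled values of y within each group. I want to use glmnet or xgboost for prediction, and afterwards I will need to scale the results back for each group.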
library(dplyr)

df <- data.frame(group = rep(c("gr1", "gr2"), each = 10), y = rnorm(20, 2, 8))
df <- df %>%
  group_by(group) %>%
  mutate(scaled_y = scale(y))  # z-score of y within each group
Is there a way to reconstruct y?
Answer 0 (score: 1)
Do you mean you want to convert the z-scores back to the original values?
library(dplyr)
set.seed(5)
df <- data.frame(group = rep(c("gr1", "gr2"), each = 10), y = rnorm(20, 2, 8))
df %>%
  group_by(group) %>%
  # scale() gives (y - mean(y)) / sd(y) per group; y_raw inverts that
  mutate(scaled_y = scale(y),
         y_raw = mean(y) + (scaled_y * sd(y)))
# A tibble: 20 x 4
# Groups:   group [2]
   group        y scaled_y    y_raw
   <fct>    <dbl>    <dbl>    <dbl>
 1 gr1      -4.73   -0.800    -4.73
 2 gr1       13.1     1.54     13.1
 3 gr1      -8.04    -1.24    -8.04
 4 gr1       2.56    0.156     2.56
 5 gr1       15.7     1.88     15.7
 6 gr1      -2.82   -0.550    -2.82
 7 gr1      -1.78   -0.413    -1.78
 8 gr1      -3.08   -0.584    -3.08
 9 gr1     -0.286   -0.217   -0.286
10 gr1       3.10    0.228     3.10
11 gr2       11.8     1.88     11.8
12 gr2      -4.41   -0.352    -4.41
13 gr2      -6.64   -0.658    -6.64
14 gr2      0.740    0.357    0.740
15 gr2      -6.57   -0.649    -6.57
16 gr2      0.888    0.378    0.888
17 gr2      -2.78   -0.127    -2.78
18 gr2      -15.5    -1.87    -15.5
19 gr2       3.93    0.796     3.93
20 gr2    -0.0748    0.245  -0.0748
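
If the goal is to back-transform model predictions rather than the scaled column itself, the original y may not be available at that point, so the per-group mean and sd need to be stored beforehand. Below is a minimal sketch in base R under that assumption. The names y_mean, y_sd, pred_scaled and pred_y are made up for illustration, and pred_scaled is only a stand-in for whatever predict() on a glmnet or xgboost model would return on the scaled scale.

```r
set.seed(5)
df <- data.frame(group = rep(c("gr1", "gr2"), each = 10),
                 y = rnorm(20, 2, 8))

# Per-group mean and sd as named vectors, kept for later back-transformation.
y_mean <- sapply(split(df$y, df$group), mean)
y_sd   <- sapply(split(df$y, df$group), sd)

# Look up each row's group parameters by name.
g <- as.character(df$group)

# Same z-score that scale() produces within each group.
df$scaled_y <- as.numeric((df$y - y_mean[g]) / y_sd[g])

# Hypothetical stand-in for model predictions on the scaled scale;
# in practice this would come from predict() on the fitted model.
pred_scaled <- df$scaled_y

# Back-transform with the stored parameters of the matching group.
df$pred_y <- as.numeric(y_mean[g] + pred_scaled * y_sd[g])
```

For new data, the same lookup should work as long as the group labels match the names of y_mean and y_sd. Note also that scale() called on a single vector stores the values it used in the attributes "scaled:center" and "scaled:scale", which is another way to keep them around.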