I convert a factor in my dataset to numbers like this:
library(plyr)  # revalue() is provided by plyr, not dplyr
df = data.frame(level= c( 'low', 'medium', 'high', 'very high'))
df$level = as.numeric(revalue(df$level, c('low' = 1, 'medium' =2, 'high'= 3, 'very high'=4)))
df
That works fine. The problem appears when I try to apply the same rule to a new dataset (I trained a model and want to predict on new data):
newdude = data.frame(level = c( 'high'))
newdude$level = as.numeric(revalue(newdude$level, c('low' = 1, 'medium' =2, 'high'= 3, 'very high'=4)))
Error
The following `from` values were not present in `x`: low, medium, very high
> newdude
level
1 1
I get 1 instead of 3. I can't simply do
newdude$level = as.numeric(revalue(newdude$level, c( 'high'= 3)))
because I can't know in advance which value the new data will contain.
How can I solve this?
Answer 0 (score: 2)
Try this instead:
newdude = data.frame(level = factor('high', levels = c('low', 'medium', 'high', 'very high')))
newdude$level
[1] high
Levels: low medium high very high
as.numeric(newdude$level)
[1] 3
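The same idea can be applied to both the training data and the new data by defining the level order once and reusing it. The following is only a minimal sketch: `level_order` and `level_to_number` are hypothetical names introduced for illustration, and it assumes every label in the new data is one of the four known levels (anything else becomes `NA`).

```r
# Levels in the intended order, defined once and reused for every dataset.
level_order <- c("low", "medium", "high", "very high")

# Hypothetical helper: map each label to its position in level_order.
# factor() with explicit levels does not depend on which labels
# happen to be present in x, so a single-row data frame works too.
level_to_number <- function(x) {
  as.integer(factor(x, levels = level_order))
}

df <- data.frame(level = c("low", "medium", "high", "very high"))
newdude <- data.frame(level = c("high"))

df$level <- level_to_number(df$level)
newdude$level <- level_to_number(newdude$level)

df$level       # 1 2 3 4
newdude$level  # 3
```

Because the levels are passed explicitly, the codes no longer depend on alphabetical ordering or on which values occur in a given dataset, which is what produced 1 instead of 3 in the question.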