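I have a data frame (train3) with 10 numeric variables and one factor.
I want to build a random forest classifier using the rfsrc function from the randomForestSRC package.
The data looks like this: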
summary(train3)
   roll_belt        pitch_belt          yaw_belt       total_accel_belt  gyros_belt_x        gyros_belt_y
 Min.   :-28.90   Min.   :-55.8000   Min.   :-180.00   Min.   : 0.00    Min.   :-1.040000   Min.   :-0.64000
 1st Qu.:  1.10   1st Qu.:  1.7600   1st Qu.: -88.30   1st Qu.: 3.00    1st Qu.:-0.030000   1st Qu.: 0.00000
 Median :113.00   Median :  5.2800   Median : -13.00   Median :17.00    Median : 0.030000   Median : 0.02000
 Mean   : 64.41   Mean   :  0.3053   Mean   : -11.21   Mean   :11.31    Mean   :-0.005592   Mean   : 0.03959
 3rd Qu.:123.00   3rd Qu.: 14.9000   3rd Qu.:  12.90   3rd Qu.:18.00    3rd Qu.: 0.110000   3rd Qu.: 0.11000
 Max.   :162.00   Max.   : 60.3000   Max.   : 179.00   Max.   :29.00    Max.   : 2.220000   Max.   : 0.64000
  gyros_belt_z      accel_belt_x       accel_belt_y     accel_belt_z     classe
 Min.   :-1.4600   Min.   :-120.000   Min.   :-69.00   Min.   :-275.00   A:5580
 1st Qu.:-0.2000   1st Qu.: -21.000   1st Qu.:  3.00   1st Qu.:-162.00   B:3797
 Median :-0.1000   Median : -15.000   Median : 35.00   Median :-152.00   C:3422
 Mean   :-0.1305   Mean   :  -5.595   Mean   : 30.15   Mean   : -72.59   D:3216
 3rd Qu.:-0.0200   3rd Qu.:  -5.000   3rd Qu.: 61.00   3rd Qu.:  27.00   E:3607
 Max.   : 1.6200   Max.   :  85.000   Max.   :164.00   Max.   : 105.00
My call to rfsrc is as follows:
fit = rfsrc (classe ~ ., data = train3)
Error in parseFormula(formula, data) :
the y-outcome must be either real or a factor.
Yet classe does appear to be a factor:
str(train3)
Classes ‘tbl_df’ and 'data.frame': 19622 obs. of 11 variables:
$ roll_belt : num 1.41 1.41 1.42 1.48 1.48 1.45 1.42 1.42 1.43 1.45 ...
$ pitch_belt : num 8.07 8.07 8.07 8.05 8.07 8.06 8.09 8.13 8.16 8.17 ...
$ yaw_belt : num -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 ...
$ total_accel_belt: int 3 3 3 3 3 3 3 3 3 3 ...
$ gyros_belt_x : num 0 0.02 0 0.02 0.02 0.02 0.02 0.02 0.02 0.03 ...
$ gyros_belt_y : num 0 0 0 0 0.02 0 0 0 0 0 ...
$ gyros_belt_z : num -0.02 -0.02 -0.02 -0.03 -0.02 -0.02 -0.02 -0.02 -0.02 0 ...
$ accel_belt_x : int -21 -22 -20 -22 -21 -21 -22 -22 -20 -21 ...
$ accel_belt_y : int 4 4 5 3 2 4 3 4 2 4 ...
$ accel_belt_z : int 22 22 23 21 24 21 21 21 24 22 ...
$ classe : Factor w/ 5 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
What am I missing? The y-outcome is classe, which is a factor.
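One detail in the str() output that may be worth checking is that train3 carries the class tbl_df in addition to data.frame. The sketch below uses a small made-up tibble (toy, not the real data) to show that single-column `[` indexing on a tibble returns a one-column tibble rather than a factor vector, whereas a plain data.frame returns the vector. Whether rfsrc extracts the outcome this way internally is an assumption, not something confirmed here; the sketch only shows a check that could be run on train3.

```r
library(tibble)

# Hypothetical toy data standing in for train3 (names are illustrative only)
toy <- tibble(
  x      = c(1.5, 2.5, 3.5, 4.5),
  classe = factor(c("A", "B", "A", "B"))
)

# `$` extracts the column itself, so this reports a factor
via_dollar <- is.factor(toy$classe)

# `[` on a tibble keeps the tibble wrapper, so this is not a factor
via_bracket <- is.factor(toy[, "classe"])

# Converting to a plain data.frame restores the usual drop-to-vector behaviour
toy_df <- as.data.frame(toy)
via_bracket_df <- is.factor(toy_df[, "classe"])

print(c(via_dollar, via_bracket, via_bracket_df))
```

If the same pattern shows up on train3, passing as.data.frame(train3) to rfsrc would be a reasonable thing to try.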