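I have been using gbm through caret without any problems, but after removing some variables from my data frame it started to fail. I have tried both the GitHub and CRAN versions of both packages.
Here is the error: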
> fitRF = train(my_data[trainIndex,vars_for_clust], clusterAssignment[trainIndex], method = "gbm", verbose=T)
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :9 NA's :9
Error in train.default(my_data[trainIndex, vars_for_clust], clusterAssignment[trainIndex], :
Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In eval(expr, envir, enclos) :
model fit failed for Resample01: shrinkage=0.1, interaction.depth=1, n.minobsinnode=10, n.trees=150 Error in gbm.fit(x = structure(list(relatedness_cottle = c(0, 0, 8, 6, :
unused arguments (x = list(relatedness_cottle = c(0, 0, 8, 6, 0, 6, 8, 10, 10, 6, 6, 4, 4, 4, 0, 0, 0, 0, 18, 18, 18, 0, 0, 6, 6, 0, 18, 12, 0, 4, 4, 4, 0, 0, 0, 18, 18, 6, 4, 4, 4, 6, 8, 6, 6, 0, 14, 2, 0, 8, 6, 6, 0, 4, 0, 0, 0, 0, 0, 4, 8, 8, 8, 4, 18, 0, 0, 4, 10, 18, 6, 0, 0, 18, 10, 10, 6, 2, 4, 4, 10, 10, 10, 2, 8, 0, 0, 0, 0, 10, 6, 6, 0, 4, 4, 0, 0, 0, 0, 8, 0, 0, 4, 4, 6, 6, 10, 6, 0, 0, 6, 4, 4, 8, 0, 12, 6, 2, 2, 8, 8, 4, 4, 4, 4, 6, 2, 2, 4, 0, 6, 0, 0, 0, 12, 18, 8, 0, 0, 4, 4, 2, 0, 0, 0, 0, 18,
12, 6, 6, 4, 4, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6, 18, 0, 0, 18, 6, 4, 2, 2, 0, 0, 10, 0, 0, 0, 12, 4, 4, 4, 4, 4, 8, 18, 6, 18, 18, 12, 12, 12, 0, 0, 0, 0, 10, 12, 12, 12, 12, 12, 4, 4, 4, 6, 6, 6, 6, 12, 0, 6, 0, 0, 4, 4, 18, 18, 18, 0, 0, 4, 6, 6, 0, 0, 2, 0, 0, 0, 18, 12, 12, 0, 0, 0, 0, 0, 0, 18 [... truncated]
There are no missing values, the response is a 4-level factor, and the input looks like this:
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1165 obs. of 14 variables:
$ relatedness_cottle : num 0 0 8 8 0 6 0 6 6 0 ...
$ dominance_cottle : int 4 6 0 6 6 6 6 4 4 4 ...
$ time_spent : num 26832 20822 18893 13107 25406 ...
$ num_color_changes : num 3.33 2.33 1.33 1 1 ...
$ num_selects : num 1 0.667 2 0.667 1.667 ...
$ show_select_match : num 1 0.667 0.333 1 1 ...
$ default_size : num 0.667 0 0.667 0 0 ...
$ select_order : Factor w/ 6 levels "future_past_present",..: 1 4 4 2 5 1 4 6 6 4 ...
$ order_x : Factor w/ 6 levels "future_past_present",..: 4 4 4 4 4 3 4 4 4 4 ...
$ color_past : Factor w/ 8 levels "black","blue",..: 5 1 6 8 5 7 1 6 6 5 ...
$ color_present : Factor w/ 8 levels "black","blue",..: 1 4 4 4 6 8 4 4 1 4 ...
$ color_future : Factor w/ 8 levels "black","blue",..: 2 2 2 2 2 2 1 2 8 2 ...
$ dominance_cottle_future : int 0 4 0 4 2 0 4 2 2 0 ...
$ relatedness_cottle_future: int 0 2 4 4 0 4 0 2 4 0 ...
However, if I call gbm directly on the data frame, it works:
summary(gbm(clusterAssignment[trainIndex] ~ ., data = my_data[trainIndex,vars_for_clust]))
Distribution not specified, assuming multinomial ...
var rel.inf
color_present color_present 33.533673
dominance_cottle dominance_cottle 33.170138
default_size default_size 25.321566
dominance_cottle_future dominance_cottle_future 5.674563
color_future color_future 2.300060
relatedness_cottle relatedness_cottle 0.000000
time_spent time_spent 0.000000
num_color_changes num_color_changes 0.000000
num_selects num_selects 0.000000
show_select_match show_select_match 0.000000
select_order select_order 0.000000
order_x order_x 0.000000
color_past color_past 0.000000
relatedness_cottle_future relatedness_cottle_future 0.000000
Answer 0: (score: 4)
For now, converting the data frame from a plyr / dplyr data frame to a plain data frame with as.data.frame() solves the problem.
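A minimal sketch of that workaround, assuming the tibble, caret and gbm packages are installed. The built-in iris data stands in for the question's my_data, and the names tbl, x, y and fit are illustrative only, not taken from the original post:

```r
library(caret)
library(tibble)

set.seed(1)

# Stand-in for the question's data: a tibble with classes tbl_df / tbl / data.frame
tbl <- as_tibble(iris)

# Strip the tibble classes so train() receives a plain data.frame
x <- as.data.frame(tbl[, 1:4])
y <- tbl$Species

class(x)  # "data.frame"

# Same style of call as in the question, now on the plain data frame;
# verbose = FALSE is passed through to gbm to silence its progress output
fit <- train(x = x, y = y, method = "gbm", verbose = FALSE)
```

With the question's own objects, the equivalent would be to wrap my_data[trainIndex, vars_for_clust] in as.data.frame() before passing it to train().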
Answer 1: (score: 2)
I had the same problem with the glm method. It was resolved after removing the verbose option...
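As a rough illustration of this answer, the sketch below fits method = "glm" through caret with no verbose argument in the call. It assumes caret is installed, and it uses a two-class subset of iris as stand-in data; two_class and fit_glm are hypothetical names:

```r
library(caret)

set.seed(1)

# Two-class stand-in data: drop one species and its unused factor level
two_class <- droplevels(iris[iris$Species != "setosa", ])

# No verbose argument here, so nothing extra is forwarded to the underlying model function
fit_glm <- train(x = two_class[, 1:4], y = two_class$Species, method = "glm")
```

Any extra arguments given to train() are forwarded to the underlying fitting function, so an option that one model accepts may be rejected by another.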
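Answer 2: (score: 0)
With some caret methods, this problem occurs when the user tries to predict a multinomial classification but the algorithm, or the current set of parameters, only allows a binary {0,1} outcome.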