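I have been trying to train a model on this dataset with xgboost. However, when I convert the data to a sparse matrix, I get the following error message: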
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
The length of labels must equal to the number of rows in the input data
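I am very confused, because the labels are derived from the dataset itself, so I do not understand how their length can differ from the number of rows in the sparse matrix.
As far as I can tell, the data frame has 2048 rows, and so does the label vector derived from it. However, when I turn it into a sparse matrix, 300 rows are added.
Can anyone think of a fix for this problem?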
require(xgboost)
require(methods)
require(Matrix)
require(data.table)
require(vcd)
require(dplyr)
train = read.csv("French Ligue 1 train.csv", header = TRUE, stringsAsFactors = F)
test = read.csv("French Ligue 1 test.csv", header = TRUE, stringsAsFactors = F)
df <- data.table(train, keep.rownames = F)
sparse_matrix <- sparse.model.matrix(Response ~.-1, data = df)
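# Response is the left-hand side of the formula, so it is not a column of sparse_matrix; take the labels from df instead
output_vector = df[, Response] == 1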
bst <- xgboost(data = sparse_matrix, label = output_vector, max.depth = 4,
eta = 1, nthread = 2, nrounds = 4, objective = "binary:logistic")
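One way to narrow this down is to compare the row counts directly before calling xgboost. The sketch below is only an illustration on made-up data: the `toy` data frame and its `goals` and `shots` columns are hypothetical and are not taken from the Ligue 1 files. It assumes the mismatch comes from missing values, which may or may not be what is happening with the real data. With R's default `na.action` setting (`na.omit`), `sparse.model.matrix` builds its model frame without the rows that contain `NA`, so the matrix and a label vector taken from the full data can end up with different lengths.

```r
library(Matrix)

# Hypothetical toy data standing in for the real training set:
# 6 rows, with one missing value in a predictor column.
toy <- data.frame(
  Response = c(1, 0, 1, 0, 1, 0),
  goals    = c(2, NA, 1, 0, 3, 1),
  shots    = c(10, 8, 7, 5, 12, 6)
)

# With the default na.action (na.omit), the row containing NA is dropped
# while the design matrix is built, but the labels still cover every row.
m_all <- sparse.model.matrix(Response ~ . - 1, data = toy)
labels_all <- toy$Response == 1
print(c(matrix_rows = nrow(m_all), n_labels = length(labels_all)))

# One possible way to keep both in sync: remove incomplete rows first,
# then derive the matrix and the labels from the same filtered data.
toy_complete <- toy[complete.cases(toy), ]
m_ok <- sparse.model.matrix(Response ~ . - 1, data = toy_complete)
labels_ok <- toy_complete$Response == 1
stopifnot(nrow(m_ok) == length(labels_ok))
```

If the two counts printed for the real data differ, checking `sum(!complete.cases(df))` shows how many rows contain missing values and whether that accounts for the gap.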