Dimension-related problem when training LightGBM for multiclass, multilabel classification?

Time: 2019-10-20 07:50:27

Tags: scikit-learn training-data dimensions multiclass-classification lightgbm

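I want to use the LightGBM algorithm for multiclass, multilabel classification, but I ran into a problem during training because the input is not a list. DATA is a sample of rows; the actual number of rows is 10000.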

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:,np.r_[0:6, 7:27]].values
y = dataset.iloc[:,np.r_[6]].values

x_train, x_test, y_train, y_test = train_test_split(X, y,test_size = 0.25, random_state = 0)
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)
import lightgbm as lgb
d_train = lgb.Dataset(x_train, label=y_train)
params = {}
params['learning_rate'] = 0.003
params['boosting_type'] = 'gbdt'
params['objective'] = 'binary'
params['metric'] = 'binary_logloss'
params['sub_feature'] = 0.5
params['num_leaves'] = 10
params['min_data'] = 50
params['max_depth'] = 10
clf = lgb.train(params, d_train, 100)

y_pred=clf.predict(x_test)

for i in range(0,99):
    if y_pred[i]>=.5:
        y_pred[i]=1
    else:
        y_pred[i]=0

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

I ran into this error:

    clf = lgb.train(params, d_train, 100)
  File "..\lightgbm\engine.py", line 228, in train
    ...
  File "..\lightgbm\basic.py", line 1336, in set_label
    label = list_to_1d_numpy(_label_from_pandas(label), name='label')
  File "..\lightgbm\basic.py", line 86, in list_to_1d_numpy
    "It should be list, numpy 1-D array or pandas Series".format(type(data).__name__, name))
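The traceback is consistent with the label array being two-dimensional. Here is a minimal sketch, assuming a small made-up DataFrame in place of Data.csv (its contents are invented; only the layout of 27 columns with the label in column 6 follows the question). Selecting the label column with `np.r_[6]` keeps a column axis, so `.values` has shape `(n, 1)`, while a plain integer index gives the 1-D shape that the error message asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Data.csv: 5 rows, 27 columns, label in column 6.
rng = np.random.default_rng(0)
dataset = pd.DataFrame(rng.integers(0, 3, size=(5, 27)))

y_2d = dataset.iloc[:, np.r_[6]].values   # list-like selector keeps 2 dimensions
y_1d = dataset.iloc[:, 6].values          # scalar selector returns a 1-D array

print(y_2d.shape)  # (5, 1)
print(y_1d.shape)  # (5,)
```

If this matches the real data, passing the 1-D version as `label` should avoid the type check in `list_to_1d_numpy`.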

This error is raised in a function in basic.py whose docstring reads """Convert data to numpy 1-D array.""" So I tried changing the data to 1-D:

y_train = np.reshape(y_train, [1,trainsize])
x_train = np.reshape(x_train, [1,trainsize*26])

The problem was not solved! Then I used ravel to make x_train and y_train one-dimensional:

x_train = np.ravel(x_train)
y_train = np.ravel(y_train)

But a new error appeared:

  

\lib\site-packages\lightgbm\basic.py", line 872, in __init_from_np2d
    raise ValueError('Input numpy.ndarray must be 2 dimensional')
ValueError: Input numpy.ndarray must be 2 dimensional
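Both attempts can be checked on shapes alone. The sketch below uses made-up arrays with the same layout as in the question (26 feature columns; `trainsize` is set to 4 purely for illustration). `np.reshape(..., [1, trainsize])` still produces a 2-D array, and `np.ravel` on the feature matrix flattens it to 1-D, which would explain the second error. Flattening only the labels leaves the features two-dimensional.

```python
import numpy as np

trainsize = 4                              # illustrative value, not from the question
x_train = np.zeros((trainsize, 26))        # feature matrix: one row per sample
y_train = np.zeros((trainsize, 1))         # labels as produced by iloc[:, np.r_[6]]

# First attempt: the result is a 1 x n matrix, still two-dimensional.
print(np.reshape(y_train, [1, trainsize]).shape)   # (1, 4)

# Second attempt: ravel on the features destroys the row/column layout.
print(np.ravel(x_train).shape)                     # (104,)

# Flatten the labels only; keep the features as (n_samples, n_features).
y_flat = np.ravel(y_train)
print(x_train.shape, y_flat.shape)                 # (4, 26) (4,)
```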

What is wrong? How can I fix it?
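One further point, separate from the shape errors: the parameters above use `'binary'`, while the title mentions more than two classes. Below is a hedged sketch of how the settings and post-processing might look for a single multiclass label, assuming three classes encoded as 0, 1, 2. The class count and the probability matrix are invented for illustration; with LightGBM's `multiclass` objective, `predict` returns one probability column per class, so the 0.5 threshold loop no longer applies.

```python
import numpy as np

num_class = 3  # assumed number of classes; not stated in the question

# Parameter names are LightGBM's; the values are only a starting point.
params = {
    "objective": "multiclass",
    "num_class": num_class,
    "metric": "multi_logloss",
    "learning_rate": 0.003,
    "num_leaves": 10,
}

# Stand-in for clf.predict(x_test) under the multiclass objective:
# shape (n_samples, num_class), each row sums to 1.
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
])

# Pick the most probable class per row instead of thresholding at 0.5.
y_pred = np.argmax(proba, axis=1)
print(y_pred)  # [0 2 1]
```

If each row really carries several labels at once (multilabel), a single 1-D label array cannot express that, and a common workaround is to train one model per label.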

0 answers:

No answers