I am trying to run linear discriminant analysis with sklearn, but the lda.fit_transform method raises an unexpected error. Here is my code:
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
td_y = np.array([0.01749199, 0.02318867, 0.01573475, 0.01466889, 0.0132333 ])
# Wrapped in np.array so the code matches the printed types in the output (numpy.ndarray / numpy.float64)
train_data_t = np.array([[-0.08489971, 0.78123119, -0.78884559, 0.27773261, 0.4490089, -0.26513558],
                         [-0.17040665, 0.42349963, -0.74977033, -0.10699659, 0.31261445, -0.25117359],
                         [-0.271049, 0.16313721, -0.83591126, -0.13778149, 0.46943721, -0.23717101],
                         [-0.24065918, -0.09322174, -0.76637149, -0.34056252, 0.44016493, -0.22645931],
                         [-0.14514733, -0.3334085, -0.66783516, -0.11382202, 0.42566331, -0.21868078]])
print(td_y[:5])
print(type(td_y[0]))
print(type(td_y))
print(train_data_t[:5])
print(type(train_data_t[0][0]))
print(type(train_data_t))
lda = LDA(n_components=2)
train_data_out = lda.fit_transform(train_data_t[:5], td_y[:5])
Here is the output:
[0.01749199 0.02318867 0.01573475 0.01466889 0.0132333 ]
<class 'numpy.float64'>
<class 'numpy.ndarray'>
[[-0.08489971 0.78123119 -0.78884559 0.27773261 0.4490089 -0.26513558]
[-0.17040665 0.42349963 -0.74977033 -0.10699659 0.31261445 -0.25117359]
[-0.271049 0.16313721 -0.83591126 -0.13778149 0.46943721 -0.23717101]
[-0.24065918 -0.09322174 -0.76637149 -0.34056252 0.44016493 -0.22645931]
[-0.14514733 -0.3334085 -0.66783516 -0.11382202 0.42566331 -0.21868078]]
<class 'numpy.float64'>
<class 'numpy.ndarray'>
Traceback (most recent call last):
raise ValueError("Unknown label type: %s" % repr(ys))
ValueError: Unknown label type: (array([0.01749199, 0.02318867, 0.01573475, 0.01466889, 0.0132333 ]),)
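I am completely confused about why this error occurs. The documentation for fit_transform suggests that the X and y I pass in are fine, yet the code raises this error. The message seems to imply either that the y input contains duplicates (repr(ys)) or that the y input is a string (type: %s).
The error has been reported here, but the solution there does not seem to apply to my code.
I also tried np.vstack(td_y), but it raised the same error.
I would be grateful if anyone could help me debug this.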
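One way to see how sklearn classifies the y values is `type_of_target` from `sklearn.utils.multiclass`. The sketch below is only a diagnostic, on the assumption that this helper reflects the same label check that raises the error. The median-threshold binning at the end is purely hypothetical and not part of my original code; it only shows what a discrete label array looks like to the same helper.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

td_y = np.array([0.01749199, 0.02318867, 0.01573475, 0.01466889, 0.0132333])

# How sklearn categorizes the raw float targets
kind_raw = type_of_target(td_y)
print(kind_raw)

# Same check on the np.vstack(td_y) variant, which has shape (5, 1)
kind_stacked = type_of_target(np.vstack(td_y))
print(kind_stacked)

# Hypothetical discretization (illustration only): 1 if above the median, else 0
binned = (td_y > np.median(td_y)).astype(int)
kind_binned = type_of_target(binned)
print(binned, kind_binned)
```

Both float versions are reported as continuous, while the binned version is reported as binary. This suggests the classifier expects discrete class labels rather than real-valued targets. If binary labels like these were used, n_components=2 would presumably also have to drop to 1, since as far as I understand LDA allows at most min(n_features, n_classes - 1) components.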