Array dimensions

Time: 2015-09-17 08:23:30

Tags: python arrays python-3.x numpy scikit-learn

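I compute an array whose shape is 800 x 1140. The ndarrays come from a previous step and are stacked with hstack. I need to feed the result into scikit-learn for training, and I get the following error: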

ValueError: Found array with dim 1140. Expected 800

I think my error may be similar to this one, but I don't know how to proceed.

Can someone give me a pointer? Here is the code that causes the error. It fails when the XTrain line runs:

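# Imports assumed from the names used below (older scikit-learn API, consistent with the traceback)
from sklearn import cross_validation as cv, preprocessing, svm
from sklearn.preprocessing import Imputer

# Standardize the features, then replace NaNs with the column mean
X_scaled = preprocessing.scale(self.featureMatrix)
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
X_scaled = imp.fit_transform(X_scaled)
classiFier = svm.SVC(C=10, cache_size=1500, class_weight=None, coef0=0.0, degree=3, gamma=0.0, kernel='rbf', max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001, verbose=False)
# Hold out 40% of the samples for testing; this call raises the ValueError
XTrain, XTest, yTrain, yTest = cv.train_test_split(X_scaled,
                                                   self.classID,
                                                   test_size=0.4,
                                                   random_state=0)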

Here is the full traceback:

Traceback (most recent call last):
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.3\helpers\pydev\pydevd.py", line 2358, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.3\helpers\pydev\pydevd.py", line 1778, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/vaidvj/svn/idmt/core/wrappers/afp/test/data/Urban_Sound_DB/train&testSVM_MF.py", line 133, in <module>
    c.process()
  File "C:/Users/vaidvj/svn/idmt/core/wrappers/afp/test/data/Urban_Sound_DB/train&testSVM_MF.py", line 129, in process
    self.confMatcal( self.MfeatureMatrix, self.classID, self.uniqueClassLabels)
  File "C:/Users/vaidvj/svn/idmt/core/wrappers/afp/test/data/Urban_Sound_DB/train&testSVM_MF.py", line 49, in confMatcal
    random_state=0)
  File "C:\Anaconda3\lib\site-packages\sklearn\cross_validation.py", line 1556, in train_test_split
    arrays = check_arrays(*arrays, **options)
  File "C:\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 254, in check_arrays
    % (size, n_samples))
ValueError: Found array with dim 1140. Expected 800
Thank you.

1 answer:

Answer 0 (score: 0)

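You should simply transpose your data with numpy.transpose() or yourArray.T. scikit-learn expects an array of shape (n_samples, n_features), where n_samples is the number of observations and n_features is the dimensionality of the space they live in. The error message suggests that your feature matrix has 800 rows while your label array has 1140 entries, so your samples are most likely stored along the columns.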

For an example, see the doc of np.transpose().
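
As a rough illustration of the advice above, here is a minimal sketch using NumPy only. The data is made up: the names feature_matrix and class_id, the random values, and the assumption that each sample was an 800-dimensional column before hstack are hypothetical stand-ins for the asker's self.featureMatrix and self.classID, not taken from the original code.

```python
import numpy as np

# Hypothetical stand-ins for the asker's data: 1140 samples, each an
# 800-dimensional feature vector stored as a column (an assumption).
n_features, n_samples = 800, 1140
rng = np.random.default_rng(0)
columns = [rng.normal(size=(n_features, 1)) for _ in range(n_samples)]

# Stacking column vectors side by side puts samples along axis 1.
feature_matrix = np.hstack(columns)              # shape (800, 1140)
class_id = rng.integers(0, 10, size=n_samples)   # one label per sample

# scikit-learn wants samples along axis 0, so transpose first.
X = feature_matrix.T                             # same as np.transpose(feature_matrix)

print(feature_matrix.shape, X.shape, class_id.shape)
# (800, 1140) (1140, 800) (1140,)

# The first dimension of X now matches the number of labels.
assert X.shape[0] == class_id.shape[0]
```

Once the first dimension of X equals the length of the label array, the length check that train_test_split performs on its inputs should no longer fail. If the samples really are the 800 rows, the mismatch lies in the labels instead, and transposing would not be the right fix.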