Question

I am trying to use an SVM classifier on MFCC features to classify natural versus spoofed speech.

import scipy.io as scio
import sklearn as sk
import numpy as np

from sklearn import svm
from sklearn import preprocessing

#Load feature for training
print("Training features for Natural Speech")
mat_nat = scio.loadmat('/home/speechlab/Documents/feature_cqcc/nat_featdd_train.mat')
mat_nat_ar =  mat_nat['genuineFeatureCell']
print (mat_nat_ar.shape) #(1507,1)

print("Training features for Spoofed Speech")
mat_sp = scio.loadmat('/home/speechlab/Documents/feature_cqcc/spf_featdd_train.mat')
mat_sp_ar =  mat_sp['spoofFeatureCell']
print (mat_sp_ar.shape) #(1507,1)

#Concatenating  natural and spoofed feature array
print ("Concatenating 2 arrays \n Natural feature array followed by Spoofed Feature")
feat_con =  np.concatenate((mat_nat_ar, mat_sp_ar),axis=0)
print (feat_con.shape) # (3014,1)

scaler = preprocessing.StandardScaler()
X_train = np.array([[mat_nat_ar],[mat_sp_ar]])
X_train = scaler.fit_transform(X_train)
print (X_train)
print (type(X_train))
y_train = np.array([[0,1]])
print (type(y_train))
clf = svm.LinearSVC()
clf.fit(X_train, y_train)

I get this error:

File "feature_cqcc.py", line 27, in <module>
    X_train = scaler.fit_transform(X_train)
  File "/usr/local/lib/python3.5/dist-packages/sklearn/base.py", line 462, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py", line 625, in fit
    return self.partial_fit(X, y)
  File "/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py", line 649, in partial_fit
    force_all_finite='allow-nan')
  File "/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py", line 527, in check_array
    array = np.asarray(array, dtype=dtype, order=order)
  File "/usr/local/lib/python3.5/dist-packages/numpy/core/numeric.py", line 501, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

I tried converting X_train and y_train to lists, but I get the same error. I understand that the error is caused by a data mismatch. How can I fix it?

Answer 1

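StandardScaler() only accepts 2-D arrays.

From the documentation:

Parameters: X : {array-like, sparse matrix}, shape [n_samples, n_features]. The data used to compute the mean and standard deviation used for later scaling along the features axis.

y : Ignored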
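To see why the validation step fails, here is a minimal sketch that reproduces the error with NumPy alone. It assumes that the cell arrays loaded by scio.loadmat are object arrays of shape (n_utterances, 1) whose cells each hold a feature matrix; the question does not show the cell contents, so the synthetic data, the (n_coeffs, n_frames) layout, and the helper name to_float_matrix are illustrative assumptions.

```python
import numpy as np

# Hypothetical stand-in for a MATLAB cell array loaded via scipy.io.loadmat:
# an object array of shape (n_utterances, 1), each cell holding a
# (n_coeffs, n_frames) matrix whose frame count varies per utterance.
rng = np.random.default_rng(0)
cells = np.empty((3, 1), dtype=object)
for i in range(3):
    cells[i, 0] = rng.normal(size=(4, 5 + i))

print(cells.shape, cells.dtype)


def to_float_matrix(a):
    """Mimic the float conversion done during scikit-learn's input validation."""
    return np.asarray(a, dtype=np.float64)


try:
    to_float_matrix(cells)
    conversion_failed = False
except ValueError as exc:
    # Each cell is itself an array, so it cannot become a single float.
    conversion_failed = True
    print("ValueError:", exc)
```

The outer array is 2-D, but its elements are arrays rather than numbers, so it is not a numeric [n_samples, n_features] matrix.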

So you will have to write your own custom normalization for your ndarray.
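One possible way to do this, sketched under stated assumptions rather than taken from the question's data: reduce every utterance to a fixed-length vector, stack the vectors into a numeric 2-D array, and only then scale and fit. The sketch assumes each cell holds an (n_coeffs, n_frames) matrix and uses mean pooling over frames, which is only one of several reasonable choices. The helpers make_cells and pool_cells and the synthetic data are hypothetical. Note also that y_train needs one label per row of X_train, not a single [0, 1] pair.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)


def make_cells(n_utts, n_coeffs, offset):
    """Synthetic stand-in for a loadmat cell array of shape (n_utts, 1)."""
    cells = np.empty((n_utts, 1), dtype=object)
    for i in range(n_utts):
        n_frames = 20 + i  # frame count differs per utterance
        cells[i, 0] = rng.normal(loc=offset, size=(n_coeffs, n_frames))
    return cells


def pool_cells(cells):
    """Average each (n_coeffs, n_frames) matrix over frames -> (n_utts, n_coeffs)."""
    return np.vstack([cell.mean(axis=1) for cell in cells.ravel()])


# Assumed layout: one cell per utterance, natural first, spoofed second.
nat_cells = make_cells(10, 6, offset=0.0)
spf_cells = make_cells(10, 6, offset=3.0)

# Numeric 2-D feature matrix of shape (n_samples, n_features).
X_train = np.vstack([pool_cells(nat_cells), pool_cells(spf_cells)])
# One label per sample: 0 = natural, 1 = spoofed.
y_train = np.concatenate([np.zeros(10, dtype=int), np.ones(10, dtype=int)])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_train)

clf = LinearSVC()
clf.fit(X_scaled, y_train)
train_acc = clf.score(X_scaled, y_train)
print(X_scaled.shape, y_train.shape)
```

With real data, nat_cells and spf_cells would be mat_nat_ar and mat_sp_ar from the question, provided their cells follow the assumed layout. If the cells are stored as (n_frames, n_coeffs) instead, the pooling axis would have to change to 0.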
