I am trying to use an SVM classifier on MFCC features to classify natural versus spoofed speech.
import scipy.io as scio
import sklearn as sk
import numpy as np
from sklearn import svm
from sklearn import preprocessing
#Load feature for training
print("Training features for Natural Speech")
mat_nat = scio.loadmat('/home/speechlab/Documents/feature_cqcc/nat_featdd_train.mat')
mat_nat_ar = mat_nat['genuineFeatureCell']
print (mat_nat_ar.shape) #(1507,1)
print("Training features for Spoofed Speech")
mat_sp = scio.loadmat('/home/speechlab/Documents/feature_cqcc/spf_featdd_train.mat')
mat_sp_ar = mat_sp['spoofFeatureCell']
print (mat_sp_ar.shape) #(1507,1)
#Concatenating natural and spoofed feature array
print ("Concatenating 2 arrays \n Natural feature array followed by Spoofed Feature")
feat_con = np.concatenate((mat_nat_ar, mat_sp_ar),axis=0)
print (feat_con.shape) # (3014,1)
scaler = preprocessing.StandardScaler()
X_train = np.array([[mat_nat_ar],[mat_sp_ar]])
X_train = scaler.fit_transform(X_train)
print (X_train)
print (type(X_train))
y_train = np.array([[0,1]])
print (type(y_train))
clf = svm.LinearSVC()
clf.fit(X_train, y_train)
I get this error:
File "feature_cqcc.py", line 27, in <module>
X_train = scaler.fit_transform(X_train)
File "/usr/local/lib/python3.5/dist-packages/sklearn/base.py", line 462, in fit_transform
return self.fit(X, **fit_params).transform(X)
File "/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py", line 625, in fit
return self.partial_fit(X, y)
File "/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/data.py", line 649, in partial_fit
force_all_finite='allow-nan')
File "/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py", line 527, in check_array
array = np.asarray(array, dtype=dtype, order=order)
File "/usr/local/lib/python3.5/dist-packages/numpy/core/numeric.py", line 501, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
I tried converting X_train and y_train to lists, but I get the same error. I understand that the error is caused by a data mismatch. How can I fix it?
Answer 0 (score: 0)
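StandardScaler() accepts only 2-D arrays. From the documentation of fit:
Parameters: X : {array-like, sparse matrix}, shape [n_samples, n_features]. The data used to compute the mean and standard deviation used for later scaling along the features axis.
y : ignored.
Your X_train is built as np.array([[mat_nat_ar],[mat_sp_ar]]), which has more than two dimensions and holds nested arrays, so the scaler cannot convert it into a numeric [n_samples, n_features] matrix.
So you will have to write your own custom normalization for this ndarray.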
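One way around this, as a minimal sketch rather than the answerer's own method, is to reduce each utterance to a fixed-length vector first, so that the data becomes a genuine [n_samples, n_features] matrix that StandardScaler() accepts. The sketch below assumes that each cell holds a numeric matrix of shape (n_coeffs, n_frames) with a varying number of frames, and it uses mean pooling over frames, which is a swapped-in simplification. The helpers make_fake_cell and pool_cell are hypothetical names, and the synthetic arrays only stand in for what scio.loadmat returns for a MATLAB cell array.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def make_fake_cell(n_utts, n_coeffs, rng, offset):
    """Hypothetical stand-in for a loadmat() cell array: object array of shape (n_utts, 1)."""
    cell = np.empty((n_utts, 1), dtype=object)
    for i in range(n_utts):
        n_frames = int(rng.integers(50, 120))  # utterances differ in length
        cell[i, 0] = rng.normal(loc=offset, size=(n_coeffs, n_frames))
    return cell


def pool_cell(cell, frame_axis=1):
    """Average each utterance over its frames and stack into (n_utts, n_coeffs).

    frame_axis=1 assumes matrices are laid out as (n_coeffs, n_frames).
    """
    rows = [np.asarray(m, dtype=float).mean(axis=frame_axis) for m in cell.ravel()]
    return np.vstack(rows)


rng = np.random.default_rng(0)
# Synthetic data; in the question these come from mat_nat['genuineFeatureCell'] etc.
mat_nat_ar = make_fake_cell(20, 13, rng, offset=0.0)
mat_sp_ar = make_fake_cell(20, 13, rng, offset=1.0)

# One row per utterance, one label per row (0 = natural, 1 = spoofed).
X_train = np.vstack([pool_cell(mat_nat_ar), pool_cell(mat_sp_ar)])
y_train = np.concatenate([
    np.zeros(len(mat_nat_ar), dtype=int),
    np.ones(len(mat_sp_ar), dtype=int),
])

# X_train is now a plain 2-D float array, so the scaler accepts it.
X_train = StandardScaler().fit_transform(X_train)

clf = LinearSVC()
clf.fit(X_train, y_train)
print(X_train.shape, y_train.shape)  # (40, 13) (40,)
```

Note that y_train needs one label per row of X_train, not the single pair [[0,1]] used in the question. If the real matrices are stored as (n_frames, n_coeffs), pass frame_axis=0 instead. Mean pooling discards temporal information, so it may not be the best representation for spoof detection.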