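We are using Python with scikit-learn's MLPClassifier, and we are getting relatively poor results. As a sanity check, we tried adding the desired output y to the input set X. In that case you would expect a score of 100%. That is not what happens: the score is rather low (20%), which is much better than random guessing but still very bad. We have about 150 inputs (mostly booleans) and 1 output, which is an integer between 1 and 1500. When we bin the output into about 5 categories (integers between 0 and 4), the score is about 96%.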
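To put "better than random guessing" into numbers, here is a minimal sketch of two chance-level baselines. The function names and the toy label list are made up for illustration, and the majority-class figure for our real data depends on a label distribution not shown here.

```python
from collections import Counter


def uniform_chance_accuracy(n_classes):
    """Expected accuracy when guessing uniformly among n_classes labels."""
    return 1.0 / n_classes


def majority_class_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(labels)


# Uniform guessing over the raw output (1500 possible values) and over 5 bins.
print(uniform_chance_accuracy(1500))
print(uniform_chance_accuracy(5))

# Hypothetical labels, only to show the call; not our data.
toy_labels = [0, 0, 0, 1, 2, 2, 3, 4]
print(majority_class_accuracy(toy_labels))
```

Uniform guessing over 1500 values is right less than 0.1% of the time, while over 5 bins it is right 20% of the time, so the two scores above are measured against very different baselines.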
import numpy as np
from sklearn.neural_network import MLPClassifier
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
scaler = StandardScaler()
xset = np.genfromtxt('xkey.csv', delimiter=",")
yset = np.genfromtxt('ykey.csv', delimiter=",")
# yset = np.rint((5-1)*(yset)/np.max(yset))
print("Number of categories: " + str(np.max(yset) + 1))
X_train, X_test, y_train, y_test = train_test_split(xset, yset, test_size=0.10)
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
# The `algorithm` parameter is called `solver` in released scikit-learn versions
clf = MLPClassifier(solver='adam', alpha=1e-5, hidden_layer_sizes=(100, 20))
clf.fit(X_train, y_train)
score = clf.score(X_test,y_test)
print ("score: "+str(score))
ypredicted = clf.predict(X_test)
sqerr = mean_squared_error(y_test, ypredicted)
err = mean_absolute_error(y_test, ypredicted)
print ("err: " + str(err))
print ("sqerr: " + str(sqerr))
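For reference, the binning in the commented-out line above can be sketched in plain Python. This is only an illustration: `bin_label` is a hypothetical helper, and the label list assumes exactly one sample per value from 1 to 1500, which is not how our data is actually distributed.

```python
from collections import Counter


def bin_label(y, y_max, n_bins=5):
    """Map y in [0, y_max] to an integer bin in [0, n_bins - 1].

    Mirrors np.rint((n_bins - 1) * y / y_max); Python's round(),
    like np.rint, rounds exact halves to the nearest even integer.
    """
    return int(round((n_bins - 1) * y / y_max))


# Assumption for illustration: one sample for every label value 1..1500.
labels = range(1, 1501)
bin_counts = Counter(bin_label(y, 1500) for y in labels)

for b in sorted(bin_counts):
    print(b, bin_counts[b])
```

Under that uniform assumption the two end bins are about half as wide as the middle ones (187 and 188 label values versus 375 each), because rounding to the nearest integer centers each bin on 0, 1, 2, 3 and 4.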