Keras prediction returns the same result every time

Date: 2018-12-20 13:06:27

Tags: python tensorflow machine-learning keras neural-network

I'm new to Keras. I just started locally with the example taken from here. The example data works fine. Then I modified some of the code to fit my own data (in my data file the result column comes first). When I run it again and try to predict on the inputs, it always returns the same result for every input row: [1. 0.], [1. 0.] .... Here is my code:

import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping
from keras.utils import to_categorical

#read in training data
train_df_2 = pd.read_csv('/Users/my_user/python-workspace/Deep-Learning-in-Keras-Tutorial/data/my_data.csv')

#view data structure
train_df_2.head()

#create a dataframe with all training data except the target column
train_X_2 = train_df_2.drop(columns=['result'])

target = train_df_2[['result']]

#check that the target variable has been removed
train_X_2.head()

#one-hot encode target column
train_y_2 = to_categorical(train_df_2.result)

#create model
model_2 = Sequential()

#get number of columns in training data
n_cols_2 = train_X_2.shape[1]

#add layers to model
model_2.add(Dense(25, activation='relu', input_shape=(n_cols_2,)))
model_2.add(Dense(25, activation='relu'))
model_2.add(Dense(2, activation='softmax'))
# model_2.add(Dense(10, input_dim=n_cols_2, kernel_initializer='normal', activation='relu'))
# model_2.add(Dense(25, activation='relu'))
# model_2.add(Dense(2, activation='softmax'))

#compile model using accuracy to measure model performance
model_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

#set early stopping monitor so the model stops training when it won't improve anymore
early_stopping_monitor = EarlyStopping(patience=3)

#train model
model_2.fit(train_X_2, train_y_2, epochs=30, validation_split=0.1, callbacks=[early_stopping_monitor])

p = model_2.predict(train_X_2, verbose=0, batch_size=1)
print(p)

A sample of my input data:

result,i1,i2,i3,i4
0,1770,2390,1750,1816
1,1675,2540,2029,1940
1,1770,2384,1765,1770
0,1690,2485,2075,1900
0,1680,2465,2050,1920
0,1770,2395,1744,1795
1,1675,2490,2050,1915
0,1768,2400,1740,1790
0,1675,2525,2050,1910 
.... (total 2312 rows)

Why does it always return the same result [1. 0.] for every row? I would expect at least some [0. 1.] rows. What am I doing wrong?

2 answers:

Answer 0 (score: 2):

You haven't normalized your input data. That hampers the training process and degrades the gradient updates, so your model may end up learning nothing. Try normalizing it, for example with sklearn.preprocessing.StandardScaler. Alternatively, you can do it by hand:

# compute per-column mean and standard deviation, then standardize in place
mean = train_X_2.mean(axis=0)
train_X_2 -= mean
std = train_X_2.std(axis=0)
train_X_2 /= std
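
For reference, here is a minimal sketch of the StandardScaler route (assuming train_X_2 is the feature DataFrame from the question; the scaled variable names are just illustrative):

from sklearn.preprocessing import StandardScaler

# fit the scaler on the training features and standardize them in one step
scaler = StandardScaler()
train_X_2_scaled = scaler.fit_transform(train_X_2)  # returns a NumPy array

# reuse the same fitted scaler (same mean/std) on any data you predict on later
# new_X_scaled = scaler.transform(new_X)

Either way, remember to apply the exact same transformation to any new data before calling predict.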

Answer 1 (score: 0):

I used your model, with a binary output, on the iris dataset, keeping only the first 100 rows so that the target contains just two classes:

def getmodel(n_cols_2):
    from keras.models import Sequential
    from keras.layers import Dense
    #create model
    model_2 = Sequential()
    #add layers to model
    model_2.add(Dense(25, activation='relu', input_shape=(n_cols_2,)))
    model_2.add(Dense(25, activation='relu'))
    model_2.add(Dense(1, activation='sigmoid')) 
    #compile model using accuracy to measure model performance
    model_2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model_2

from sklearn import datasets
iris = datasets.load_iris()
train_X_2 = iris.data[:100]
train_y_2 = iris.target[:100]
# print top 5 rows:
print(train_X_2[:5])
print(train_y_2[:5])
n_cols = iris.data.shape[1]
# get model: 
model = getmodel(n_cols)
# set early stopping monitor so the model stops training when it won't improve anymore
from keras.callbacks import EarlyStopping
early_stopping_monitor = EarlyStopping(patience=3)
#train model
model.fit(train_X_2, train_y_2, epochs=30, batch_size=10 , validation_split=0.2) # , callbacks=[early_stopping_monitor])
# predict and print classes
p = model.predict_classes(train_X_2, verbose=0, batch_size=10)
print(p.ravel())

The output is perfect (feature/target preview, training log, and predicted classes):

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
[0 0 0 0 0]
Using TensorFlow backend.
Train on 80 samples, validate on 20 samples
Epoch 1/100
80/80 [==============================] - 0s 3ms/step - loss: 0.7615 - acc: 0.3750 - val_loss: 0.4282 - val_acc: 1.0000
Epoch 2/100
80/80 [==============================] - 0s 88us/step - loss: 0.6658 - acc: 0.3750 - val_loss: 0.4944 - val_acc: 1.0000
Epoch 3/100
...
...
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]

Note that no normalization is used here. A lack of normalization can hurt the predictions, but it cannot on its own stop the network from working at all (as in your case).

Also note that the output layer here is a single sigmoid unit trained with binary_crossentropy, rather than a 2-unit softmax with categorical_crossentropy.

A conical (funnel) shape is commonly used, so your second Dense layer could probably have only around 12 neurons. Also, accuracy can often be improved by adding a Dropout layer after each Dense layer.
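
For illustration, a minimal sketch of that shape (the 12-neuron layer size and the 0.2 dropout rate are example values, not tuned for your data):

from keras.models import Sequential
from keras.layers import Dense, Dropout

model_2 = Sequential()
model_2.add(Dense(25, activation='relu', input_shape=(n_cols_2,)))
model_2.add(Dropout(0.2))                   # randomly drop 20% of activations during training
model_2.add(Dense(12, activation='relu'))   # narrower second layer (conical shape)
model_2.add(Dropout(0.2))
model_2.add(Dense(1, activation='sigmoid'))
model_2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])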

If you still get all 0s or all 1s, it may be that your data is fairly random and the features don't really predict the target.
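
As a quick sanity check (a sketch assuming the train_df_2 DataFrame from the question), you can look at the class balance and at simple feature/target correlations; a heavy class skew or near-zero correlations make constant predictions much more likely:

# share of each class in the target column
print(train_df_2['result'].value_counts(normalize=True))

# linear correlation of each feature with the target
print(train_df_2.corr()['result'].sort_values())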