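I am trying to build my first Keras deep neural network on TensorFlow and want to deploy it with Flask. I took the airline sample data and want to predict whether a flight is delayed. To start, I am using only these sample columns: Year, Month, DayOfWeek, UniqueCarrier, FlightNum, Origin, Dest, Distance. I used a label encoder to convert UniqueCarrier, Origin, and Dest to numeric values. After running the program below, I get an accuracy of 93%. But when I run a prediction manually by sending parameters through the REST API, I always get 1 as the output. I'm not sure what I need to do.
Here is some of the code and sample output: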
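from sklearn.preprocessing import LabelEncoder

# The same encoder instance is refit for each column, so after this
# block it only holds the mapping for "Dest".
le = LabelEncoder()
data["UniqueCarrier"] = le.fit_transform(data["UniqueCarrier"])
UniqueCarrier = list(le.classes_)
print(UniqueCarrier)
data["Origin"] = le.fit_transform(data["Origin"])
Carrier = list(le.classes_)  # holds the Origin classes
print(Carrier)
data["Dest"] = le.fit_transform(data["Dest"])
TailNum = list(le.classes_)  # holds the Dest classes
print(TailNum)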
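Because the code above refits a single LabelEncoder for each column, only the last mapping survives, which matters once string values arrive through a REST API and have to be encoded the same way as during training. A minimal sketch of one way to keep a fitted encoder per column follows; the small DataFrame and the `encoders` dict are made up for illustration and are not part of the original program.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny made-up frame standing in for the airline data.
data = pd.DataFrame({
    "UniqueCarrier": ["AA", "UA", "AA", "DL"],
    "Origin": ["SFO", "JFK", "ORD", "SFO"],
    "Dest": ["JFK", "SFO", "SFO", "ORD"],
})

# One fitted encoder per column, kept in a dict (hypothetical name)
# so each mapping can be reused later.
encoders = {}
for col in ["UniqueCarrier", "Origin", "Dest"]:
    enc = LabelEncoder()
    data[col] = enc.fit_transform(data[col])
    encoders[col] = enc

# At prediction time, encode an incoming string with the matching encoder.
origin_code = encoders["Origin"].transform(["SFO"])[0]
print(origin_code)
```

A value that was never seen during fitting makes `transform` raise a ValueError, so a deployed endpoint would probably need to handle that case explicitly.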
Splitting the data into predictors and target:
rfDataOriginal = pd.DataFrame(data)
Delay_YesNo = rfDataOriginal['IsDepDelayed']
rfDataOriginal.drop(['IsDepDelayed'], axis=1, inplace=True)
After dropping the target variable:
print('Dimension reduced to:')
print(len(rfDataOriginal.columns))
Feature scaling:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
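To make the scaling step concrete: `fit_transform` learns a per-column mean and standard deviation from the training rows, and `transform` reuses those stored statistics on any other data. The sketch below uses a made-up two-column matrix rather than the airline features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up training and test rows (two columns only, for readability).
X_train_raw = np.array([[1987.0, 100.0], [1989.0, 300.0], [1991.0, 500.0]])
X_test_raw = np.array([[1989.0, 300.0]])

sc = StandardScaler()
X_train = sc.fit_transform(X_train_raw)  # learns per-column mean and std
X_test = sc.transform(X_test_raw)        # reuses the training statistics

print(sc.mean_)   # per-column means learned from the training rows
print(X_test)     # test row expressed in training-set standard deviations
```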
Creating the model:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(15, input_dim=12, activation='relu'))
model.add(Dense(15, activation='relu'))
model.add(Dense(15, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
Compiling the model:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
Logging for the TensorBoard graph:
import keras
tbCallBack = keras.callbacks.TensorBoard(log_dir='/tmp/keras_logs', write_graph=True)
Fitting the model:
model.fit(X_train, y_train, epochs=5, batch_size=30, verbose=1, callbacks=[tbCallBack])
Creating the confusion matrix:
from sklearn.metrics import confusion_matrix,accuracy_score
cm = confusion_matrix(y_test, Y_pred)
print("\nConfusion Matrix:")
print(cm)
acs = accuracy_score(y_test, Y_pred)
print("\nAccuracy Score: %.2f%%" % (acs * 100))
Confusion Matrix:
[[41614 322]
[ 5664 35894]]
Accuracy Score: 92.83%
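The reported accuracy can be reproduced by hand from the four cells of the matrix. The sketch below does that arithmetic and also derives recall and precision for class 1; it assumes scikit-learn's convention (rows are true classes, columns are predicted classes) and that class 1 means "delayed", which the post does not state explicitly.

```python
# Cells copied from the confusion matrix output above.
# Rows are true classes, columns are predicted classes.
tn, fp = 41614, 322
fn, tp = 5664, 35894

total = tn + fp + fn + tp
accuracy = (tn + tp) / total

# Assumption: class 1 is "delayed".
recall_delayed = tp / (tp + fn)     # share of class-1 rows predicted as 1
precision_delayed = tp / (tp + fp)  # share of 1-predictions that were correct

print(f"accuracy:  {accuracy:.2%}")
print(f"recall:    {recall_delayed:.2%}")
print(f"precision: {precision_delayed:.2%}")
```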
Testing prediction by passing parameters:
inputFeature = [1989, 9, 14, 1719, 1720, 1845, 1859, 11, 927, 58, 68, 997]
inputFeature = np.asarray(inputFeature).reshape(1, 12)
model.predict(inputFeature)
Output: array([[ 1.]], dtype=float32)
inputFeature = [1989, 11, 24, 1144, 1144, 1633, 1635, 0, 816, 213, 59, 1205]
inputFeature = np.asarray(inputFeature).reshape(1, 12)
model.predict(inputFeature)
Output: array([[ 1.]], dtype=float32)
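One thing that stands out in the code above: the network is fit on X_train after `sc.fit_transform`, but inputFeature is passed to `model.predict` as raw values (1989, 1719, and so on). Unscaled inputs that large can push the final sigmoid into saturation, which would be consistent with always seeing 1. Here is a minimal sketch of the likely fix, assuming the fitted scaler `sc` is still available at prediction time. The training matrix is random stand-in data, and the Keras call is left as a comment because the trained model is not reproduced here.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in training matrix with 12 columns on roughly the same raw
# scales as the question's features (values are made up).
rng = np.random.default_rng(0)
X_train_raw = rng.uniform(low=0, high=2000, size=(1000, 12))

sc = StandardScaler()
X_train = sc.fit_transform(X_train_raw)  # what the network would be trained on

# Raw request payload, as in the question.
input_feature = np.asarray(
    [1989, 9, 14, 1719, 1720, 1845, 1859, 11, 927, 58, 68, 997],
    dtype=float,
).reshape(1, 12)

# Apply the SAME fitted scaler (transform, not fit_transform) before predicting.
input_scaled = sc.transform(input_feature)

print("raw max abs:   ", np.abs(input_feature).max())
print("scaled max abs:", np.abs(input_scaled).max())

# With the real model this would be: model.predict(input_scaled)
```

In a Flask handler the scaler would have to be loaded alongside the model, for example after saving it with `pickle` at training time, so that requests are transformed with the training statistics rather than a freshly fitted scaler.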