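I am working with the Wisconsin breast cancer dataset and processing it with an ANN built in the Keras library.
I have added the code and part of the data below; I hope they are readable and understandable.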
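A row from the dataset:
1000025,5,1,1,1,2,1,3,1,1,2
Graph and confusion matrix plot:
Test loss: 0.05948319600096771 - Test accuracy: 0.9809523820877075
As shown, the confusion matrix looks fine.
Predictions:
After importing the libraries, this is the main part of the code: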
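# Imports used by the code below
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout

# The file has no header row, so pandas uses the first record as the column names
data = pd.read_csv('breast-cancer-wisconsin.data')
# Drop the sample ID column, which is named '1000025' after that first record
data_new = data.drop(['1000025'], axis=1)
# Features: the first 8 of the 9 attribute columns (0:8 leaves out index 8)
X = data_new.iloc[:, 0:8].values
# Target: the class column
Y = data_new.iloc[:, 9].values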
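# Encode the class labels (2 = benign, 4 = malignant) as 0 and 1
labelencoder_ = LabelEncoder()
Y = labelencoder_.fit_transform(Y)
# Hold out 15% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.15, random_state=0)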
# Feature scaling: fit on the training set only, then reuse the same transform on the test set
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
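# Fully connected network: 8 inputs -> 8 -> 32 -> 16 -> 1, with 10% dropout after each hidden layer
classifier = Sequential()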
classifier.add(Dense(8, input_dim=8))
classifier.add(Activation("relu"))
classifier.add(Dropout(0.1))
classifier.add(Dense(32))
classifier.add(Activation("relu"))
classifier.add(Dropout(0.1))
classifier.add(Dense(16))
classifier.add(Activation("relu"))
classifier.add(Dropout(0.1))
classifier.add(Dense(1))
classifier.add(Activation("sigmoid"))
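# Binary cross-entropy pairs with the single sigmoid output unit
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# validation_split=0.11 holds out the last 11% of the training data for validation
history = classifier.fit(X_train, y_train, epochs=100, batch_size=10, validation_split=0.11)
# Evaluate on the held-out test set
test_loss, test_acc = classifier.evaluate(X_test, y_test)
print('\nTest Loss:', test_loss)
print('Test Accuracy:', test_acc)
# predict returns one sigmoid probability per test sample, not a 0/1 class label
y_pred = classifier.predict(X_test)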
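
Since the last layer is a single sigmoid unit, `classifier.predict` gives probabilities between 0 and 1 rather than class labels, so they usually need to be thresholded before building a confusion matrix. Below is a minimal sketch of that step in plain Python. The helper names `to_labels` and `confusion_counts`, the 0.5 threshold, and the example values are assumptions made for illustration; they are not taken from the post or from Keras. With NumPy arrays, `(y_pred > 0.5).astype(int).ravel()` should give the same labels.

```python
def to_labels(probabilities, threshold=0.5):
    """Map sigmoid outputs to 0/1 labels (hypothetical helper).

    Accepts flat values or one-element rows, since a Dense(1) output
    is returned with shape (n_samples, 1).
    """
    labels = []
    for p in probabilities:
        try:
            value = p[0]  # unwrap a one-element row
        except (TypeError, IndexError):
            value = p     # already a scalar
        labels.append(1 if value > threshold else 0)
    return labels


def confusion_counts(y_true, y_pred):
    """Count outcomes as [[TN, FP], [FN, TP]] for 0/1 labels (hypothetical helper).

    Rows are the true class and columns the predicted class.
    """
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Made-up values for illustration only, not output from the model above
example_probs = [[0.02], [0.91], [0.47], [0.66]]
example_truth = [0, 1, 1, 0]

example_labels = to_labels(example_probs)
print(example_labels)
print(confusion_counts(example_truth, example_labels))
```

In the post's setting, the same idea would be applied to `y_pred` and `y_test`. The threshold is a design choice: 0.5 is the usual default, but a lower value may be preferable if missing a malignant case is costlier than a false alarm.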