我面临以下问题,我首先要为您提供代码,然后详细解释:
#Just try to implement the modular
from keras.models import Sequential
from keras.layers import Convolution1D, MaxPooling1D
from keras.layers import Dense, Dropout, Activation, Flatten, BatchNormalization
from keras.optimizers import SGD
import numpy
from numpy import newaxis
dataset = numpy.loadtxt("example.csv", delimiter = ",")
X = dataset[:, 0:200]
Y = dataset[:, 200]
s1 = X.shape[0]
s2 = X.shape[1]
newshape = (s1, s2, 1)
X = numpy.reshape(X, newshape)
#print(X.shape[2])
model = Sequential()
model.add(Convolution1D(16, 3, border_mode = "same", input_shape = (200, 1)))
#model.add(Dense(12, input_dim=200, init='uniform', activation='relu'))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling1D(pool_length = 2))
model.add(Convolution1D(32, 3, border_mode = "same"))
model.add(Convolution1D(32, 3, border_mode = "same"))
model.add(Activation('relu'))
model.add(MaxPooling1D(pool_length = 2))
model.add(Convolution1D(32, 3, border_mode = "same", activation = 'tanh'))
model.add(Convolution1D(32, 3, border_mode = "same", activation = 'tanh'))
model.add(Flatten())
model.add(BatchNormalization())
model.add(Dense(100, activation = 'tanh'))
model.add(Dropout(0.2))
model.add(Dense(50, activation = 'tanh'))
model.add(Dropout(0.2))
model.add(Dense(20, activation = 'tanh'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))
print("here1")
sgd = SGD(lr=0.1, decay=0.001, momentum=0.9, nesterov=True)
model.compile(loss = "binary_crossentropy", optimizer = sgd, metrics = ['accuracy'] )
print('here2')
model.fit(X, Y, batch_size = 64, nb_epoch = 1)
#print("here3")
#scores = model.evaluate(X, Y)
score = model.evaluate(X, Y, verbose = 0)
print(score)
output = model.predict(X, batch_size = 20, verbose = 0)
print(output[0:100])
#print("%s: %.2f%%" % (model.metrics_names[1], score[1]*100))
#scores = model.evaluate(X, Y)
我正在做的如下:作为输入(X),我提供网络DNA代码(编码为数字),标签(Y)是二进制(0或1)。我想预测Y.当我运行模型时,它表现得非常奇怪,至少在某种我无法理解的方式:
现在来看,这是我的问题:关于预测的标签输出(线的结果 打印(输出[0:100]) ) 模型总是预测为0.然而,如上所述,准确度似乎非常高。这是为什么?请注意,数据集是平衡的,这意味着一半的观测值标记为1,其中一半标记为0.因此,将所有值预测为0应该会导致准确率为0.5。
编辑:
当我被要求提供数据时,这里有一个截图。每行的最后一个数字是标签。
答案 0 :(得分:0)
也许您的数据没有正确缩放。作为调试步骤,您可以在最后一层使用 linear 激活功能并查看结果。