Question

我正在为Alexnet＆＃39;运行代码在Keras（Theano后端）：

model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])#, AUCEpoch,PrecisionEpoch,RecallEpoch,F1Epoch])
print(X_train.shape)
print(model.summary())

debug result:
(268, 3, 227, 227)
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 3, 227, 227)   0                                            
____________________________________________________________________________________________________
conv_1 (Convolution2D)           (None, -1, 55, 96)    2636928     input_1[0][0]                    
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, -1, 27, 96)    0           conv_1[0][0]                     
____________________________________________________________________________________________________
convpool_1 (Lambda)              (None, -1, 27, 96)    0           maxpooling2d_1[0][0]             
____________________________________________________________________________________________________
zeropadding2d_1 (ZeroPadding2D)  (None, 3, 31, 96)     0           convpool_1[0][0]                 
____________________________________________________________________________________________________
lambda_1 (Lambda)                (None, 1, 31, 96)     0           zeropadding2d_1[0][0]            
____________________________________________________________________________________________________
lambda_2 (Lambda)                (None, 1, 31, 96)     0           zeropadding2d_1[0][0]            
____________________________________________________________________________________________________
conv_2_1 (Convolution2D)         (None, -3, 27, 128)   307328      lambda_1[0][0]                   
____________________________________________________________________________________________________
conv_2_2 (Convolution2D)         (None, -3, 27, 128)   307328      lambda_2[0][0]                   
____________________________________________________________________________________________________
conv_2 (Merge)                   (None, -6, 27, 128)   0           conv_2_1[0][0]                   
                                                                   conv_2_2[0][0]                   
______________________________________________________________________________.......................
.......................
______________________________________________________________________________                      
mil_1 (Convolution2D)            (None, -5, 6, 128)    16512       convpool_5[0][0]                 
____________________________________________________________________________________________________
mil_2 (Convolution2D)            (None, -5, 6, 128)    16512       mil_1[0][0]                      
____________________________________________________________________________________________________
mil_3 (Convolution2D)            (None, -5, 6, 2)      258         mil_2[0][0]                      
____________________________________________________________________________________________________
softmax (Softmax4D)              (None, -5, 6, 2)      0           mil_3[0][0]                      
____________________________________________________________________________________________________
output (MaxPooling2D)            (None, -1, 1, 2)      0           softmax[0][0]                    
____________________________________________________________________________________________________
flatten (Flatten)                (None, -2)            0           output[0][0]                     
____________________________________________________________________________________________________
Recalcmil (Recalc)               (None, -2)            0           flatten[0][0]                    
====================================================================================================
Total params: 5,497,730
Trainable params: 5,497,730
Non-trainable params: 0
____________________________________________________________________________________________________
None

ValueError: Error when checking model target: expected Recalcmil to have shape (None, -2) but got array with shape (268, 2)

我真的对natwork的输出形状感到困惑。我不明白是什么导致形状的负数。

以下是该模型的源代码：

np.random.seed(1)
#srng = RandomStreams(1)
fold = 2 # 4
valfold = 4
lr = 5e-5
nb_epoch = 500
batch_size = 80
l2factor = 1e-5
l1factor = 0#2e-7
weighted = False #True
noises = 50
#data_augmentation = True
data_augmentation = False

modelname = 'alexnet' # miccai16, alexnet, levynet, googlenet
#pretrain = True
pretrain = False

mil=True
savename = modelname+'_fd'+str(fold)+'_vf'+str(valfold)+'_lr'+str(lr)+'_l2'+str(l2factor)+'_l1'\
+str(l1factor)+'_ep'+str(nb_epoch)+'_bs'+str(batch_size)+'_w'+str(weighted)+'_dr'+str(False)+str(noises)+str(pretrain)+'_mil'+str(mil)
print(savename)
nb_classes = 2
# input image dimensions
img_rows, img_cols = 227, 227
# the CIFAR10 images are RGB
img_channels = 1
# the data, shuffled and split between train and test sets
trX, y_train, teX, y_test, teteX, y_test_test = inbreast.loaddataenhance(fold, 5, valfold=valfold)
trY = y_train.reshape((y_train.shape[0],1))
teY = y_test.reshape((y_test.shape[0],1))
teteY = y_test_test.reshape((y_test_test.shape[0],1))
print('tr, val, te pos num and shape')
print(trY.sum(), teY.sum(), teteY.sum(), trY.shape[0], teY.shape[0], teteY.shape[0])
ratio = trY.sum()*1./trY.shape[0]*1.
print('tr ratio'+str(ratio))
weights = np.array((ratio, 1-ratio))
# trYori = np.concatenate((1-trY, trY), axis=1)
# teY = np.concatenate((1-teY, teY), axis=1)
# teteY = np.concatenate((1-teteY, teteY), axis=1)

X_train = trX.reshape(-1, img_channels, img_rows, img_cols)
X_test = teX.reshape(-1, img_channels, img_rows, img_cols)
X_test_test = teteX.reshape(-1, img_channels, img_rows, img_cols)


# X_train = trX.reshape(-1, img_rows, img_cols, img_channels)
# X_test = teX.reshape(-1, img_rows, img_cols, img_channels
# X_test_test = teteX.reshape(-1, img_rows, img_cols, img_channels)


print('tr, val, te mean, std')
print(X_train.mean(), X_test.mean(), X_test_test.mean())
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
Y_test_test = np_utils.to_categorical(y_test_test, nb_classes)
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'val samples')
print(X_test_test.shape[0], 'test samples')
model = Sequential()
if modelname == 'alexnet':
  X_train_extend = np.zeros((X_train.shape[0],3, 227, 227))
  for i in xrange(X_train.shape[0]):
    rex = np.resize(X_train[i,:,:,:], (227, 227))
    X_train_extend[i,0,:,:] = rex
    X_train_extend[i,1,:,:] = rex
    X_train_extend[i,2,:,:] = rex
  X_train = X_train_extend
  X_test_extend = np.zeros((X_test.shape[0], 3,227, 227))
  for i in xrange(X_test.shape[0]):
    rex = np.resize(X_test[i,:,:,:], (227, 227))
    X_test_extend[i,0,:,:] = rex
    X_test_extend[i,1,:,:] = rex
    X_test_extend[i,2,:,:] = rex
  X_test = X_test_extend
  X_test_test_extend = np.zeros((X_test_test.shape[0], 3, 227, 227))
  for i in xrange(X_test_test.shape[0]):
    rex = np.resize(X_test_test[i,:,:,:], (227,227))
    X_test_test_extend[i,0,:,:] = rex
    X_test_test_extend[i,1,:,:] = rex
    X_test_test_extend[i,2,:,:] = rex
  X_test_test = X_test_test_extend
  if pretrain:  # 227*227
    alexmodel = convnet('alexnet', weights_path='alexnet_weights.h5', heatmap=False, l1=l1factor, l2=l2factor)
    model = convnet('alexnet', outdim=2, l1=l1factor, l2=l2factor, usemil=mil)
    for layer, mylayer in zip(alexmodel.layers, model.layers):
      print(layer.name)
      if mylayer.name == 'mil_1':
        break
      else:
        weightsval = layer.get_weights()
        print(len(weightsval))
        mylayer.set_weights(weightsval)
  else:
    model = convnet('alexnet', outdim=2, l1=l1factor,l2=l2factor, usemil=mil)

# let's train the model using SGD + momentum (how original).
sgd = Adam(lr=lr) #SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])#, AUCEpoch,PrecisionEpoch,RecallEpoch,F1Epoch])
print(X_train.shape)
print(model.summary())
#filepath = savename+'-{epoch:02d}-{val_loss:.2f}-{val_acc:.2f}.hdf5' #-{val_auc:.2f}-\
#{val_prec:.2f}-{val_reca:.2f}-{val_f1:.2f}.hdf5'
#checkpoint0 = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, mode='max')
#checkpoint1 = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
checkpoint0 = LossEpoch(savename, validation_data=(X_test, Y_test), interval=1)
checkpoint1 = ACCEpoch(savename, validation_data=(X_test, Y_test), interval=1)
checkpoint2 = AUCEpoch(savename, validation_data=(X_test, Y_test), interval=1)
checkpoint3 = PrecisionEpoch(savename, validation_data=(X_test, Y_test), interval=1)
checkpoint4 = RecallEpoch(savename, validation_data=(X_test, Y_test), interval=1)
checkpoint5 = F1Epoch(savename, validation_data=(X_test, Y_test), interval=1)
#checkpoint2 = ModelCheckpoint(filepath, monitor='val_auc', verbose=1, save_best_only=True, mode='max')
#checkpoint3 = ModelCheckpoint(filepath, monitor='val_prec', verbose=1, save_best_only=True, mode='max')
#checkpoint4 = ModelCheckpoint(filepath, monitor='val_reca', verbose=1, save_best_only=True, mode='max')
#checkpoint5 = ModelCheckpoint(filepath, monitor='val_f1', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint0, checkpoint1, checkpoint2, checkpoint3, checkpoint4, checkpoint5]
#callbacks_list = [AUCEpoch, PrecisionEpoch, RecallEpoch, F1Epoch, checkpoint0, checkpoint1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
#X_train /= 255
#X_test /= 255

if not data_augmentation:
  print('Not using data augmentation.')
  model.fit(X_train, Y_train,
              batch_size=batch_size,
              nb_epoch=nb_epoch,
              validation_data=(X_test, Y_test),
              shuffle=True)
else:
  print('Using real-time data augmentation.')
  # this will do preprocessing and realtime data augmentation
  datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=45.0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False,
        zerosquare=True,
        zerosquareh=noises,
        zerosquarew=noises,
        zerosquareintern=0.0)  # randomly flip images
  # compute quantities required for featurewise normalization
  # (std, mean, and principal components if ZCA whitening is applied)
  datagen.fit(X_train)
  # fit the model on the batches generated by datagen.flow()

  if weighted:
    model.fit_generator(datagen.flow(X_train, Y_train,
                        batch_size=batch_size),
                        samples_per_epoch=X_train.shape[0],
                        nb_epoch=nb_epoch,
                        validation_data=(X_test, Y_test),
                        callbacks=callbacks_list,
                        class_weight=[weights[0], weights[1]])
  else:
    print('X_train shape:', X_train.shape)
    print('Y_train shape:', Y_train.shape)

    print('X_test shape:', X_test.shape)
    print('Y_test shape:', Y_test.shape)
    print(X_train.shape[0], 'train samples')
    print(X_test.shape[0], 'val samples')
    print(X_test_test.shape[0], 'test samples')
    model.fit_generator(datagen.flow(X_train, Y_train,
                        batch_size=batch_size),
                        samples_per_epoch=X_train.shape[0],
                        nb_epoch=nb_epoch,
                        validation_data=(X_test, Y_test),
                        callbacks=callbacks_list)

我认为我输入的格式正确（无，3,227,227），但输出错误。

这是＆＃39; alexnet＆＃39;：

def AlexNet(outdim=1000, weights_path=None, heatmap=False, l1=0, l2=0, usemil=False, usemymil=False, k=1., usemysoftmil=False, softmink=1., softmaxk=1.,\
    sparsemil=False, sparsemill1=0., sparsemill2=0., saveact=False):
    l1factor = l1
    l2factor = l2
    if heatmap:
        inputs = Input(shape=(3,None,None))
    else:
        inputs = Input(shape=(3,227,227))

    conv_1 = Convolution2D(96, 11, 11,subsample=(4,4),activation='relu', W_regularizer=l1l2(l1=l1factor, l2=l2factor),
                           name='conv_1')(inputs)

    conv_2 = MaxPooling2D((3, 3), strides=(2,2))(conv_1)
    conv_2 = crosschannelnormalization(name="convpool_1")(conv_2)
    conv_2 = ZeroPadding2D((2,2))(conv_2)
    conv_2 = merge([
        Convolution2D(128,5,5,activation="relu",name='conv_2_'+str(i+1), W_regularizer=l1l2(l1=l1factor, l2=l2factor))(
            splittensor(ratio_split=2,id_split=i)(conv_2)
        ) for i in range(2)], mode='concat',concat_axis=1,name="conv_2")

    conv_3 = MaxPooling2D((3, 3), strides=(2, 2))(conv_2)
    conv_3 = crosschannelnormalization()(conv_3)
    conv_3 = ZeroPadding2D((1,1))(conv_3)
    conv_3 = Convolution2D(384,3,3,activation='relu',name='conv_3', W_regularizer=l1l2(l1=l1factor, l2=l2factor))(conv_3)

    conv_4 = ZeroPadding2D((1,1))(conv_3)
    conv_4 = merge([
        Convolution2D(192,3,3,activation="relu",name='conv_4_'+str(i+1), W_regularizer=l1l2(l1=l1factor, l2=l2factor))(
            splittensor(ratio_split=2,id_split=i)(conv_4)
        ) for i in range(2)], mode='concat',concat_axis=1,name="conv_4")

    conv_5 = ZeroPadding2D((1,1))(conv_4)
    conv_5 = merge([
        Convolution2D(128,3,3,activation="relu",name='conv_5_'+str(i+1), W_regularizer=l1l2(l1=l1factor, l2=l2factor))(
            splittensor(ratio_split=2,id_split=i)(conv_5)
        ) for i in range(2)], mode='concat',concat_axis=1,name="conv_5")

    dense_1 = MaxPooling2D((3, 3), strides=(2,2),name="convpool_5")(conv_5)

    if heatmap:
        dense_1 = Convolution2D(4096,6,6,activation="relu",name="dense_1",W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_2 = Convolution2D(4096,1,1,activation="relu",name="dense_2",W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_3 = Convolution2D(outdim, 1,1,name="dense_3",W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_2)
        prediction = Softmax4D(axis=1,name="softmax")(dense_3)
    elif usemil:
        dense_1 = Convolution2D(128,1,1,activation='relu',name='mil_1',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_2 = Convolution2D(128,1,1,activation='relu',name='mil_2',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_3 = Convolution2D(outdim,1,1,name='mil_3',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_2)
        prediction_1 = Softmax4D(axis=1, name='softmax')(dense_3)
        #prediction = Flatten(name='flatten')(prediction_1)
        #dense_3 = Dense(outdim,name='dense_3',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(prediction)
        #prediction = Activation("softmax",name="softmax2")(dense_3)

        prediction_1 = MaxPooling2D((6,6), name='output')(prediction_1)
        prediction = Flatten(name='flatten')(prediction_1)
        prediction = Recalc(axis=1, name='Recalcmil')(prediction)
    elif usemymil:
        dense_1 = Convolution2D(128,1,1,activation='relu',name='mil_1',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_2 = Convolution2D(128,1,1,activation='relu',name='mil_2',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_3 = Convolution2D(1,1,1,activation='sigmoid',name='mil_3',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_2)
        #prediction_1 = Softmax4D(axis=1, name='softmax')(dense_3)
        #prediction = ExtractDim(axis=1, name='extract')(prediction_1)
        prediction = Flatten(name='flatten')(dense_3)
        prediction = ReRank(k=k, label=1, name='output')(prediction)
    elif usemysoftmil:
        dense_1 = Convolution2D(128,1,1,activation='relu',name='mil_1',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_2 = Convolution2D(128,1,1,activation='relu',name='mil_2',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_3 = Convolution2D(1,1,1,activation='sigmoid',name='mil_3',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_2)
        #prediction_1 = Softmax4D(axis=1, name='softmax')(dense_3)
        #prediction = ExtractDim(axis=1, name='extract')(prediction_1)
        prediction = Flatten(name='flatten')(dense_3)
        prediction = SoftReRank(softmink=softmink, softmaxk=softmaxk, label=1, name='output')(prediction)
    elif sparsemil:
        dense_1 = Convolution2D(128,1,1,activation='relu',name='mil_1',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_2 = Convolution2D(128,1,1,activation='relu',name='mil_2',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        prediction_1 = Convolution2D(1,1,1,activation='sigmoid',name='mil_3',W_regularizer=l1l2(l1=l1factor, l2=l2factor),\
            activity_regularizer=activity_l1l2(l1=sparsemill1, l2=sparsemill2))(dense_2)
#        prediction_1 = Softmax4D(axis=1, name='softmax')(prediction_1)
        #dense_3 = Convolution2D(outdim,1,1,name='mil_3',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_2)
        #prediction_1 = Softmax4D(axis=1, name='softmax')(dense_3)
        #prediction_1 = ActivityRegularizerOneDim(l1=sparsemill1, l2=sparsemill2)(prediction_1)
        #prediction = MaxPooling2D((6,6), name='output')(prediction_1)
#        prediction_1 = Convolution2D(1, 3, 3, activation='sigmoid', border_mode='same', name='smooth', \
#            W_regularizer=l1l2(l1=l1factor, l2=l2factor), activity_regularizer=activity_l1l2(l1=sparsemill1, l2=sparsemill2))(prediction_1)
        prediction = Flatten(name='flatten')(prediction_1)
        if saveact:
          model = Model(input=inputs, output=prediction)
          return model
        prediction = RecalcExpand(axis=1, name='Recalcmil')(prediction)
    else:
        dense_1 = Flatten(name="flatten")(dense_1)
        dense_1 = Dense(4096, activation='relu',name='dense_1',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_1)
        dense_2 = Dropout(0.5)(dense_1)
        dense_2 = Dense(4096, activation='relu',name='dense_2',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_2)
        dense_3 = Dropout(0.5)(dense_2)
        dense_3 = Dense(outdim,name='dense_3',W_regularizer=l1l2(l1=l1factor, l2=l2factor))(dense_3)
        prediction = Activation("softmax",name="softmax")(dense_3)

    model = Model(input=inputs, output=prediction)

    if weights_path:
        model.load_weights(weights_path)

    return model

Answer 1

这是因为输入形状的格式与当前配置的通道排序不匹配。似乎您正在使用TensorFlow后端，默认使用＆＃34;频道最后＆＃34;，而您的输入形状采用格式＆＃34;频道优先＆＃34;。

只需将输入形状更改为（227,227,3）和相应的数据。

Keras module.summary（）结果 - 输出形状中的负维度值＆＃39;

1 个答案: