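Keras: ValueError when transferring pre-trained model weights

Time: 2017-12-19 11:15:29

Tags: tensorflow deep-learning keras data-science keras-layer

I have pre-trained a model and saved it. Now I want to use it for training and testing on another dataset, a kind of transfer learning.

My model architecture is shown below.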

 model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_1 (Conv1D)            (None, 298, 32)           128       
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 149, 32)           0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 4768)              0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 4768)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               610432    
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516       
=================================================================
Total params: 611,076
Trainable params: 611,076
Non-trainable params: 0
_________________________________________________________________

Now, when I load this model to train and test another model, I get an error whose cause I am not sure about. My program is given below.

# -*- coding: utf-8 -*-
"""Created on Mon Dec 18 23:18:36 2017
@author: Md Junayed Hasan
"""

# PART 1 - Data preprocessing
import numpy as np 
import pandas as pd 

# importing the dataset
# Description: Dataset has row : 5969 and Column : 301
dataset = pd.read_csv('dataset.csv')
# Independent Variables
# x takes every column except the last (columns 0 to 299)
x = dataset.iloc[:, :-1].values
# Dependent Variables
y = dataset.iloc[:, 300:301].values 


# Encoding categorical data
from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()
y[:, 0] = labelencoder.fit_transform(y[:, 0])
classes = list(labelencoder.classes_)


# split train data into train and validation
from sklearn.cross_validation import train_test_split
x_train,x_valid, y_train, y_valid = train_test_split(x,y, test_size=0.8, random_state=23) 


# Feature Scaling / Normalization / Standardization
from sklearn.preprocessing import StandardScaler
sc_x = StandardScaler()
x_train = sc_x.fit_transform(x_train)
x_valid = sc_x.fit_transform(x_valid)


# Initializers
nb_filters = 32 # number of feature detector
nb_kernel_size = 3
nb_strides_conv = 1
nb_features = len(np.transpose(x_train)) # sample size
nb_total_samples = len(x_train) # sample number
nb_pool_size = 2
nb_strides_pool = 2
nb_dropout = 0.1
nb_labels = len(classes)
nb_epochs = 2 # for example
nb_batch_size = 15
nb_channels = 1


# 1 D CNN converting DATA into 3D tensor is must

x_train_r=  np.reshape(x_train,(len(x_train),nb_features,nb_channels))
x_valid_r=  np.reshape(x_valid,(len(x_valid),nb_features,nb_channels))


# CNN Func
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD 
from keras.models import load_model

def CNN_1D(nb_channels,nb_labels,x,y,x_valid_r,y_valid):
    model = Sequential()
    model.add(Dense(128,input_dim =4768, activation = 'relu')) 
    model.add(Dense(32, activation = 'relu' ))
    model.add(Dense(nb_labels, activation = 'softmax'))
    model.load_weights('Scenario1- WC1-2 - WEIGHTS.h5')

    sgd = SGD(lr=0.01, nesterov=True, decay=1e-6, momentum=0.9)

    model.compile(loss='sparse_categorical_crossentropy',optimizer=sgd,metrics=['accuracy'])
    

    y_pred = model.predict(x)
    y_pred = (y_pred > 0.5)
    

    from sklearn.metrics import confusion_matrix
    cm = confusion_matrix(y_valid, y_pred)
    
    print("FINISHED")
    pass

# PART 3 - 1D CNN Function Call
CNN_1D(nb_channels,nb_labels,x,y,x_valid_r,y_valid)

When I run it to get the output, I receive this error.

ValueError: Shapes must be equal rank, but are 2 and 3 for 'Assign' (op: 'Assign') with input shapes: [4768,128], [3,1,32].

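I have also described the network architecture and the dataset in detail. Please help me solve this.

1 answer:

Answer 0 (score: 1)

The two models you describe are completely different.

Your saved model has a convolutional layer, a pooling layer, and so on, but you are trying to load its weights into a model made only of dense layers.

That is simply not possible.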
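To make the mismatch concrete, here is a minimal sketch in plain Python (no Keras needed) that compares weight shapes layer by layer, which is roughly what happens when `load_weights` is called without `by_name`. The saved shapes are derived from the parameter counts in the summary and from the kernel shape `[3,1,32]` in the error message. The shapes of the new model assume `nb_labels == 4`, which the question does not state for the new dataset. The helper `first_mismatch` is a hypothetical name used only for illustration.

```python
# Weight shapes (kernel, bias) of each weighted layer in the saved model,
# derived from the summary and from the error message.
saved_shapes = [
    [(3, 1, 32), (32,)],     # conv1d_1: 3*1*32 + 32 = 128 params
    [(4768, 128), (128,)],   # dense_1: 4768*128 + 128 = 610432 params
    [(128, 4), (4,)],        # dense_2: 128*4 + 4 = 516 params
]

# Weight shapes of the Dense-only model built in CNN_1D.
# Assumption: nb_labels == 4.
new_shapes = [
    [(4768, 128), (128,)],   # Dense(128, input_dim=4768)
    [(128, 32), (32,)],      # Dense(32)
    [(32, 4), (4,)],         # Dense(nb_labels)
]


def first_mismatch(target, source):
    """Return (layer_index, target_shape, source_shape) for the first
    pair of weights whose shapes differ, or None if all shapes agree."""
    for i, (t_layer, s_layer) in enumerate(zip(target, source)):
        for t, s in zip(t_layer, s_layer):
            if t != s:
                return i, t, s
    return None


mismatch = first_mismatch(new_shapes, saved_shapes)
print(mismatch)  # (0, (4768, 128), (3, 1, 32))
```

The first pair already disagrees in rank (2 versus 3), which matches the shapes `[4768,128]` and `[3,1,32]` reported in the ValueError.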

The only kind of model that can receive these weights is:

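from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

# kernel_size, inputLength, inputChannels, validOrSame and `any` are placeholders:
# replace them with the values used when the original model was built.
model = Sequential()
model.add(Conv1D(32,
                 kernel_size,
                 input_shape=(inputLength, inputChannels),
                 activation=any,
                 padding=validOrSame))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(128, activation=any))
model.add(Dense(4, activation=any))

model.load_weights('Scenario1- WC1-2 - WEIGHTS.h5')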

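From the parameter counts in the summary, the following combinations are possible:

  • kernel_size=3; inputChannels=1
  • kernel_size=1; inputChannels=3

Only one of the options above will work.

And:

  • inputLength=300; validOrSame='valid'
  • inputLength=298; validOrSame='same'

Only you know which values your original model used for these variables.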
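The arithmetic behind these combinations can be checked with a short sketch. It assumes the standard Conv1D formulas (kernel weights plus one bias per filter, and stride 1 as in the question's `nb_strides_conv`); the function names are hypothetical helpers, not part of Keras.

```python
def conv1d_param_count(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def conv1d_output_length(input_length, kernel_size, padding, stride=1):
    """Number of output steps of a 1D convolution."""
    if padding == "valid":
        return (input_length - kernel_size) // stride + 1
    if padding == "same":
        return -(-input_length // stride)  # ceiling division
    raise ValueError("padding must be 'valid' or 'same'")


# Both (kernel_size, inputChannels) pairs give the 128 parameters of conv1d_1.
candidates = [(3, 1), (1, 3)]
param_counts = [conv1d_param_count(k, c, 32) for k, c in candidates]

# With kernel_size=3, both (inputLength, padding) pairs give 298 output steps.
lengths = [
    conv1d_output_length(300, 3, "valid"),
    conv1d_output_length(298, 3, "same"),
]

print(param_counts, lengths)  # [128, 128] [298, 298]
```

The kernel shape `[3,1,32]` in the error message suggests `kernel_size=3` and `inputChannels=1`, since Keras stores a Conv1D kernel as (kernel size, input channels, filters). That would also agree with `nb_kernel_size = 3`, `nb_channels = 1` and the 300 feature columns in the question's script, which points to `inputLength=300` with `'valid'` padding, though only the original training code can confirm it.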

Transferring part of the weights

If you create a model that is compatible with the weights and call it oldModel

oldModel.load_weights('Scenario1- WC1-2 - WEIGHTS.h5')

then create a new model in which the last Dense layer has 6 units instead of 4, and call it newModel

Once you have both models, you can transfer the weights:

for i in range(len(oldModel.layers)-1): #loop excluding the last layer
    newModel.layers[i].set_weights(oldModel.layers[i].get_weights())
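As a variation on the loop above (not the answer's exact method), the copy can be guarded by a shape check so that any layer whose weights do not fit is skipped rather than raising an error. The sketch below relies only on `get_weights()` and `set_weights()`, which Keras layers provide. `StubLayer`, `StubWeight`, `make_layers` and `transfer_matching_weights` are hypothetical names, and the stubs exist only so the example runs without Keras installed.

```python
from collections import namedtuple

# Hypothetical stand-in for a weight array: only the .shape attribute matters.
StubWeight = namedtuple("StubWeight", ["shape", "name"])


class StubLayer:
    """Hypothetical stand-in exposing the get_weights/set_weights interface."""

    def __init__(self, weights):
        self._weights = list(weights)

    def get_weights(self):
        return list(self._weights)

    def set_weights(self, weights):
        self._weights = list(weights)


def transfer_matching_weights(old_layers, new_layers):
    """Copy weights between layers at the same index when every weight
    shape matches. Returns the indices of the layers that were copied."""
    copied = []
    for i, (old, new) in enumerate(zip(old_layers, new_layers)):
        old_w, new_w = old.get_weights(), new.get_weights()
        if not old_w:
            continue  # pooling, flatten and dropout layers hold no weights
        if [w.shape for w in old_w] == [w.shape for w in new_w]:
            new.set_weights(old_w)
            copied.append(i)
    return copied


def make_layers(last_units, name):
    """Build stub layers with the weight shapes of the model in the summary."""
    return [
        StubLayer([StubWeight((3, 1, 32), name), StubWeight((32,), name)]),      # Conv1D
        StubLayer([]),                                                           # MaxPooling1D
        StubLayer([]),                                                           # Flatten
        StubLayer([StubWeight((4768, 128), name), StubWeight((128,), name)]),    # Dense(128)
        StubLayer([StubWeight((128, last_units), name),
                   StubWeight((last_units,), name)]),                            # last Dense
    ]


old_layers = make_layers(4, "old")
new_layers = make_layers(6, "new")
copied = transfer_matching_weights(old_layers, new_layers)
print(copied)  # [0, 3]
```

With real models the call would be `transfer_matching_weights(oldModel.layers, newModel.layers)`. The final Dense layer is skipped because its shapes differ (4 units versus 6), so it keeps its freshly initialized weights.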