Problems reproducing DeepCoder results with Keras

Time: 2018-07-22 22:03:07

Tags: tensorflow neural-network keras deep-learning multilabel-classification

I am trying to reproduce the results published by the DeepCoder project (see https://arxiv.org/abs/1611.01989), specifically its neural network component.

A brief overview:

DeepCoder's feed-forward neural network model is a multi-label, multi-class classifier: given a list of inputs and outputs of a black-box function, the model predicts a function vector.

For example, suppose the function set is [+ - * /] and the input/output sets are [0,1,2,3] -> [1,2,3,4]; then a correct prediction might be [1,0,0,0] (i.e., a prediction that the black-box function contains, but is not limited to, a +).
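To make the target format concrete, here is a small sketch (my own illustration, not code from the paper or the post) encoding that prediction as a multi-hot numpy array over the function set:

import numpy as np

# Hypothetical multi-hot attribute vector over the function set [+, -, *, /]:
# entry k is 1 if the black-box function is predicted to contain operator k.
function_set = ['+', '-', '*', '/']
attribute_vector = np.array([1, 0, 0, 0])  # "+" present for [0,1,2,3] -> [1,2,3,4]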

Here's a graph view of my implemented Neural Network

Details of the implemented model:

Given two (N, 1) input tensors (corresponding to the input and output samples, respectively), the NN splits each tensor row by row, passes every sub-tensor through an embedding layer, concatenates the input and output embeddings, and stacks the N composite embeddings into a single tensor. Note that, per the specification in the DeepCoder paper, N corresponds to the number of samples per black-box function. The tensor is then passed through 3 hidden layers of 256 units with ReLU/softmax activations (both were tested) and averaged into a one-dimensional array in the final layer.
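As a minimal numpy sketch of that final averaging step (my own illustration; N = 5 and n_attribs = 7 match the constants in the code below), the per-sample prediction rows are pooled into one attribute vector:

import numpy as np

per_sample_preds = np.random.rand(5, 7)  # one prediction row per I/O example
pooled = per_sample_preds.mean(axis=0)   # average over the N = 5 samples
print(pooled.shape)                      # (7,)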

When the NN is trained to classify samples as (x + x) or (x - x), the predictions always come out the same (usually whichever function was trained last). I also tested with (x / x), (x * x), cos(x), sin(x), and sqrt(x), with similarly wrong results.

Question: Can anyone with neural-network design experience spot any flaws in my NN layout, or any deviations from DeepCoder's specification?

Here is my code, written in Keras (with the TensorFlow backend):

import numpy as np
import pandas as pd
from keras import backend as K
from keras.models import Model
from keras.layers import (Input, Embedding, Dense, Concatenate,
                          Reshape, Lambda, Dropout)

hidden_size = 256        # units per hidden Dense layer
inputs_per_sample = 1    # inputs per I/O example
n_attribs = 7            # length of the predicted attribute vector
samples_per_iter = 5     # N: I/O examples per black-box function

# custom functions for lambdas
def split(x, i):
    # select row i of the first batch element
    return x[0, i]

def average(x):
    # mean over the leading (stacked-samples) axis
    ave = K.mean(x, axis=0, keepdims=True)
    return ave

def get_model(inputs_per_sample, embedding_dimension, samples_per_prog):
    inp_tens = Input(shape=(samples_per_prog, inputs_per_sample,), name="INPUT")
    out_tens = Input(shape=(samples_per_prog, 1,), name="OUTPUT")
    # formats are (None, spp, sps)... The None represents the number of samples that can be fed *incrementally* (i.e. isolated)

    inp_embedding = Embedding(50 + 1, embedding_dimension, input_length=inputs_per_sample, name='INPUT_EMBEDDER')
    out_embedding = Embedding(512 + 1, embedding_dimension, input_length=1, name='OUTPUT_EMBEDDER')
    l1 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_1')
    l2 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_2')
    l3 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_3')
    decode = Dense(n_attribs, activation='softmax', name="DECODER")
    combine = Concatenate(name='COMBINATOR')
    stack = Concatenate(axis=0, name='STACKER')
    reshape = Reshape((-1,), name='RESHAPER')

    output_tensor = []
    for i in range(samples_per_prog):
        input_tens_inter = Lambda(split, arguments={'i': i}, name='INP_SPLITTER_' + str(i))(inp_tens)
        outpt_tens_inter = Lambda(split, arguments={'i': i}, name='OUT_SPLITTER_' + str(i))(out_tens)
        tens_inter = combine([reshape(inp_embedding(input_tens_inter)), reshape(out_embedding(outpt_tens_inter))])
        if i == 0:
            output_tensor = tens_inter
        else:
            output_tensor = stack([tens_inter, output_tensor])

    l1_layers = Dropout(0.2, name='DROPOUT_1')(l1(output_tensor))
    l2_layers = Dropout(0.2, name='DROPOUT_2')(l2(l1_layers))
    l3_layers = Dropout(0.2, name='DROPOUT_3')(l3(l2_layers))

    decoder = decode(l3_layers)
    pooled_output = Lambda(average, name='AVERAGE_POOL', output_shape=(n_attribs,))(decoder)
    print(pooled_output.get_shape())

    model = Model(inputs=[inp_tens, out_tens], outputs=pooled_output)
    model.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model


model = get_model(inputs_per_sample, 20, samples_per_iter)
print(model.summary())
#plot_model(model, to_file='ml_graph.png', show_shapes=True)

# load the training data (inputs, outputs, attribute vectors) from CSV
inputs = pd.read_csv("G:\\s_is.csv", header=None)
outputs = pd.read_csv("G:\\s_os.csv", header=None)
avs = pd.read_csv("G:\\s_avs.csv",  header=None)

ins = inputs
out = outputs
attribs = avs

for j in range(len(attribs)):
    print("j=" + str(j))
    # gather the N I/O examples and the attribute vector for black-box function j
    _inps = np.array(ins[j * samples_per_iter:(j + 1) * samples_per_iter])
    _outs = np.array(out[j * samples_per_iter:(j + 1) * samples_per_iter])
    _inps = np.reshape(_inps, (1, samples_per_iter, 1))
    _outs = np.reshape(_outs, (1, samples_per_iter, 1))
    _atts = np.array([attribs.T.get(j)])

    model.fit(x=[_inps, _outs], y=_atts, batch_size=10, epochs=1000)

# saving the model
model_json = model.to_json()
with open("C:\\Users\\James-MSI\\Desktop\\model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("C:\\Users\\James-MSI\\Desktop\\modelweights.h5")
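
For completeness, a minimal sketch of loading the model back (standard Keras usage, not part of the original post; note the Lambda layers need the split and average functions importable at load time):

from keras.models import model_from_json

with open("C:\\Users\\James-MSI\\Desktop\\model.json", "r") as json_file:
    loaded = model_from_json(json_file.read())
loaded.load_weights("C:\\Users\\James-MSI\\Desktop\\modelweights.h5")
loaded.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])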

Thanks :)

James

0 answers:

There are no answers