I'm trying to reproduce the results published by the DeepCoder project (see https://arxiv.org/abs/1611.01989), specifically its neural network component.

A brief overview:

DeepCoder's feed-forward neural network model is a multi-label, multi-class classifier: given lists of inputs and outputs of a black-box function, the model predicts a vector of function attributes.

For example, suppose the function set is [+, -, *, /] and the input/output set is [0, 1, 2, 3] -> [1, 2, 3, 4]; a correct prediction would then be [1, 0, 0, 0] (i.e. a prediction that the black-box function contains, but is not limited to, a +).
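To make that target encoding concrete, here's how I build an attribute vector (the attribute ordering here is my own convention, not taken from the paper; it just matches the n_attribs = 7 functions I test below):

    import numpy as np

    # Hypothetical attribute ordering (my convention), matching n_attribs = 7
    ATTRIBUTES = ['+', '-', '*', '/', 'cos', 'sin', 'sqrt']

    def encode_attributes(funcs_used):
        # Multi-hot target: 1 wherever the black-box function uses an attribute
        target = np.zeros(len(ATTRIBUTES), dtype=np.float32)
        for f in funcs_used:
            target[ATTRIBUTES.index(f)] = 1.0
        return target

    print(encode_attributes(['+']))  # -> [1. 0. 0. 0. 0. 0. 0.]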
Here's a graph view of my implemented neural network:
A breakdown of the implemented model:

Given two (N, 1) input tensors (corresponding to the input and output samples respectively), the network splits each tensor row by row, passes each sub-tensor through an embedding layer, concatenates the input and output embeddings, and stacks all N composite embeddings into a single tensor. Note that, per the specification in the DeepCoder paper, N corresponds to the number of samples per black-box function. The stacked tensor is then passed through 3 hidden layers of 256 units (ReLU and softmax were both tested) and averaged into a one-dimensional array at the final layer.

When I train the network to classify samples as (x + x) or (x - x), it always predicts the same result (usually whichever function it was trained on last). I also tested with (x / x), (x * x), cos(x), sin(x) and sqrt(x), with similarly wrong results.
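To make the intended data flow concrete, here is the shape bookkeeping I expect at each stage (my own sketch, with N = samples_per_prog and E = embedding_dimension; the layer names refer to the code below):

    # INPUT / OUTPUT         (None, N, 1)  integer-coded input and output samples
    # INP/OUT_SPLITTER_i     (1,)          row i, taken from batch element 0 via x[0, i]
    # INPUT/OUTPUT_EMBEDDER  (1, E)        embedding lookup for that row
    # RESHAPER               (1, E)        flattened to one vector per row
    # COMBINATOR             (1, 2E)       input embedding ++ output embedding
    # STACKER                (N, 2E)       all N composite embeddings, stacked on axis 0
    # HIDDEN_LAYER_1..3      (N, 256)      ReLU layers, each followed by dropout
    # DECODER                (N, 7)        softmax over the n_attribs attributes
    # AVERAGE_POOL           (1, 7)        mean over the N rows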
My question: can someone with neural network design experience spot any flaws in my NN layout, or any deviations from DeepCoder's specification?

Here's my code, written in Keras (with the TensorFlow backend):
import numpy as np
import pandas as pd
from keras import backend as K
from keras.layers import (Concatenate, Dense, Dropout, Embedding, Input,
                          Lambda, Reshape)
from keras.models import Model

hidden_size = 256
inputs_per_sample = 1
n_attribs = 7
samples_per_iter = 5

# custom functions for lambdas
def split(x, i):
    # pick out sample row i (note: this also indexes batch element 0)
    return x[0, i]

def average(x):
    # average the per-sample predictions along axis 0
    ave = K.mean(x, axis=0, keepdims=True)
    return ave

def get_model(inputs_per_sample, embedding_dimension, samples_per_prog):
    inp_tens = Input(shape=(samples_per_prog, inputs_per_sample,), name="INPUT")
    out_tens = Input(shape=(samples_per_prog, 1,), name="OUTPUT")
    # formats are (None, spp, sps)... the None represents the number of samples
    # that can be fed *incrementally* (i.e. isolated)

    inp_embedding = Embedding(50 + 1, embedding_dimension, input_length=inputs_per_sample, name='INPUT_EMBEDDER')
    out_embedding = Embedding(512 + 1, embedding_dimension, input_length=1, name='OUTPUT_EMBEDDER')
    l1 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_1')
    l2 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_2')
    l3 = Dense(hidden_size, activation='relu', name='HIDDEN_LAYER_3')
    decode = Dense(n_attribs, activation='softmax', name="DECODER")
    combine = Concatenate(name='COMBINATOR')
    stack = Concatenate(axis=0, name='STACKER')
    reshape = Reshape((-1,), name='RESHAPER')

    output_tensor = []
    for i in range(samples_per_prog):
        # split off the i-th input/output rows and embed them
        input_tens_inter = Lambda(split, arguments={'i': i}, name='INP_SPLITTER_' + str(i))(inp_tens)
        outpt_tens_inter = Lambda(split, arguments={'i': i}, name='OUT_SPLITTER_' + str(i))(out_tens)
        # concatenate the flattened input and output embeddings for this sample
        tens_inter = combine([reshape(inp_embedding(input_tens_inter)), reshape(out_embedding(outpt_tens_inter))])
        if i == 0:
            output_tensor = tens_inter
        else:
            output_tensor = stack([tens_inter, output_tensor])

    l1_layers = Dropout(0.2, name='DROPOUT_1')(l1(output_tensor))
    l2_layers = Dropout(0.2, name='DROPOUT_2')(l2(l1_layers))
    l3_layers = Dropout(0.2, name='DROPOUT_3')(l3(l2_layers))
    decoder = decode(l3_layers)
    # average the per-sample attribute predictions into a single vector
    pooled_output = Lambda(average, name='AVERAGE_POOL', output_shape=(n_attribs,))(decoder)
    print(pooled_output.get_shape())

    model = Model(inputs=[inp_tens, out_tens], outputs=pooled_output)
    model.compile('adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = get_model(inputs_per_sample, 20, samples_per_iter)
print(model.summary())
# plot_model(model, to_file='ml_graph.png', show_shapes=True)

inputs = pd.read_csv("G:\\s_is.csv", header=None)
outputs = pd.read_csv("G:\\s_os.csv", header=None)
avs = pd.read_csv("G:\\s_avs.csv", header=None)
ins = inputs
out = outputs
attribs = avs

# train on one program at a time: samples_per_iter rows of I/O per program
for j in range(len(attribs)):
    print("j=" + str(j))
    _inps = np.array(ins[j * samples_per_iter:(j + 1) * samples_per_iter])
    _outs = np.array(out[j * samples_per_iter:(j + 1) * samples_per_iter])
    _inps = np.reshape(_inps, (1, samples_per_iter, 1))
    _outs = np.reshape(_outs, (1, samples_per_iter, 1))
    _atts = np.array([attribs.T.get(j)])
    model.fit(x=[_inps, _outs], y=_atts, batch_size=10, epochs=1000)

# saving the model
model_json = model.to_json()
with open("C:\\Users\\James-MSI\\Desktop\\model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("C:\\Users\\James-MSI\\Desktop\\modelweights.h5")
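In case anyone wants to run this without my CSV files, here is the synthetic smoke test I use to check shapes only (random integers within the embedding vocabulary sizes used above; not real training data):

    # Synthetic stand-in for the CSVs: shapes are realistic, values are random
    for j in range(4):  # a handful of fake "programs"
        _inps = np.random.randint(0, 50, size=(1, samples_per_iter, 1))
        _outs = np.random.randint(0, 512, size=(1, samples_per_iter, 1))
        # random multi-hot attribute vector with the same shape as a real target
        _atts = np.random.randint(0, 2, size=(1, n_attribs)).astype('float32')
        model.fit(x=[_inps, _outs], y=_atts, batch_size=1, epochs=2, verbose=0)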
Thanks :)

James