Building an RNN in TensorFlow for a dataset with concept drift

Time: 2018-02-08 07:19:33

Tags: tensorflow rnn

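I am building an RNN model to predict how malicious a piece of malware is. Each record in the dataset has the form [maliciousness (0-1), [features*480]]. The features are very sparse, and there are 18 months of data, sorted by month.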

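I am trying to feed in one data entry per month, so that each sequence consists of 18 entries in chronological order. I want the RNN to output the maliciousness for the most recent month (month 18) and to compute the loss against that value.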
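The layout described above can be sketched in NumPy. This is only an illustration under assumptions not stated in the question: the features are taken to arrive as a flat array with one row per month, with each block of 18 consecutive rows belonging to one sample. The names `flat_X`, `flat_y`, `X_seq`, `y_last` and the small sizes are hypothetical.

```python
import numpy as np

n_steps, n_inputs = 18, 480
n_samples, batch_size = 5, 2  # hypothetical sizes, kept small for illustration

rng = np.random.default_rng(0)
# Assumed flat layout: rows 0..17 are months 1..18 of sample 0,
# rows 18..35 are months 1..18 of sample 1, and so on.
flat_X = rng.random((n_samples * n_steps, n_inputs), dtype=np.float32)
flat_y = rng.random(n_samples * n_steps, dtype=np.float32)

# One sequence per sample: (samples, months, features)
X_seq = flat_X.reshape(n_samples, n_steps, n_inputs)
# One label per sample: the maliciousness of month 18
y_last = flat_y.reshape(n_samples, n_steps)[:, -1]

# Batching then slices both arrays with the same sample indices
batch_shapes = []
for start in range(0, n_samples, batch_size):
    X_batch = X_seq[start:start + batch_size]
    y_batch = y_last[start:start + batch_size]
    batch_shapes.append((X_batch.shape, y_batch.shape))

print(X_seq.shape, y_last.shape)  # (5, 18, 480) (5,)
print(batch_shapes)
```

Reshaping once up front means features and labels are indexed per sample, so a batch of `X` and its labels cannot drift out of alignment.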

Below is the code I am using, but I cannot get the shapes of the input and the tensors right.

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.contrib)

n_steps = 18
n_inputs = 480
n_neurons = 100
n_outputs = 1
n_epochs = 20

batch_size = 50
learning_rate = 0.01

test_Y = np.empty([1065])
train_Y = np.empty([1065])

for i in range(17, len(testY), 18):
    np.append(test_Y, testY[i])
    np.append(train_Y, trainY[i])

with tf.name_scope("Variable"):
    X = tf.placeholder(tf.float32, [None, n_steps, n_inputs], name = "X")
    y = tf.placeholder(tf.float32, [None], name = "Y")

    weights = tf.Variable(tf.random_normal([n_steps, n_outputs]))
    bias = tf.Variable(tf.random_normal([n_outputs]))

with tf.name_scope("RNN"):
    lstm_cell = tf.contrib.rnn.LSTMCell(num_units = n_neurons, use_peepholes = True)
    rnn_outputs, states = tf.nn.dynamic_rnn(lstm_cell, X, dtype=tf.float32)
    stacked_rnn_outputs = tf.reshape(rnn_outputs, [-1, n_neurons])
    stacked_outputs = tf.layers.dense(stacked_rnn_outputs, n_outputs,name = "reshape")
    outputs = tf.reshape(stacked_outputs, [-1, n_steps])
    out = tf.matmul(outputs, weights) + bias
    out = tf.unstack(out, axis = 1)

with tf.name_scope("cost"):
    loss = tf.reduce_mean(tf.abs(y-out))

with tf.name_scope('train'):
    optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate,name = "optimizer")
    training_op = optimizer.minimize(loss)

    init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    n = trainX.shape[0]
    for epoch in range(n_epochs):
        train_cost = 0
        for start, end in zip(range(0, n, batch_size*n_steps), range(batch_size*n_steps, n, batch_size*n_steps)):
            y_start = int(start/(batch_size*n_steps))
            y_end = int(y_start + batch_size)
            X_batch, y_batch = trainX[start:end], train_Y[y_start:y_end]
            X_batch = X_batch.reshape((-1, n_steps, n_inputs))
            _, l = sess.run([training_op,loss], feed_dict = {X: X_batch, y: y_batch})
            train_cost += l

        print(epoch, "Train cost:", train_cost/(n//batch_size))

The output of this RNN is:

    0 Train cost: nan
    1 Train cost: nan
    2 Train cost: nan
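The `nan` cost may be related to how the label arrays are built, although this is a guess from the code shown rather than a confirmed diagnosis. `np.append` returns a new array and leaves its argument unchanged, and `np.empty` allocates uninitialized memory, so a loop that discards the return value of `np.append` leaves the array holding arbitrary values. The sketch below demonstrates that behaviour on a small hypothetical `labels` array and shows a slicing alternative.

```python
import numpy as np

n_steps = 18
# Hypothetical per-month labels for two samples (36 rows), for illustration only
labels = np.arange(2 * n_steps, dtype=np.float32)

buf = np.zeros(2, dtype=np.float32)
returned = np.append(buf, labels[n_steps - 1])  # returns a NEW array
print(buf.shape, returned.shape)  # (2,) (3,)  -> buf itself is unchanged

# Taking every 18th label, starting at index 17, avoids the loop entirely
last_month = labels[n_steps - 1::n_steps]
print(last_month.tolist())  # [17.0, 35.0]
```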

Clearly the input is not being fed in correctly, but I don't know how to do it properly.
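One common way to get a single prediction per sequence is to keep only the RNN output at the final time step and pass it through a dense layer. The NumPy sketch below only traces the shapes involved: a random array stands in for the output of `tf.nn.dynamic_rnn`, and `W` and `b` are hypothetical dense-layer parameters. It is an alternative to the reshape and `matmul` chain in the question, not a confirmed fix.

```python
import numpy as np

batch_size, n_steps, n_neurons, n_outputs = 50, 18, 100, 1
rng = np.random.default_rng(0)

# Stand-in for the RNN output: one n_neurons vector per time step
rnn_outputs = rng.standard_normal((batch_size, n_steps, n_neurons))

# Keep only the output of the final step (month 18)
last_output = rnn_outputs[:, -1, :]            # shape (50, 100)

# Hypothetical dense layer mapping n_neurons -> n_outputs
W = rng.standard_normal((n_neurons, n_outputs)) * 0.1
b = np.zeros(n_outputs)
pred = (last_output @ W + b).squeeze(axis=1)   # shape (50,), matches the labels

# Mean absolute error against one label per sequence, as in the question
y = rng.random(batch_size)
loss = np.mean(np.abs(y - pred))
print(last_output.shape, pred.shape)  # (50, 100) (50,)
```

In a TensorFlow 1.x graph the same slice can be written as `rnn_outputs[:, -1, :]`, followed by `tf.layers.dense` and `tf.squeeze` so that the prediction has the same shape as the `y` placeholder.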

0 Answers:

No answers