
Time: 2017-01-26 17:30:54

Tags: tensorflow

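I wrote a sliding window algorithm in numpy that slides over a wav audio file and feeds slices of it to my NN in tensorflow, which detects features in each audio slice. Once tensorflow is done, it returns its output to numpy land, where I reassemble the slices into an array of predictions that matches each sample position of the original file: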

import tensorflow as tf
import numpy as np
import nn

def slide_predict(layers, X, modelPath):
    output = None

    graph = tf.Graph()
    with graph.as_default():
        input_layer_size, hidden_layer_size, num_labels = layers

        X_placeholder = tf.placeholder(tf.float32, shape=(None, input_layer_size), name='X')
        Theta1 = tf.Variable(nn.randInitializeWeights(input_layer_size, hidden_layer_size), name='Theta1')
        bias1 = tf.Variable(nn.randInitializeWeights(hidden_layer_size, 1), name='bias1')
        Theta2 = tf.Variable(nn.randInitializeWeights(hidden_layer_size, num_labels), name='Theta2')
        bias2 = tf.Variable(nn.randInitializeWeights(num_labels, 1), name='bias2')
        hypothesis = nn.forward_prop(X_placeholder, Theta1, bias1, Theta2, bias2)

        sess = tf.Session(graph=graph)
        saver = tf.train.Saver()
        init = tf.global_variables_initializer()

        saver.restore(sess, modelPath)

        window_size = layers[0]

        pad_amount = (window_size * 2) - (X.shape[0] % window_size)
        X = np.pad(X, (pad_amount, 0), 'constant')

        for w in range(window_size):
            start = w
            end = -window_size + w
            X_shifted = X[start:end]
            X_matrix = X_shifted.reshape((-1, window_size))

            prediction = sess.run(hypothesis, feed_dict={X_placeholder: X_matrix})

            output = prediction if (output is None) else np.hstack((output, prediction))


    output.shape = (X.size, -1)

    return output




1 answer:

Answer 0 (score: 7)


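The first issue is that calling np.hstack() on every iteration copies the entire accumulated output each time, so the total amount of copying grows quadratically with window_size:

for w in range(window_size):
    # ...
    output = prediction if (output is None) else np.hstack((output, prediction))

Instead, collect the predictions in a list and call np.hstack() once after the loop:

output_list = []
for w in range(window_size):
    # ...
    prediction = sess.run(...)
    output_list.append(prediction)
output = np.hstack(output_list)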
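
The two accumulation patterns above can be compared without TensorFlow. The sketch below is only an illustration: the random arrays are a hypothetical stand-in for the per-window predictions returned by sess.run(), and the function names are made up for this example. It shows that stacking once at the end yields the same array as stacking on every iteration.

```python
import numpy as np


def stack_each_iteration(chunks):
    """Repeated np.hstack: every call copies everything accumulated so far."""
    output = None
    for chunk in chunks:
        output = chunk if output is None else np.hstack((output, chunk))
    return output


def stack_once(chunks):
    """Collect the pieces in a list and concatenate a single time."""
    output_list = []
    for chunk in chunks:
        output_list.append(chunk)
    return np.hstack(output_list)


# Hypothetical stand-in for per-window predictions: 8 arrays of shape (5, 2).
rng = np.random.default_rng(0)
chunks = [rng.random((5, 2)) for _ in range(8)]

slow = stack_each_iteration(chunks)
fast = stack_once(chunks)
```

Both functions return an array of shape (5, 16); only the amount of intermediate copying differs.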

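The second issue is that, if the amount of computation in the sess.run() call is small, feeding large values to TensorFlow can be inefficient, because those values are (currently) copied into C++ (and the results are copied back out). A useful strategy is to try moving the sliding window loop into the TensorFlow graph using the tf.map_fn() construct. For example, you could restructure your program as follows: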

# NOTE: If you call this function often, you may want to (i) move the `np.pad()`
# into the graph as `tf.pad()`, and (ii) replace `X_t` with a placeholder.
X = np.pad(X, (pad_amount, 0), 'constant')
X_t = tf.convert_to_tensor(X)

def window_func(w):
    start = w
    end = w - window_size
    X_matrix = tf.reshape(X_t[start:end], (-1, window_size))
    return nn.forward_prop(X_matrix, Theta1, bias1, Theta2, bias2)

output_t = tf.map_fn(window_func, tf.range(window_size))
# ...
output = sess.run(output_t)
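
One detail worth checking when switching to tf.map_fn(): it stacks the per-window results along a new leading axis, giving shape (window_size, rows, num_labels), whereas np.hstack() places them side by side as (rows, window_size * num_labels). The NumPy-only sketch below illustrates the difference and one way to convert between the two layouts. The sizes are arbitrary, and fake_forward_prop is a made-up stand-in for nn.forward_prop, so treat it as a sketch under those assumptions.

```python
import numpy as np

# Small illustrative sizes (assumptions, not taken from the real model).
window_size, num_labels = 4, 3


def fake_forward_prop(X_matrix):
    """Hypothetical stand-in for nn.forward_prop: maps each row to num_labels values."""
    weights = np.arange(window_size * num_labels, dtype=np.float64)
    weights = weights.reshape(window_size, num_labels)
    return X_matrix @ weights


# Signal whose length is already a multiple of window_size.
X = np.arange(12, dtype=np.float64)

# One prediction matrix per window offset, as in the original loop.
predictions = [
    fake_forward_prop(X[w:w - window_size].reshape(-1, window_size))
    for w in range(window_size)
]

# Layout produced by the np.hstack() version: (rows, window_size * num_labels).
hstacked = np.hstack(predictions)

# Layout of a map_fn-style result: (window_size, rows, num_labels).
stacked = np.stack(predictions)

# Move the window axis next to the label axis, then flatten the two together.
rearranged = stacked.transpose(1, 0, 2).reshape(stacked.shape[1], -1)
```

If the downstream code expects the hstack layout, the same transpose and reshape would need to be applied to the array returned by sess.run(output_t).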