TensorFlow: Enqueuing tuples in a queue throws TypeError

Time: 2017-01-13 17:22:04

Tags: python queue tensorflow

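I have a time-based dataset that consists of categorical features, real-valued features, a mask indicating whether a given feature is currently present, and a "deltas" array containing the length of time since each value was last present.

I want to build a queue of tuples of these tensors so that the categorical features can be converted to one-hot features, and so that the data, mask, and deltas can be used in different parts of the model. Here is the code I wrote: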

import tensorflow as tf
import threading
import numpy as np

# Function to generate batches of data
def nextBatch(batch_size):
    n_steps = 14
    batch = []
    for _ in range(batch_size):
        # Create tuple of tensors
        ex = (np.random.randint(0,5, (n_steps, 2)),
              np.random.randn(n_steps, 10),
              np.random.randint(0,2, (n_steps, 12)),
              np.random.randint(0,2000, (n_steps, 12)))
        batch.append(ex)

    return batch


# Graph to enqueue data
tf.reset_default_graph()

q = tf.PaddingFIFOQueue(1000,
                        [np.uint16, tf.float32, tf.uint16, tf.uint16],
                        [(None,5), (None,48), (None,53), (None,53)])

def enqueue_op():
    # Stop enqueuing after 11 ops
    i = 0
    while True:
        q.enqueue_many(nextBatch(100))
        i += 1
        if i >11:
            return

# Start enqueuing
t = threading.Thread(target=enqueue_op)
t.start()

When I run it, I get a TypeError:

TypeError: Expected uint16, got array(...) of type 'ndarray' instead.

I'm not sure what I'm doing wrong. Is it the dtype definition when creating the queue?

1 Answer:

Answer 0: (score: 2)

There are a few problems here:

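  1. Your thread calls q.enqueue_many() repeatedly. Despite its (somewhat confusing) name, the q.enqueue_many() method does not immediately enqueue data in the queue. Instead, it returns a tf.Operation that must be passed to sess.run() to add the tensors to the queue. The code running in the separate thread builds a fresh enqueue-many operation on each iteration and discards it, which is probably not what you intended.

  2. The return value of nextBatch(100) is a list of 100 tuples of 4 arrays. The q.enqueue_many() method expects a single tuple of 4 arrays. To enqueue a list of 100 tuples, you need to either run a q.enqueue() op 100 times, or stack together the 100 arrays for each tuple component so that you have a single tuple of four arrays.

  3. The arrays generated in nextBatch() do not match the shapes of the queue components. Assuming n_steps is the dimension that varies (for padding), the function should generate arrays of shape (n_steps, 5), (n_steps, 48), (n_steps, 53), and (n_steps, 53) to match the queue definition.

Here is a version of your code that I think does what is intended: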

    import tensorflow as tf
    import threading
    import numpy as np
    
    # Function to generate batches of data
    def nextBatch(batch_size):
      n_steps = 14
      batch = []
      for _ in range(batch_size):
        # Create tuple of tensors
        ex = (np.random.randint(0,5, (n_steps, 5)),
              np.random.randn(n_steps, 48),
              np.random.randint(0,2, (n_steps, 53)),
              np.random.randint(0,2000, (n_steps, 53)))
        batch.append(ex)
      return batch
    
    q = tf.PaddingFIFOQueue(1000,
                            [tf.uint16, tf.float32, tf.uint16, tf.uint16],
                            [(None, 5), (None, 48), (None, 53), (None, 53)])
    
    # Define a single op for enqueuing a tuple of placeholder tensors.
    placeholders = [tf.placeholder(tf.uint16, shape=(None, 5)),
                    tf.placeholder(tf.float32, shape=(None, 48)),
                    tf.placeholder(tf.uint16, shape=(None, 53)),
                    tf.placeholder(tf.uint16, shape=(None, 53))]
    enqueue_op = q.enqueue(placeholders)
    
    # Create a session in order to run the enqueue_op.
    sess = tf.Session()
    
    def enqueue_thread_fn():
      for i in range(10):
        batch = nextBatch(100)
        for batch_elem in batch:
          # Each call to `sess.run(enqueue_op, ...)` enqueues a single element in
          # the queue.
          sess.run(enqueue_op, feed_dict={placeholders[0]: batch_elem[0],
                                          placeholders[1]: batch_elem[1],
                                          placeholders[2]: batch_elem[2],
                                          placeholders[3]: batch_elem[3]})
    
    # Start enqueuing
    t = threading.Thread(target=enqueue_thread_fn)
    t.start()
    t.join()
    sess.close()
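
The second point above mentions an alternative to running the enqueue op once per example: stacking the 100 arrays of each tuple component into a single array. The following is a minimal NumPy-only sketch of that stacking step. The helper names next_batch and stack_batch are hypothetical, and the sketch assumes every example in the batch has the same n_steps (variable-length examples would have to be padded before stacking).

```python
import numpy as np

def next_batch(batch_size, n_steps=14):
    """Hypothetical generator: a list of (categorical, real, mask, delta) tuples."""
    return [(np.random.randint(0, 5, (n_steps, 5)),
             np.random.randn(n_steps, 48),
             np.random.randint(0, 2, (n_steps, 53)),
             np.random.randint(0, 2000, (n_steps, 53)))
            for _ in range(batch_size)]

def stack_batch(batch, dtypes=(np.uint16, np.float32, np.uint16, np.uint16)):
    """Turn a list of N 4-tuples into one 4-tuple of arrays with leading dimension N."""
    # zip(*batch) groups the i-th component of every example together;
    # np.stack then adds a new leading batch dimension to each group.
    return tuple(np.stack(component).astype(dtype)
                 for component, dtype in zip(zip(*batch), dtypes))

stacked = stack_batch(next_batch(100))
for component in stacked:
    print(component.shape, component.dtype)
```

Each stacked array has an extra leading dimension of size 100, so it would presumably be fed to a placeholder with one more dimension than the ones above (for example shape (None, None, 5) for the first component), and the resulting tuple passed to a single q.enqueue_many() op that is run with sess.run().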