Question

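I have a time-based dataset that consists of categorical features, real-valued features, a mask indicating whether a given feature is present at the current time, and a "Deltas" array containing the length of time since a value was last present.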

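I want to build a queue of tuples of these tensors so that the categorical features can be converted to one-hot features, and so that the data, masks, and deltas can be used in different parts of the model. Below is some code I've written: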

import tensorflow as tf
import threading
import numpy as np

# Function to generate batches of data
def nextBatch(batch_size):
    n_steps = 14
    batch = []
    for _ in range(batch_size):
        # Create tuple of tensors
        ex = (np.random.randint(0,5, (n_steps, 2)),
              np.random.randn(n_steps, 10),
              np.random.randint(0,2, (n_steps, 12)),
              np.random.randint(0,2000, (n_steps, 12)))
        batch.append(ex)

    return batch


# Graph to enqueue data
tf.reset_default_graph()

q = tf.PaddingFIFOQueue(1000,
                        [np.uint16, tf.float32, tf.uint16, tf.uint16],
                        [(None,5), (None,48), (None,53), (None,53)])

def enqueue_op():
    # Stop enqueuing after 11 ops
    i = 0
    while True:
        q.enqueue_many(nextBatch(100))
        i += 1
        if i > 11:
            return

# Start enqueuing
t = threading.Thread(target=enqueue_op)
t.start()

When I run it, I get a TypeError:

TypeError: Expected uint16, got array(...) of type 'ndarray' instead.

I'm not sure what I'm doing wrong. Is it the dtype definition when I create the queue?

Answer 1

There are a few issues here:

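Your thread calls q.enqueue_many() repeatedly. Despite the (slightly confusing) name, the q.enqueue_many() method does not immediately enqueue data in the queue. Instead it returns a tf.Operation that must be passed to sess.run() to add the tensors to the queue. The code running in the separate thread creates a series of enqueue-many operations and discards them, which is probably not what you intended.
The return value of nextBatch(100) is a list of 100 tuples of 4 arrays. The q.enqueue_many() method expects a single tuple of 4 arrays. If you want to enqueue a list of 100 tuples, you will need to either run a q.enqueue() op 100 times, or stack together the 100 arrays for each tuple component so that you have a single tuple of four arrays.
The arrays generated in nextBatch() do not match the shapes of the queue components. Assuming n_steps is the dimension that is allowed to vary (for padding), the function should generate arrays of shape (n_steps, 5), (n_steps, 48), (n_steps, 53), and (n_steps, 53) to match the queue definition.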
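
The shape mismatch in the last point can be checked before anything is enqueued. The following is only a minimal NumPy sketch: the helper matches_queue_shapes and the constant QUEUE_SHAPES are hypothetical names introduced for illustration and are not part of TensorFlow. It treats None as the dimension that is allowed to vary.

```python
import numpy as np

# Shapes declared for the queue components in the question; None marks the
# dimension that may vary between elements (n_steps).
# QUEUE_SHAPES and matches_queue_shapes are illustrative names only.
QUEUE_SHAPES = [(None, 5), (None, 48), (None, 53), (None, 53)]

def matches_queue_shapes(example, shapes=QUEUE_SHAPES):
    """Return True if every array in `example` fits its declared shape."""
    if len(example) != len(shapes):
        return False
    for arr, shape in zip(example, shapes):
        if arr.ndim != len(shape):
            return False
        for actual, expected in zip(arr.shape, shape):
            # A None entry accepts any size; otherwise sizes must be equal.
            if expected is not None and actual != expected:
                return False
    return True

n_steps = 14
# Widths produced by the original nextBatch(): 2, 10, 12, 12.
original = (np.zeros((n_steps, 2)), np.zeros((n_steps, 10)),
            np.zeros((n_steps, 12)), np.zeros((n_steps, 12)))
# Widths the queue definition expects: 5, 48, 53, 53.
corrected = (np.zeros((n_steps, 5)), np.zeros((n_steps, 48)),
             np.zeros((n_steps, 53)), np.zeros((n_steps, 53)))

print(matches_queue_shapes(original))   # False
print(matches_queue_shapes(corrected))  # True
```

A check like this only covers shapes. The dtypes of the generated arrays are a separate matter and are not validated here.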

Here's a version of your code that I think does what you want:

import tensorflow as tf
import threading
import numpy as np

# Function to generate batches of data
def nextBatch(batch_size):
  n_steps = 14
  batch = []
  for _ in range(batch_size):
    # Create tuple of tensors
    ex = (np.random.randint(0,5, (n_steps, 5)),
          np.random.randn(n_steps, 48),
          np.random.randint(0,2, (n_steps, 53)),
          np.random.randint(0,2000, (n_steps, 53)))
    batch.append(ex)
  return batch

q = tf.PaddingFIFOQueue(1000,
                        [tf.uint16, tf.float32, tf.uint16, tf.uint16],
                        [(None, 5), (None, 48), (None, 53), (None, 53)])

# Define a single op for enqueuing a tuple of placeholder tensors.
placeholders = [tf.placeholder(tf.uint16, shape=(None, 5)),
                tf.placeholder(tf.float32, shape=(None, 48)),
                tf.placeholder(tf.uint16, shape=(None, 53)),
                tf.placeholder(tf.uint16, shape=(None, 53))]
enqueue_op = q.enqueue(placeholders)

# Create a session in order to run the enqueue_op.
sess = tf.Session()

def enqueue_thread_fn():
  for i in range(10):
    batch = nextBatch(100)
    for batch_elem in batch:
      # Each call to `sess.run(enqueue_op, ...)` enqueues a single element in
      # the queue.
      sess.run(enqueue_op, feed_dict={placeholders[0]: batch_elem[0],
                                      placeholders[1]: batch_elem[1],
                                      placeholders[2]: batch_elem[2],
                                      placeholders[3]: batch_elem[3]})

# Start enqueuing
t = threading.Thread(target=enqueue_thread_fn)
t.start()
t.join()
sess.close()
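
The code above enqueues one element per sess.run() call. The other option mentioned earlier, stacking the 100 arrays for each tuple component into a single tuple of four arrays, could look roughly like the NumPy sketch below. The names next_batch and stack_batch are hypothetical helpers written for this illustration, and the sketch assumes every example in the batch has the same n_steps, since np.stack requires equal shapes.

```python
import numpy as np

def next_batch(batch_size, n_steps=14):
    """Generate a list of `batch_size` example tuples, like nextBatch()."""
    return [(np.random.randint(0, 5, (n_steps, 5)),
             np.random.randn(n_steps, 48),
             np.random.randint(0, 2, (n_steps, 53)),
             np.random.randint(0, 2000, (n_steps, 53)))
            for _ in range(batch_size)]

def stack_batch(batch):
    """Turn a list of 4-tuples into one 4-tuple of stacked arrays."""
    # zip(*batch) groups the i-th component of every example together;
    # np.stack then adds a new leading batch dimension to each group.
    return tuple(np.stack(component) for component in zip(*batch))

stacked = stack_batch(next_batch(100))
print([a.shape for a in stacked])
# [(100, 14, 5), (100, 14, 48), (100, 14, 53), (100, 14, 53)]
```

A tuple in this form has the layout that q.enqueue_many() expects, with the leading dimension indexing the individual queue elements. Any placeholders used to feed it would need that extra leading dimension as well. If n_steps differs between examples, the arrays would have to be padded to a common length before stacking, or enqueued one at a time as in the code above.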

TensorFlow: Enqueuing tuple throws TypeError
