我尝试在tensorflow中读取一些数据,然后将其与其标签进行匹配。我的设置如下:
"a", "b", "c", "d", "e", ...
"a", "b", "w, "g", "d", ...
,0, 1, 2, 3, 4, ...
我想创建一个包含前两个数组之间对的示例队列,如["b", "b"], ["d", "g"], ["c", "w"], ...
。我还想要一个相应数字的队列到这些对,在这种情况下将是1, 3, 2, ...
但是,当我生成这些队列时,我的示例和我的数字不匹配 - 例如,["b", "b"], ["d", "g"], ["c", "w"], ...
的队列与5, 0, 2, ...
的标签队列一起出现。
可能导致这种情况的原因是什么?为了进行测试,我已禁用队列/批处理生成中的所有混洗,但问题仍然存在。
这是我的代码:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import tensorflow as tf
from constants import FLAGS
letters_data = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
cyrillic_letters_data = ["a", "b", "w", "g", "d", "e", "j", "v", "z", "i"]
numbers_data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
def inputs(batch_size):
# Get the letters and the labels
(letters, labels) = _inputs(batch_size=batch_size)
# Return the letters and the labels.
return letters, labels
def read_letter(pairs_and_overlap_queue):
# Read the letters, cyrillics, and numbers.
letter = pairs_and_overlap_queue[0]
cyrillic = pairs_and_overlap_queue[1]
number = pairs_and_overlap_queue[2]
# Do something with them
# (doesn't matter what)
letter = tf.substr(letter, 0, 1)
cyrillic = tf.substr(cyrillic, 0, 1)
number = tf.add(number, tf.constant(0))
# Return them
return letter, cyrillic, number
def _inputs(batch_size):
# Get the input data
letters = letters_data
cyrillics = cyrillic_letters_data
numbers = numbers_data
# Create a queue containing the letters,
# the cyrillics, and the numbers
pairs_and_overlap_queue = tf.train.slice_input_producer([letters, cyrillics, numbers],
capacity=100000,
shuffle=False)
# Perform some operations on each of those
letter, cyrillic, number = read_letter(pairs_and_overlap_queue)
# Combine the letters and cyrillics into one example
combined_example = tf.stack([letter, cyrillic])
# Ensure that the random shuffling has good mixing properties.
min_fraction_of_examples_in_queue = 0.4
min_queue_examples = int(FLAGS.NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
min_fraction_of_examples_in_queue)
# Generate an example and label batch, and return it.
return _generate_image_and_label_batch(example=combined_example, label=number,
min_queue_examples=min_queue_examples,
batch_size=batch_size,
shuffle=False)
def _generate_image_and_label_batch(example, label, min_queue_examples,
batch_size, shuffle):
# Create a queue that shuffles the examples, and then
# read 'batch_size' examples + labels from the example queue.
num_preprocess_threads = FLAGS.NUM_THREADS
if shuffle:
examples, label_batch = tf.train.shuffle_batch(
[example, label],
batch_size=batch_size,
num_threads=num_preprocess_threads,
capacity=min_queue_examples + 6 * batch_size,
min_after_dequeue=min_queue_examples)
else:
print("Not shuffling!")
examples, label_batch = tf.train.batch(
[example, label],
batch_size=batch_size,
num_threads=num_preprocess_threads,
capacity=min_queue_examples + 6 * batch_size)
# Return the examples and the labels batches.
return examples, tf.reshape(label_batch, [batch_size])
lcs, nums = inputs(batch_size=3)
with tf.Session() as sess:
# Start populating the filename queue.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord, sess=sess)
for i in xrange(0, 5):
my_lcs = lcs.eval()
my_nums = nums.eval()
print(str(my_lcs) + " --> " + str(my_nums))
非常感谢你的帮助!
答案 0 :(得分:4)
当您运行tv.eval()两次时,您实际上运行了两次图表,因此您将两个不同批次中的lcs和nums混合在一起,如果您将循环更改为以下内容,则会在以下期间拉出两个张量同样的图表运行:
my_lcs, my_nums = sess.run([lcs, nums])
print(str(my_lcs) + " --> " + str(my_nums))
这在我身边:
[[b'g' b'j']
[b'h' b'v']
[b'i' b'z']] --> [6 7 8]
[[b'f' b'e']
[b'g' b'j']
[b'h' b'v']] --> [5 6 7]
[[b'e' b'd']
[b'f' b'e']
[b'g' b'j']] --> [4 5 6]
[[b'd' b'g']
[b'e' b'd']
[b'f' b'e']] --> [3 4 5]
[[b'c' b'w']
[b'd' b'g']
[b'e' b'd']] --> [2 3 4]