Dequeuing from a RandomShuffleQueue does not reduce its size

Date: 2017-03-22 08:30:35

Tags: tensorflow

To train a model, I have wrapped my model in a class. I use a tf.RandomShuffleQueue to enqueue a list of filenames. However, when I dequeue elements from the queue, they are dequeued, but the queue's size does not decrease.

Below are my more specific questions, followed by the code snippet:

  1. For example, if I only have 5 images but the number of steps goes up to 100, will that cause addfilenames to be called again automatically? It does not give me any dequeue errors, so I think it is being called automatically.
  2. Why does the size of the tf.RandomShuffleQueue not change? It stays constant.

    import os
    import time
    import functools
    import tensorflow as tf
    from Read_labelclsloc import readlabel
    
    
    def ReadTrain(traindir):
        # Returns a list of training images, their labels and a dictionary.
        # The dictionary maps label names to integer numbers.
        return trainimgs, trainlbls, classdict
    
    
    def ReadVal(valdir, classdict):
        # Reads the validation image labels.
        # Returns a dictionary with filenames as keys and
        # corresponding labels as values.
        return valdict
    
    def lazy_property(function):
        # Just a decorator to make sure that on repeated calls to
        # member functions, ops don't get created repeatedly.
        # Acknowledgements: https://danijar.com/structuring-your-tensorflow-models/
        attribute = '_cache_' + function.__name__
        @property
        @functools.wraps(function)
        def decorator(self):
            if not hasattr(self, attribute):
                setattr(self, attribute, function(self))
            return getattr(self, attribute)
    
        return decorator    
    
    class ModelInitial:
    
        def __init__(self, traindir, valdir):
            self.graph
            self.traindir = traindir
            self.valdir = valdir
            self.traininginfo()
            self.epoch = 0
    
    
    
        def traininginfo(self):
            self.trainimgs, self.trainlbls, self.classdict = ReadTrain(self.traindir)
            self.valdict = ReadVal(self.valdir, self.classdict)
            with self.graph.as_default():
                self.trainimgs_tensor = tf.constant(self.trainimgs)
                self.trainlbls_tensor = tf.constant(self.trainlbls, dtype=tf.uint16)
                self.trainimgs_dict = {}
                self.trainimgs_dict["ImageFile"] = self.trainimgs_tensor
            return None
    
        @lazy_property
        def graph(self):
            g = tf.Graph()
            with g.as_default():
                # Layer definitions go here
                pass
            return g
    
    
        @lazy_property
        def addfilenames(self):
            # This is the function where filenames are pushed to a RandomShuffleQueue
            filename_queue = tf.RandomShuffleQueue(capacity=len(self.trainimgs), min_after_dequeue=0,\
                                                   dtypes=[tf.string], names=["ImageFile"],\
                                                   seed=0, name="filename_queue")
    
            sz_op = filename_queue.size()
    
            dq_op = filename_queue.dequeue()
    
            enq_op = filename_queue.enqueue_many(self.trainimgs_dict)
            return filename_queue, enq_op, sz_op, dq_op
    
        def Train(self):
            # The function for training.
            # I have not written the training part yet.
            # Still struggling with preprocessing
            with self.graph.as_default():
                filename_q, filename_enqueue_op, sz_op, dq_op = self.addfilenames
    
                qr = tf.train.QueueRunner(filename_q, [filename_enqueue_op])
                filename_dequeue_op = filename_q.dequeue()
                init_op = tf.global_variables_initializer()
    
            sess = tf.Session(graph=self.graph)
            sess.run(init_op)
            coord = tf.train.Coordinator()
            enq_threads = qr.create_threads(sess, coord=coord, start=True)
            counter = 0
            for step in range(100):
                print(sess.run(dq_op["ImageFile"]))
                print("Epoch = %d "%(self.epoch))
                print("size = %d"%(sess.run(sz_op)))
                counter += 1
    
            names = [n.name for n in self.graph.as_graph_def().node]
            coord.request_stop()
            coord.join(enq_threads)
            print("Counter = %d"%(counter))
            return None
    
    
    
    
    
    if __name__ == "__main__":
        modeltrain = ModelInitial(<Path to training images>,\
                                        <Path to validation images>)
        a = modeltrain.graph
        print(a)
        modeltrain.Train()
        print("Success")
    

1 Answer:

Answer (score: 1):

The mystery is caused by the tf.train.QueueRunner that you created for the queue, which causes it to be filled in a background thread.

  1. The following lines cause a background "queue runner" thread to be created:

    qr = tf.train.QueueRunner(filename_q, [filename_enqueue_op])
    # ... 
    enq_threads = qr.create_threads(sess, coord=coord, start=True)
    

     This thread calls filename_enqueue_op in a loop, which causes the queue to be filled back up as elements are removed from it.

  2. The background thread from step 1 will almost always have a pending enqueue operation (filename_enqueue_op) on the queue. This means that after you dequeue a filename, the pending enqueue will run and fill the queue back up to capacity. (Technically, there is a race condition here and you could see a size of capacity - 1, but this is quite unlikely.) See the sketch below for both cases in action.
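
To make this concrete, here is a minimal sketch that is not from the original post. It assumes TensorFlow 1.x (where tf.RandomShuffleQueue, tf.train.QueueRunner and tf.train.Coordinator are available) and uses a made-up five-element filename list in place of trainimgs. Without a queue runner, the size shrinks on every dequeue; with one, the background thread's pending enqueue refills the queue, so the printed size stays at capacity, which is the behaviour the question describes.

    import tensorflow as tf

    # Hypothetical stand-in for the question's list of training image filenames.
    filenames = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg"]

    # Case 1: no queue runner -- the size decreases with every dequeue.
    q = tf.RandomShuffleQueue(capacity=len(filenames), min_after_dequeue=0,
                              dtypes=[tf.string], seed=0)
    enq = q.enqueue_many([filenames])
    deq = q.dequeue()
    size = q.size()

    with tf.Session() as sess:
        sess.run(enq)                                # fill the queue once
        for _ in range(len(filenames)):
            print(sess.run(deq), sess.run(size))     # size prints 4, 3, 2, 1, 0

    # Case 2: with a QueueRunner -- a background thread re-runs the enqueue op,
    # so the queue is topped up as soon as an element is removed and the size
    # stays at (or very near) capacity.
    q2 = tf.RandomShuffleQueue(capacity=len(filenames), min_after_dequeue=0,
                               dtypes=[tf.string], seed=0)
    enq2 = q2.enqueue_many([filenames])
    deq2 = q2.dequeue()
    size2 = q2.size()
    close2 = q2.close(cancel_pending_enqueues=True)
    qr = tf.train.QueueRunner(q2, [enq2])

    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = qr.create_threads(sess, coord=coord, start=True)
        for _ in range(10):
            print(sess.run(deq2), sess.run(size2))   # size stays at ~capacity
        coord.request_stop()
        sess.run(close2)  # cancel the blocked pending enqueue so the runner thread can exit
        try:
            coord.join(threads)
        except tf.errors.CancelledError:
            pass          # the cancelled pending enqueue may be reported here

This also answers question 1 from the post: the queue runner keeps re-running filename_enqueue_op in the background, so the five filenames are re-enqueued automatically and the 100 dequeue steps never run out of elements.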