How to train an image classifier in TensorFlow on images combined with numeric data

Time: 2018-12-30 12:19:37

Tags: python tensorflow keras

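Please forgive my attempt; I am still not comfortable with deep learning and TensorFlow.

I have two streams of training data: one is images, the other is continuous numbers. I use the Keras ImageDataGenerator for the images and tf.estimator.inputs.numpy_input_fn for the numbers. Note that the two streams are in the same order, so the i-th image belongs with the i-th row of numbers.

For combining both into a single generator, credit goes to https://github.com/keras-team/keras/issues/8130#issuecomment-336855177.

From there, I came up with this:

slice_and_order: picks the right subset of images from the folder (depending on whether we are training or validating).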

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
input_imgen = ImageDataGenerator(rescale = 1./255, 
                                   shear_range = 0.2, 
                                   zoom_range = 0.2,
                                   rotation_range=5.,
                                   horizontal_flip = True)

test_imgen = ImageDataGenerator(rescale = 1./255)


def slice_and_order(gen, cut, train_boolean):
  # Yield the first `cut` samples of every pass over the directory when training,
  # and the remaining samples when validating.
  # Assumes gen yields one sample per step (batch_size=1) in a fixed order (shuffle=False).
  length = len(gen.filenames)
  for co, batch in enumerate(gen):
    in_train_part = (co % length) < cut
    if in_train_part == bool(train_boolean):
      yield batch

def generate_generator_multiple(generator, dir1, batch_size, img_height, img_width, train_boolean):
    genX1 = generator.flow_from_directory(dir1,
                                          target_size = (img_height,img_width),
                                          class_mode = 'categorical',
                                          batch_size = 1,
                                          shuffle=False, 
                                          seed=7)
    # Keep only the training or the validation part of the image stream.
    # ind_train is defined elsewhere and holds the indices of the training samples.
    genX1_ = slice_and_order(genX1, len(ind_train), train_boolean)

    genX2_ = None;
    with tf.Session() as session:
        if(train_boolean):
          genX2 = tf.estimator.inputs.numpy_input_fn(
              res_train_flow['x'], res_train_flow['y'], batch_size=1, shuffle=False, num_epochs=1)
        else:
          genX2 = tf.estimator.inputs.numpy_input_fn(
              res_valid_flow['x'], res_valid_flow['y'], batch_size=1, shuffle=False, num_epochs=1)

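    while True:
        val = next(genX1_)  # read from the sliced stream, not the raw directory iterator
        features, target = genX2()  # returns symbolic tf.Tensor objects, not NumPy arrays
        yield [val[0], features], target  # image batch plus numeric features, with their shared label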


inputgenerator=generate_generator_multiple(generator=input_imgen,
                                           dir1="images/",
                                           batch_size=1,
                                           img_height=96,
                                           img_width=96, train_boolean = True)       

validgenerator=generate_generator_multiple(generator=test_imgen,
                                          dir1="images/",
                                          batch_size=1,
                                          img_height=96,
                                          img_width=96, train_boolean = False)
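
To make the slicing rule easier to check, here is a minimal sketch of the same idea on a plain list. It assumes the stream repeats forever in a fixed order, as the Keras directory iterator does with shuffle=False; split_stream and the file names are hypothetical and not part of my real code.

```python
from itertools import cycle, islice


def split_stream(items, cut, train):
    """Yield items whose position within each pass is below `cut` (train)
    or at/above `cut` (validation). The stream repeats forever."""
    length = len(items)
    for position, item in enumerate(cycle(items)):
        in_train_part = (position % length) < cut
        if in_train_part == train:
            yield item


samples = ["0.jpg", "1.jpg", "2.jpg", "3.jpg", "4.jpg"]  # hypothetical file names

# Take a few items from each infinite stream to inspect the split.
train_part = list(islice(split_stream(samples, cut=3, train=True), 6))
valid_part = list(islice(split_stream(samples, cut=3, train=False), 4))

print(train_part)  # ['0.jpg', '1.jpg', '2.jpg', '0.jpg', '1.jpg', '2.jpg']
print(valid_part)  # ['3.jpg', '4.jpg', '3.jpg', '4.jpg']
```

The first three files of every pass go to training and the last two go to validation, which is what slice_and_order is meant to do with the image batches.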

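Now my problem is that the generated input is heterogeneous: the image part is a NumPy array, while the numeric part and the target are tf.Tensor objects. For example: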

([array([[[[0.8980393 , 0.9215687 , 0.9058824 ],
           [0.8965367 , 0.9200661 , 0.90437984],
           [0.89043367, 0.9139631 , 0.8982768 ],
           ...,
           [0.7747699 , 0.79437774, 0.77084833],
           [0.77384806, 0.7934559 , 0.7699265 ],
           [0.7729261 , 0.79253393, 0.7690045 ]],
          ...,

          [[0.7760461 , 0.7721245 , 0.7525167 ],
           [0.77819717, 0.7742756 , 0.75466776],
           [0.7803483 , 0.77642673, 0.7568189 ],
           ...,
           [0.7188979 , 0.7385057 , 0.7502704 ],
           [0.70090973, 0.7205176 , 0.7322823 ],
           [0.7019608 , 0.72156864, 0.73333335]],

          [[0.8027476 , 0.79882604, 0.7792182 ],
           [0.8032237 , 0.7997209 , 0.78011304],
           [0.8016872 , 0.7991063 , 0.77949846],
           ...,
           [0.7273756 , 0.74698347, 0.7587482 ],
           [0.7003445 , 0.7199524 , 0.7317171 ],
           [0.7019608 , 0.72156864, 0.73333335]]]], dtype=float32),
  {'max_lat': <tf.Tensor 'fifo_queue_DequeueUpTo_9:1' shape=(?,) dtype=float64>,
   'max_lon': <tf.Tensor 'fifo_queue_DequeueUpTo_9:2' shape=(?,) dtype=float64>,
   'min_lat': <tf.Tensor 'fifo_queue_DequeueUpTo_9:3' shape=(?,) dtype=float64>,
   'min_lon': <tf.Tensor 'fifo_queue_DequeueUpTo_9:4' shape=(?,) dtype=float64>}],
 <tf.Tensor 'fifo_queue_DequeueUpTo_9:5' shape=(?,) dtype=float64>) 
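
The tf.Tensor entries in the sample above appear because calling the function returned by numpy_input_fn builds graph tensors instead of returning values. One possible way around this, as a minimal sketch, is to slice the numeric arrays directly in Python so that every yielded element is a NumPy array. Here paired_batches is a hypothetical helper, the four numeric columns are assumed to be stacked into one 2-D array, and the image stream is replaced by zero-filled stand-in arrays.

```python
import numpy as np


def paired_batches(image_batches, numeric_rows, labels):
    """Pair each image batch with the matching numeric row and label.

    Assumes both streams are in the same order and the image stream
    yields one sample per step; the numeric data wraps around when exhausted.
    """
    n = len(numeric_rows)
    for i, image_batch in enumerate(image_batches):
        j = i % n
        # Slicing with j:j + 1 keeps the leading batch dimension of size 1.
        yield [image_batch, numeric_rows[j:j + 1]], labels[j:j + 1]


# Stand-in data (assumption): five blank 96x96 RGB images, three numeric rows
# with four columns each (e.g. max_lat, max_lon, min_lat, min_lon).
fake_images = (np.zeros((1, 96, 96, 3), dtype=np.float32) for _ in range(5))
numeric_rows = np.arange(12, dtype=np.float64).reshape(3, 4)
labels = np.array([0.0, 1.0, 0.0])

batches = list(paired_batches(fake_images, numeric_rows, labels))
(first_image, first_numbers), first_label = batches[0]
print(first_image.shape, first_numbers.shape, first_label.shape)
```

With a real ImageDataGenerator iterator, the image batch would be the first element of each yielded (x, y) pair, as in val[0] in my code.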

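Can I use a single model for this input? Or should I split things up, with a separate model for each generator (a convolutional model for the images and a logistic model for the numbers), feed each one separately, and then combine the two for the output? That seems complicated to me.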
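
For reference, the "two branches combined into one output" idea can be written as a single model with the Keras functional API. The following is only a sketch under assumptions: the layer sizes are arbitrary, the four numeric features are assumed to arrive as one array of shape (batch, 4), and the single sigmoid output assumes a binary label, which may not match the real target.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Image branch: a small convolutional stack (layer sizes are arbitrary).
image_in = layers.Input(shape=(96, 96, 3), name="image")
x = layers.Conv2D(16, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Numeric branch: assumes the four features are stacked into one vector.
numeric_in = layers.Input(shape=(4,), name="numbers")
y = layers.Dense(16, activation="relu")(numeric_in)

# Merge both branches and predict from the joint representation.
merged = layers.concatenate([x, y])
merged = layers.Dense(32, activation="relu")(merged)
output = layers.Dense(1, activation="sigmoid")(merged)  # assumption: binary label

model = tf.keras.Model(inputs=[image_in, numeric_in], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Smoke test with zero-filled stand-in inputs: two samples in, two predictions out.
pred = model.predict(
    [np.zeros((2, 96, 96, 3), dtype="float32"), np.zeros((2, 4), dtype="float32")],
    verbose=0,
)
print(pred.shape)  # (2, 1)
```

A model built this way takes a list of two arrays per batch, in the order [images, numbers], so a generator that yields ([images, numbers], target) with NumPy arrays would match its inputs.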

0 answers:

No answers