I am trying to take data from a CSV containing a list of files and a list of labels, and convert the labels to one-hot form for single-label classification with tf.keras. I am running the code in eager mode.
I am trying to follow the tf.data example from CS230 to build the data pipeline:
https://cs230-stanford.github.io/tensorflow-input-data.html
My code is below in the Code section.
The csv file listing all of the image locations is in Dropbox here: https://www.dropbox.com/s/5uo8o1p30g2aeta/Clock.csv?dl=0
When I run the code as shown below, I get a
TypeError: Cannot convert a TensorShape to dtype: <dtype: 'float32'>
error.
When I change lines 55 and 56 to:
one_hot_Hr = tf.one_hot(file.Hr,classes)
one_hot_Hr = tf.to_int32(one_hot_Hr)
I get this error:
InvalidArgumentError: cannot compute Mul as input #0 was expected to be
a float tensor but is a int32 tensor [Op:Mul]
name: loss/activation_2_loss/mul/
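From the message, the Mul op in the loss expects float inputs, so the one-hot labels presumably need to stay float32 rather than being cast to int32. A minimal sketch of that assumption (tf.one_hot returns float32 by default):
one_hot_Hr = tf.one_hot(file.Hr, classes, dtype=tf.float32)  # float32 labels to match the softmax output
print(one_hot_Hr.dtype)  # expect <dtype: 'float32'>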
When I run
iterator.get_next()
the images have the format
<tf.Tensor: id=12462, shape=(32, 300, 300, 3), dtype=float32, numpy=
and the labels have the format:
<tf.Tensor: id=12463, shape=(32, 13), dtype=float32, numpy=
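As a quick sanity check, in TF 1.x the element types and shapes can also be read off the dataset itself (diagnostic only, not part of the pipeline):
print(dataset.output_types)   # expect (tf.float32, tf.float32)
print(dataset.output_shapes)  # expect ((?, 300, 300, 3), (?, 13)) after batching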
Based on the errors, it seems like this should be a simple formatting problem with the labels, but I am stumped, and neither error turns up much that is useful on Stack Overflow.
Code:
import pandas as pd
import tensorflow as tf
import tensorflow.keras as k
#import cv2
#tf.enable_eager_execution()
#import argparse
#from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense
def parse_function(filename, label):
    image_string = tf.read_file(filename)
    # Don't use tf.image.decode_image, or the output shape will be undefined
    image = tf.image.decode_jpeg(image_string, channels=3)
    # This will convert to float values in [0, 1]
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize_images(image, [300, 300])
    return image, label
def train_preprocess(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=32.0 / 255.0)
    image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
    # Make sure the image is still in [0, 1]
    image = tf.clip_by_value(image, 0.0, 1.0)
    return image, label
batch_size = 32
classes = 13
fileLoc = "C:/Users/USAgData/TF/Clock.csv"
file = pd.read_csv(fileLoc)
file['Loc']=''
file.Loc = str(str(file.Location)[9:23] + str(file.Location)[28:46])
one_hot_Hr = tf.one_hot(file.Hr,classes)
#one_hot_Hr = tf.to_int32(one_hot_Hr)
dataset = tf.data.Dataset.from_tensor_slices((file.Loc, one_hot_Hr))
dataset = dataset.shuffle(len(file.Location))
dataset = dataset.map(parse_function, num_parallel_calls=4)
dataset = dataset.map(train_preprocess, num_parallel_calls=4)
dataset = dataset.batch(batch_size)
dataset = dataset.prefetch(1)
#print(dataset.shape) # ==> "(tf.float32, tf.float32)"
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
#print(next_element)
tf.keras.backend.clear_session()
model_name="Documentation"
model = k.Sequential()
model.add(Conv2D(64, (3, 3), input_shape=(300,300,3))) #Changed shape to include batch
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Conv2D(32, (3, 3)))
#model.add(Activation('relu'))
#model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Conv2D(64, (3, 3)))
#model.add(Activation('relu'))
#model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(32))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(classes))
model.add(Activation('softmax')) #Changed from sigmoid
#changed from categorical cross entropy
model.compile(loss='categorical_crossentropy',
              optimizer=tf.train.RMSPropOptimizer(.0001),
              metrics=['accuracy'])
model.summary()
fitting = model.fit_generator(iterator, epochs=1, shuffle=False, steps_per_epoch=14400//batch_size)
#model.evaluate(dataset,steps=30)
import sys
print(sys.version)
tf.__version__
I am running: tf 1.10.0, Python 3.6.7 |Anaconda custom (64-bit)| (default, Dec 10 2018, 20:35:02) [MSC v.1915 64 bit (AMD64)]
I don't know if this is really the solution, but when I switched:
fitting = model.fit_generator(iterator, epochs=1, shuffle=False, steps_per_epoch=14400//batch_size)
to
fitting = model.fit(iterator, epochs=1, shuffle=False, steps_per_epoch=14400//batch_size)
the model starts training. However, the model then runs out of data points, because the iterator does not start over.
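One way to keep the iterator from running out would be to repeat the dataset indefinitely and let steps_per_epoch define the epoch boundary; a minimal sketch under that assumption (with the repeat placed before batching):
dataset = dataset.repeat()  # repeat indefinitely; add before dataset.batch(batch_size)
iterator = dataset.make_one_shot_iterator()
fitting = model.fit(iterator, epochs=1, shuffle=False, steps_per_epoch=14400 // batch_size)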