Question

I am trying to train an LSTM using Google Cloud ML. The training dataset consists of about 600 tfrecords files, each roughly 147 MB. In total, the dataset is about 90 GB.

When I run the cloud training with only some of the tfrecords, training starts very quickly. But when I use all of the tfrecords, the job waits in this state for about an hour:

[log output not preserved]

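It then runs for 2000 steps and waits in the same state for another hour. I am very new to TensorFlow, but could this be because reading the dataset is slow? All of the data is in GCS, in the same region as the job.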

The code I use to read the dataset is:

[code not preserved]

Thanks for your help. I am happy to provide more details if anyone needs them.

Answer 1

Try adding num_parallel_reads=N (for some value of N, e.g. 32 or 64) to the tf.data.TFRecordDataset call. The current code reads one file at a time, which can be slow.
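Since the question's reading code is not shown here, the following is only a rough sketch of what the suggestion might look like, assuming a reasonably recent TensorFlow (1.13+ or 2.x). The function name build_dataset, the file pattern, and the batch size are hypothetical; record parsing is left out because the feature spec is unknown.

```python
import tensorflow as tf


def build_dataset(file_pattern, num_parallel_reads=32, batch_size=64):
    """Hypothetical helper: read many TFRecord files concurrently."""
    # List the matching files; tf.io.gfile.glob accepts local paths
    # as well as gs:// URIs.
    filenames = tf.io.gfile.glob(file_pattern)
    # With num_parallel_reads > 1, records are read from that many
    # files in parallel rather than one file at a time.
    dataset = tf.data.TFRecordDataset(
        filenames, num_parallel_reads=num_parallel_reads)
    # Parsing (e.g. dataset.map with a feature spec) would go here;
    # it is omitted because the original feature spec is not shown.
    dataset = dataset.batch(batch_size)
    # Keep one batch ready so reading overlaps with the training step.
    return dataset.prefetch(1)
```

It could then be called as build_dataset("gs://my-bucket/train/*.tfrecords", num_parallel_reads=32), where the bucket path is made up. Note that with parallel reads the records from different files come back interleaved, so the order differs from reading the files sequentially.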

Reading a large dataset in TensorFlow with Google Cloud ML is slow
