I am running into a problem with this example: TensorFlowOnSpark on a Spark Standalone cluster (Single Host).
After I run mnist_data_setup.py, the MNIST zip files are extracted correctly, but the call to extract_images(filename) fails. See the error below:
Extracting <open file 'FILE_PATH_IN_MT_PC/train-images-idx3-ubyte.gz', mode 'rb' at 0x7ff3423e5c00>
Traceback (most recent call last):
File "FILE_PATH_IN_MT_PC/mnist/mnist_data_setup.py", line 144, in <module>
writeMNIST(sc, "FILE_PATH_IN_MT_PC/train-images-idx3-ubyte.gz", "FILE_PATH_IN_MT_PC/train-labels-idx1-ubyte.gz", args.output + "/train", args.format, args.num_partitions)
File "/FILE_PATH_IN_MT_PC/mnist/mnist_data_setup.py", line 52, in writeMNIST
images = numpy.array(mnist.extract_images(f))
File "FILE_PATH_IN_MT_PC/tensorflow/contrib/learn/python/learn/datasets/mnist.py", line 42, in extract_images
with tf.gfile.Open(filename, 'rb') as f, gzip.GzipFile(fileobj=f) as bytestream:
File "FILE_PATH_IN_MT_PC/tensorflow/python/platform/gfile.py", line 452, in Open
return GFile(name, mode=mode)
File "FILE_PATH_IN_MT_PC/tensorflow/python/platform/gfile.py", line 215, in __init__
super(GFile, self).__init__(name, mode, _Pythonlocker())
File "FILE_PATH_IN_MT_PC/tensorflow/python/platform/gfile.py", line 63, in __init__
self._fp = open(name, mode)
TypeError: coercing to Unicode: need string or buffer, file found
I would be glad if someone could help me find a solution. Thanks in advance.
Answer 0 (score: 1)
I think that in open you are passing a file object for the name argument instead of a string.
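A minimal sketch of that mismatch, using only the built-in open and a throwaway temporary file (the file and variable names here are illustrative and not taken from mnist_data_setup.py). On Python 2 the message is the one in the traceback above; on Python 3 the wording differs, but it is still a TypeError:

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "rb") as f:
    try:
        # Wrong: f is an open file object, but open() expects a path.
        open(f, "rb")
        error_message = None
    except TypeError as exc:
        error_message = str(exc)

os.remove(path)
print(error_message)
```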
I did some more digging:
In images = numpy.array(mnist.extract_images(f)), f is a file object.
But the line with tf.gfile.Open(filename, 'rb') as f, gzip.GzipFile(fileobj=f) as bytestream: inside extract_images treats the argument passed in by that call as a file name.
This behavior does not appear in the latest version:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/datasets/mnist.py
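Until the caller and the installed library agree on the argument type, one possible workaround is a small helper that accepts either a path or an already-open file. The sketch below is an assumption for illustration, not the TensorFlow implementation: open_gzip and read_magic are hypothetical names, and the sample file is a stand-in that contains only the 4-byte MNIST image magic number (2051).

```python
import gzip
import os
import struct
import tempfile


def open_gzip(source):
    """Return a gzip reader for a path or an already-open binary file."""
    if isinstance(source, (str, bytes, os.PathLike)):
        return gzip.open(source, "rb")
    # Anything else is assumed to be a binary file object.
    return gzip.GzipFile(fileobj=source)


def read_magic(source):
    """Read the big-endian 32-bit magic number at the start of an IDX file."""
    with open_gzip(source) as bytestream:
        return struct.unpack(">I", bytestream.read(4))[0]


with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for train-images-idx3-ubyte.gz: only the magic number.
    path = os.path.join(tmp_dir, "sample-images-idx3-ubyte.gz")
    with gzip.open(path, "wb") as out:
        out.write(struct.pack(">I", 2051))

    magic_from_path = read_magic(path)      # caller passes a file name
    with open(path, "rb") as f:
        magic_from_file = read_magic(f)     # caller passes a file object

print(magic_from_path, magic_from_file)
```

In the question's setup, the simpler options are probably to pass whatever type the installed extract_images expects, or to upgrade TensorFlow so that it matches the example script.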