I am trying to run the code from the Keras blog post. The code writes to a .npy file as follows:
bottleneck_features_train = model.predict_generator(generator, nb_train_samples // batch_size)
np.save(open('bottleneck_features_train.npy', 'w'),bottleneck_features_train)
It then reads from this file:
def train_top_model():
    train_data = np.load(open('bottleneck_features_train.npy'))
Now I get an error saying:
Found 2000 images belonging to 2 classes.
Traceback (most recent call last):
File "kerasbottleneck.py", line 103, in <module>
save_bottlebeck_features()
File "kerasbottleneck.py", line 69, in save_bottlebeck_features
np.save(open('bottleneck_features_train.npy', 'w'),bottleneck_features_train)
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 511, in save
pickle_kwargs=pickle_kwargs)
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 565, in write_array
version)
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 335, in _write_array_header
fp.write(header_prefix)
TypeError: write() argument must be str, not bytes
After this, I tried changing the file mode from 'w' to 'wb'. This resulted in an error while reading the file:
Found 2000 images belonging to 2 classes.
Found 800 images belonging to 2 classes.
Traceback (most recent call last):
File "kerasbottleneck.py", line 104, in <module>
train_top_model()
File "kerasbottleneck.py", line 82, in train_top_model
train_data = np.load(open('bottleneck_features_train.npy'))
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 404, in load
magic = fid.read(N)
File "/opt/anaconda3/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte
How can I fix this error?
Answer 0 (score: 5)
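The code in the blog post targets Python 2, where writing to and reading from a file works with bytestrings. In Python 3, you need to open the file in binary mode, both for writing and for reading it back: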
np.save(
    open('bottleneck_features_train.npy', 'wb'),
    bottleneck_features_train)
And when reading:
train_data = np.load(open('bottleneck_features_train.npy', 'rb'))
Note the b character in the mode arguments.
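The two tracebacks in the question can be reproduced without Keras or NumPy. The following is a minimal sketch, assuming Python 3 and a hypothetical scratch file named demo.bin; it writes only the magic bytes that every .npy file starts with, not a full array.

```python
import os
import tempfile

# Every .npy file starts with this magic string; 0x93 is the byte
# that the UnicodeDecodeError in the question complains about.
payload = b"\x93NUMPY"

# Hypothetical scratch location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

# Text mode ('w') only accepts str, so writing bytes raises TypeError.
try:
    with open(path, "w") as f:
        f.write(payload)
    text_write_error = None
except TypeError as exc:
    text_write_error = exc

# Binary mode ('wb') writes the bytes unchanged.
with open(path, "wb") as f:
    f.write(payload)

# Text mode ('r') tries to decode the bytes as UTF-8 and fails on 0x93.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    text_read_error = None
except UnicodeDecodeError as exc:
    text_read_error = exc

# Binary mode ('rb') returns exactly what was written.
with open(path, "rb") as f:
    round_tripped = f.read()
```

The explicit encoding="utf-8" is there only to make the failure deterministic; the traceback in the question shows the same codec being picked by default on that system.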
I'd use the file as a context manager, to ensure it is cleanly closed again:
with open('bottleneck_features_train.npy', 'wb') as features_train_file:
    np.save(features_train_file, bottleneck_features_train)
and
with open('bottleneck_features_train.npy', 'rb') as features_train_file:
    train_data = np.load(features_train_file)
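The code in the blog post should really use both of these changes anyway, because in Python 2, without the b flag in the mode, text files have platform-specific newline conventions translated, and on Windows certain characters in the stream have a special meaning (including making the file appear shorter than it really is if an EOF character appears). With binary data that could be a real problem.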
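As a quick check of the round trip described above, here is a minimal sketch assuming NumPy is installed. The array and the file name bottleneck_features_demo.npy are stand-ins for illustration, not the real bottleneck features. It also shows that np.save and np.load accept a path directly, in which case NumPy opens the file itself and no mode argument is needed.

```python
import os
import tempfile

import numpy as np

# Stand-in for the real bottleneck features (hypothetical data).
features = np.arange(12, dtype=np.float32).reshape(3, 4)

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "bottleneck_features_demo.npy")

# Option 1: open the file yourself, in binary mode both ways.
with open(path, "wb") as features_file:
    np.save(features_file, features)

with open(path, "rb") as features_file:
    loaded = np.load(features_file)

# Option 2: pass the path and let NumPy open the file itself.
np.save(path, features)
loaded_from_path = np.load(path)
```

Either option sidesteps the text-mode problem; passing the path is shorter, while the explicit open() calls stay closest to the blog post's code.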