Loading a MATLAB file with Python's scipy (from a Google Cloud Storage bucket)

Time: 2017-12-26 14:05:04

Tags: python matlab google-cloud-platform google-cloud-storage google-cloud-ml

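I am trying to run a Keras multilayer perceptron model on Google Cloud ML Engine (following the format suggested in tutorials such as https://github.com/clintonreece/keras-cloud-ml-engine and http://liufuyang.github.io/2017/04/02/just-another-tensorflow-beginner-guide-4.html). My dataset is in the form of .mat files (as far as I know, they are not in the 7.3 format, so HDF5 is not needed).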

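The training set files are located in a folder named "data" in a Google Cloud Storage bucket named project_1; I also have them stored locally. I modified my model for cloud use so that it loads the .mat files as follows: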

    def train_model(train_file='data', job_dir='./tmp/mlp2', **args):
        with file_io.FileIO(train_file + '/train_subject01.mat', mode='r') as a:
            train_data = scipy.io.loadmat(a)

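When I run the model locally with the gcloud command (using --train-file ./data), it runs without problems. However, when I try to deploy it to run in the cloud using

    $ export BUCKET_NAME=project_1
    ....
    > --train-file gs://$BUCKET_NAME/data

as seems to be common practice, I get the following error message: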

    The replica master 0 exited with a non-zero status of 1. Termination reason: Error.
    Traceback (most recent call last):
      File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/root/.local/lib/python2.7/site-packages/trainer/mlp2.py", line 195, in <module>
        train_model(**arguments)
      File "/root/.local/lib/python2.7/site-packages/trainer/mlp2.py", line 39, in train_model
        train_data = scipy.io.loadmat(a)
      File "/usr/local/lib/python2.7/dist-packages/scipy/io/matlab/mio.py", line 135, in loadmat
        matfile_dict = MR.get_variables(variable_names)
      File "/usr/local/lib/python2.7/dist-packages/scipy/io/matlab/mio5.py", line 272, in get_variables
        hdr, next_position = self.read_var_header()
      File "/usr/local/lib/python2.7/dist-packages/scipy/io/matlab/mio5.py", line 232, in read_var_header
        header = self._matrix_reader.read_header(check_stream_limit)
      File "scipy/io/matlab/mio5_utils.pyx", line 558, in scipy.io.matlab.mio5_utils.VarReader5.read_header (scipy/io/matlab/mio5_utils.c:5684)
      File "scipy/io/matlab/mio5_utils.pyx", line 610, in scipy.io.matlab.mio5_utils.VarReader5.read_header (scipy/io/matlab/mio5_utils.c:5609)
      File "scipy/io/matlab/mio5_utils.pyx", line 481, in scipy.io.matlab.mio5_utils.VarReader5.read_int8_string (scipy/io/matlab/mio5_utils.c:4635)
      File "scipy/io/matlab/mio5_utils.pyx", line 362, in scipy.io.matlab.mio5_utils.VarReader5.read_element (scipy/io/matlab/mio5_utils.c:3994)
      File "scipy/io/matlab/streams.pyx", line 55, in scipy.io.matlab.streams.GenericStream.seek (scipy/io/matlab/streams.c:1401)
    TypeError: seek() takes exactly 2 arguments (3 given)

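I have no idea what this seek() error means! Am I loading the file the right way, and if so, why does the problem appear only when running in the cloud? Is there another way to load the file?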
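
The last frame of the traceback suggests that scipy calls seek() with both an offset and a whence argument, while the stream object it was handed accepts only a position. One possible workaround, offered as a sketch and not as a confirmed fix, is to read the whole file into an io.BytesIO buffer, which supports the full seek() signature, and pass that buffer to scipy.io.loadmat. The helper names read_fully and load_mat and the open_stream parameter are hypothetical and not part of any library; the commented usage assumes that file_io.FileIO accepts mode='rb' in the installed TensorFlow version and that the file is small enough to fit in memory.

```python
import io


def read_fully(open_stream, path):
    """Read everything at `path` into a seekable in-memory buffer.

    `open_stream` is a hypothetical parameter: any callable that returns a
    binary file-like object supporting read() and the `with` statement.
    """
    with open_stream(path) as stream:
        return io.BytesIO(stream.read())


def load_mat(open_stream, path):
    """Load a .mat file via an in-memory buffer instead of the raw stream."""
    import scipy.io  # imported here so read_fully() is usable without SciPy

    return scipy.io.loadmat(read_fully(open_stream, path))


# Assumed cloud usage (not verified against a real bucket):
#   from tensorflow.python.lib.io import file_io
#   train_data = load_mat(
#       lambda p: file_io.FileIO(p, mode='rb'),
#       train_file + '/train_subject01.mat')
```

Passing the opener as an argument keeps the same helper usable locally, for example with `lambda p: open(p, 'rb')`, so the local and cloud code paths stay identical apart from how the file is opened.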

0 answers:

No answers