I can successfully read a CSV file stored on S3 with pandas, like this:
pdf=pd.read_csv('s3a:///bucket/input_file.csv')
However, when I try to do the same thing in dask:
ddf=dd.read_csv('s3a:///bucket/input_file.csv')
I get the following error:
<ipython-input-29-07e2d394bb38> in <module>()
----> 1 ddf=dd.read_csv('s3a:///bucket/input_file.csv')
/usr/local/anaconda2/lib/python2.7/site-packages/dask/dataframe/io.pyc in read_csv(fn, **kwargs)
220 return concat([read_csv(f, **kwargs) for f in sorted(glob(fn))])
221
--> 222 token = tokenize(os.path.getmtime(fn), kwargs)
223 name = 'read-csv-%s-%s' % (fn, token)
224 bom = get_bom(fn, kwargs.get('compression', None))
...
OSError: [Errno 2] No such file or directory: 's3a:///bucket/input_file.csv'
I also tried:
ddf=dd.read_csv('s3://bucket/input_file.csv')
and got the same error.
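Judging from the traceback, the failure seems to come from `os.path.getmtime(fn)` being called on the S3 URL as if it were a local path, before any data is read. Below is a minimal sketch that reproduces just that step without dask. The helper name `probe_local_mtime` is my own (hypothetical), and I am assuming no local file happens to exist at that relative path:

```python
import errno
import os


def probe_local_mtime(path):
    """Return the mtime if `path` exists on the local filesystem,
    otherwise return the OSError that os.path.getmtime raised."""
    try:
        return os.path.getmtime(path)
    except OSError as exc:
        return exc


# Same URL as in the failing dask call. os.path.getmtime knows nothing
# about S3, so the string is treated as a local path.
result = probe_local_mtime('s3a:///bucket/input_file.csv')

if isinstance(result, OSError):
    # errno.errorcode maps the numeric code to its symbolic name.
    print('getmtime failed: errno=%s (%s)'
          % (result.errno, errno.errorcode.get(result.errno)))
else:
    print('unexpectedly found a local file, mtime=%s' % result)
```

If this fails the same way on its own, it would suggest that the dask version I have installed only handles local files in `read_csv`, whichever URL scheme I pass.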