IOError: [Errno 22] when loading a Parquet file

Time: 2018-04-23 20:39:22

Tags: python-2.7 parquet pyarrow

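I have Parquet data like the sample shown below, and I am trying to load it into a DataFrame with the code below. The engine I am using is pyarrow. The same code works fine for other files, but when I try to load this particular file I get the following error. I am new to Parquet. Does anyone see what the problem is?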

Code:

pd.read_parquet('/tmp/dt=20/09_0')

Error:

ArrowIOError                              Traceback (most recent call last)
<ipython-input-20-23dfd4ca529a> in <module>()
----> 1 view_df=pd.read_parquet('/data_tmp/view_coremetrics/dt=20180402/000119_0')
      2 # view_df=pd.read_parquet('/data_tmp/000031_0')
      3 print view_df.shape
      4 view_df.head()

/data2/user1/anaconda2/lib/python2.7/site-packages/pandas/io/parquet.pyc in read_parquet(path, engine, columns, **kwargs)
    255 
    256     impl = get_engine(engine)
--> 257     return impl.read(path, columns=columns, **kwargs)

/data2/user1/anaconda2/lib/python2.7/site-packages/pandas/io/parquet.pyc in read(self, path, columns, **kwargs)
    128         kwargs['use_pandas_metadata'] = True
    129         return self.api.parquet.read_table(path, columns=columns,
--> 130                                            **kwargs).to_pandas()
    131 
    132     def _validate_write_lt_070(self, df):

/data2/user1/anaconda2/lib/python2.7/site-packages/pyarrow/parquet.pyc in read_table(source, columns, nthreads, metadata, use_pandas_metadata)
    937             return fs.read_parquet(source, columns=columns, metadata=metadata)
    938 
--> 939     pf = ParquetFile(source, metadata=metadata)
    940     return pf.read(columns=columns, nthreads=nthreads,
    941                    use_pandas_metadata=use_pandas_metadata)

/data2/user1/anaconda2/lib/python2.7/site-packages/pyarrow/parquet.pyc in __init__(self, source, metadata, common_metadata)
     62         self.reader = ParquetReader()
     63         source = _ensure_file(source)
---> 64         self.reader.open(source, metadata=metadata)
     65         self.common_metadata = common_metadata
     66         self._nested_paths_by_prefix = self._build_nested_paths()

_parquet.pyx in pyarrow._parquet.ParquetReader.open()

error.pxi in pyarrow.lib.check_status()

ArrowIOError: Arrow error: IOError: [Errno 22] Invalid argument


Data:

PAR1??x??xLҢ924217587908548913115647362798388396398451534680690245436174253535301832948328446784820483655304337818520249518423095384646626994297369124175421698306711617314169483532812118925257912118483068693626684028851435422618056045560553866002671256164797432939995779833592483738186675911756683298492596228339721443259180385356757426207851989658054881511280641692601503861637822470631692909600167537024514
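The sample above begins with `PAR1`, the 4-byte marker that opens a Parquet file; the format also expects a 4-byte footer length followed by the same marker at the very end. As a rough diagnostic only, and not a confirmed explanation of this error, the sketch below checks that outer layout before the file is handed to pyarrow. The function name `check_parquet_layout` and its return strings are hypothetical, introduced here for illustration.

```python
import os

PARQUET_MAGIC = b"PAR1"  # 4-byte marker expected at both ends of a Parquet file


def check_parquet_layout(path):
    """Return "ok" or a short description of a problem with the outer layout.

    Hypothetical helper: it only inspects the file size and the leading and
    trailing markers; it does not parse the footer metadata.
    """
    size = os.path.getsize(path)
    # Smallest possible layout: leading marker + 4-byte footer length + trailing marker.
    if size < 12:
        return "too small to be a Parquet file (%d bytes)" % size
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last 4 bytes
        tail = f.read(4)
    if head != PARQUET_MAGIC:
        return "missing leading PAR1 marker"
    if tail != PARQUET_MAGIC:
        return "missing trailing PAR1 marker (file may be truncated)"
    return "ok"
```

Running `check_parquet_layout('/tmp/dt=20/09_0')` on the failing file and on one of the files that loads correctly would show whether the two differ in size or in their end markers.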

0 answers:

No answers yet.