How to write a pandas DataFrame to a .arrow file

Time: 2020-11-01 07:40:59

Tags: pandas apache-arrow

How can I write a pandas DataFrame to disk in .arrow format? I'd like to be able to read the Arrow file into Arquero.

2 answers:

Answer 0 (score: 2)

Since Feather (version 2) is the Arrow IPC file format, you can probably just use write_feather. See http://arrow.apache.org/docs/python/feather.html

Answer 1 (score: 0)

You can do the following:

import pandas as pd
import pyarrow as pa

# Load the DataFrame to export (any DataFrame works here)
df = pd.read_parquet('your_file.parquet')

# Convert to an Arrow table, dropping the pandas index
table = pa.Table.from_pandas(df, preserve_index=False)

sink = "myfile.arrow"

# Note: new_file creates a RecordBatchFileWriter (Arrow IPC file format);
# the with-block closes the writer, which finalizes the file footer
with pa.ipc.new_file(sink, table.schema) as writer:
    writer.write(table)