For my workload, I need to serialize Pandas DataFrames (text + data) to disk, at about 5 GB per DataFrame. I have come across various solutions:
HDF5: issues with strings.
Feather: not stable.
CSV: OK, but large file size.
pickle: OK, cross-platform; can we do better?
gzip: same as CSV (slow for read access).
SFrame: good, but not maintained anymore.
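
A minimal sketch of how the CSV, gzip, and pickle options above could be compared on file size. The toy frame, the `compare_formats` helper, and the file names are assumptions made for illustration; they stand in for the real 5 GB data and are not measurements from it.

```python
import os
import tempfile

import pandas as pd


def compare_formats(df, directory):
    """Write df as CSV, gzip-compressed CSV, and pickle; return file sizes in bytes."""
    # Hypothetical file names, chosen only for this example.
    paths = {
        "csv": os.path.join(directory, "frame.csv"),
        "csv_gzip": os.path.join(directory, "frame.csv.gz"),
        "pickle": os.path.join(directory, "frame.pkl"),
    }
    df.to_csv(paths["csv"], index=False)
    df.to_csv(paths["csv_gzip"], index=False, compression="gzip")
    df.to_pickle(paths["pickle"])
    return {name: os.path.getsize(path) for name, path in paths.items()}


# Toy frame standing in for the real text + numeric data (assumption).
sample = pd.DataFrame({
    "text": ["some repeated text value"] * 10_000,
    "value": range(10_000),
})

with tempfile.TemporaryDirectory() as tmp:
    sizes = compare_formats(sample, tmp)
    # Round-trip through pickle to check that the strings survive unchanged.
    restored = pd.read_pickle(os.path.join(tmp, "frame.pkl"))

print(sizes)
```

Sizes on such a small, repetitive frame will not reflect real data, so the same function would need to be run on a representative slice before drawing conclusions.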
Just wondering whether there are any alternative solutions for storing string DataFrames on disk?