
I have an HDF5 file, test1.h5, that was generated by a pandas df.to_hdf("test1.h5", "t") call. The file is 27 MB and contains a single key, which holds a pandas DataFrame.

s1 = pd.HDFStore("test1.h5")

<class 'pandas.io.pytables.HDFStore'>
File path: test1.h5
/t            frame        (shape->[999,2161])

Most of the DataFrame's columns are float32, with a few int and string columns:

In [21]: s1['/t'].dtypes.value_counts()
Out[21]: 
float32    2156
object        3
float64       1
int64         1
dtype: int64

What confuses me is that if I save the DataFrame again to another HDF5 file, test2.h5, the new file is only 9.7 MB:

s1['/t'].to_hdf("test2.h5","t")
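For reference, here is a minimal sketch of how the two sizes could be compared in one go. The helper names file_size_mb and roundtrip_sizes are hypothetical, and mode="w" is my own choice so that the destination file is created fresh; it assumes pandas and PyTables are installed.

```python
import os


def file_size_mb(path):
    """Return the on-disk size of `path` in megabytes (1 MB = 1024 * 1024 bytes)."""
    return os.path.getsize(path) / (1024 * 1024)


def roundtrip_sizes(src="test1.h5", dst="test2.h5", key="t"):
    """Read `key` from `src`, write it to `dst`, and return both sizes in MB."""
    # Imported here so the size helper works without pandas;
    # HDF5 support in pandas also requires the PyTables package.
    import pandas as pd

    df = pd.read_hdf(src, key)
    # mode="w" (an assumption, not in the original call) truncates `dst`
    # so that nothing left over from an earlier write is counted.
    df.to_hdf(dst, key=key, mode="w")
    return file_size_mb(src), file_size_mb(dst)
```

Calling roundtrip_sizes() in the directory containing test1.h5 should return the two sizes as a tuple, which makes the difference easy to report.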

What causes the size difference between the file I read and the file I wrote? Thanks.