Question

I'm trying to do some simple filtering on a table stored in HDF5 format using Python pandas. When I query by the "subject" column alone, it works fine:

> df_test = pd.read_hdf(result_file, where=['subject==andrew'])
> print(df_test)

Which gives the output:

        subject condition        time  pupil_diam  luminance   gaze_x  gaze_y
 ...        ...       ...         ...         ...        ...      ...     ...
 180519  andrew     light  5885480250        2.50   0.768958  1723.85  267.11
 180520  andrew     light  5885482247        2.50   0.769088  1723.33  266.81
 180521  andrew     light  5885484249        2.51   0.769405  1718.93  267.91

It also works when I query by the "luminance" column alone:

> df_test = pd.read_hdf(params['result_file'], where=['luminance>0'])
> print(df_test)


        subject condition        time  pupil_diam  luminance  gaze_x  gaze_y
 79005     mary     light  3813968998        3.22   0.225418  257.11  761.28
 79006     mary     light  3813970992        3.22   0.227119  256.38  761.13
 79007     mary     light  3813972992        3.21   0.227119  256.13  760.53
 ...        ...       ...         ...         ...        ...      ...     ...

But combining the two with "&" gives an empty result (as you can see above, there are definitely rows where both conditions hold):

> df_test = pd.read_hdf(params['result_file'], where=['subject==andrew & luminance>0'])
> print(df_test)

Empty DataFrame
Columns: [subject, condition, time, pupil_diam, luminance, gaze_x, gaze_y]
Index: []

The same kind of combined query does work, however, when I use subject==mary:

> df_test = pd.read_hdf(params['result_file'], where=['subject==mary & luminance>0'])
> print(df_test)

       subject condition        time  pupil_diam  luminance  gaze_x  gaze_y
79005     mary     light  3813968998        3.22   0.225418  257.11  761.28
79006     mary     light  3813970992        3.22   0.227119  256.38  761.13
79007     mary     light  3813972992        3.21   0.227119  256.13  760.53
...        ...       ...         ...         ...        ...      ...     ...

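I'm new to pandas, so I may well be missing something in the syntax, but I haven't yet found a suitable solution or explanation in the documentation or in online forums...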
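
For comparison, the same two-condition filter can be applied in memory with a boolean mask, which sidesteps the combined `where` string entirely. This is only a workaround sketch, not an explanation of the empty result: the small DataFrame below is a hypothetical stand-in for the stored table (a few values copied from the outputs above plus one made-up row), and `filter_rows` is an illustrative helper name, not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for the table in the HDF5 file. The first three
# luminance values are copied from the outputs above; the last row is
# made up so that the luminance condition has something to exclude.
df = pd.DataFrame({
    "subject": ["andrew", "andrew", "mary", "mary"],
    "condition": ["light", "light", "light", "dark"],
    "luminance": [0.768958, 0.769088, 0.225418, 0.0],
})


def filter_rows(frame, subject, min_luminance=0):
    """Keep rows for one subject whose luminance exceeds min_luminance."""
    # Each comparison is parenthesized because & binds more tightly than
    # == and > in ordinary Python expressions.
    mask = (frame["subject"] == subject) & (frame["luminance"] > min_luminance)
    return frame[mask]


andrew_lit = filter_rows(df, "andrew")
print(andrew_lit)
```

In the setting of the question, `df` could instead be the frame returned by the single-condition call that already works, `pd.read_hdf(result_file, where=['subject==andrew'])`, so that only the luminance condition is left to apply in memory. The cost is that more rows are read from disk than a working combined `where` would need.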

Python Pandas read_hdf WHERE term not behaving as expected

0 answers: