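I am trying to do some simple filtering on a table stored in HDF5 format using Python pandas. When I query by the "subject" column alone, it works fine: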
> df_test = pd.read_hdf(result_file, where=['subject==andrew'])
> print(df_test)
which gives the output:
        subject  condition        time  pupil_diam  luminance   gaze_x  gaze_y
...         ...        ...         ...         ...        ...      ...     ...
180519   andrew      light  5885480250        2.50   0.768958  1723.85  267.11
180520   andrew      light  5885482247        2.50   0.769088  1723.33  266.81
180521   andrew      light  5885484249        2.51   0.769405  1718.93  267.91
It also works when I query by the "luminance" column alone:
> df_test = pd.read_hdf(params['result_file'], where=['luminance>0'])
> print(df_test)
       subject  condition        time  pupil_diam  luminance  gaze_x  gaze_y
79005     mary      light  3813968998        3.22   0.225418  257.11  761.28
79006     mary      light  3813970992        3.22   0.227119  256.38  761.13
79007     mary      light  3813972992        3.21   0.227119  256.13  760.53
...        ...        ...         ...         ...        ...     ...     ...
But combining the two with "&" gives an empty result (as you can see above, there are definitely rows where both conditions hold):
> df_test = pd.read_hdf(params['result_file'], where=['subject==andrew & luminance>0'])
> print(df_test)
Empty DataFrame
Columns: [subject, condition, time, pupil_diam, luminance, gaze_x, gaze_y]
Index: []
Yet the same combined query does work when I use a different subject:
> df_test = pd.read_hdf(params['result_file'], where=['subject==mary & luminance>0'])
> print(df_test)
       subject  condition        time  pupil_diam  luminance  gaze_x  gaze_y
79005     mary      light  3813968998        3.22   0.225418  257.11  761.28
79006     mary      light  3813970992        3.22   0.227119  256.38  761.13
79007     mary      light  3813972992        3.21   0.227119  256.13  760.53
...        ...        ...         ...         ...        ...     ...     ...
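For reference, this is the in-memory equivalent of what I expect the combined query to return. It is only a sketch: the small DataFrame is rebuilt by hand from the rows printed in the outputs here rather than read from my HDF5 file, and the names `df`, `mask` and `expected` are just for illustration.

```python
import pandas as pd

columns = ["subject", "condition", "time", "pupil_diam",
           "luminance", "gaze_x", "gaze_y"]

# Rows copied from the printed outputs; the index values are the
# original row labels. This is NOT the full table, just a sample.
data = [
    ("andrew", "light", 5885480250, 2.50, 0.768958, 1723.85, 267.11),
    ("andrew", "light", 5885482247, 2.50, 0.769088, 1723.33, 266.81),
    ("andrew", "light", 5885484249, 2.51, 0.769405, 1718.93, 267.91),
    ("mary", "light", 3813968998, 3.22, 0.225418, 257.11, 761.28),
    ("mary", "light", 3813970992, 3.22, 0.227119, 256.38, 761.13),
    ("mary", "light", 3813972992, 3.21, 0.227119, 256.13, 760.53),
]
index = [180519, 180520, 180521, 79005, 79006, 79007]
df = pd.DataFrame(data, index=index, columns=columns)

# Plain boolean masking: both conditions combined with "&".
mask = (df["subject"] == "andrew") & (df["luminance"] > 0)
expected = df[mask]
print(expected)
```

On this sample the mask selects the three "andrew" rows, which is what I would expect the `where` query on the HDF5 file to return as well.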
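I'm new to pandas, so I'm probably missing something in the syntax, but I haven't yet found a suitable solution or explanation in the documentation or on online forums...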