Here is what happens when I try to query my dataframe object:
df2.query('a==1')
.conda/envs/myenv2/lib/python2.7/site-packages/pandas/computation/align.pyc in _align_core(terms)
96 reindexer_size = len(reindexer)
97
---> 98 ordm = np.log10(abs(reindexer_size - term_axis_size))
99 if ordm >= 1 and reindexer_size >= 10000:
100 warnings.warn('Alignment difference on axis {0} is larger '
FloatingPointError: divide by zero encountered in log10
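For context, the failing line takes the base-10 logarithm of a size difference, and that difference is zero when the two sizes match. The sketch below reproduces only that arithmetic, outside pandas. NumPy normally just warns and returns -inf for log10(0); it raises FloatingPointError only when the floating-point error mode has been set to raise. Whether such a setting is active in my session is an assumption, and the helper name log10_of_size_difference is made up for illustration.

```python
import warnings

import numpy as np


def log10_of_size_difference(reindexer_size, term_axis_size):
    # Same expression as on line 98 of the traceback (hypothetical wrapper).
    return np.log10(abs(reindexer_size - term_axis_size))


# With divide='warn' (NumPy's default), equal sizes give a warning and -inf.
with np.errstate(divide='warn'):
    with warnings.catch_warnings():
        warnings.simplefilter('ignore')
        default_result = log10_of_size_difference(5, 5)

# With divide='raise', the same call raises FloatingPointError instead.
raised = False
message = ''
with np.errstate(divide='raise'):
    try:
        log10_of_size_difference(5, 5)
    except FloatingPointError as exc:
        raised = True
        message = str(exc)

print(default_result, raised, message)
```

If this is the cause, checking np.geterr() in the affected session would show 'raise' for the divide entry.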
This is what my dataframe looks like:
In [16]: df2.head()
Out[16]:
                     S         t        S         t     S-elas      t-elas
y                  0.9       0.9      1.0       1.0
T a   k   c
1 0.1 0.2 0.4      NaN       NaN  49.9547  0.935831        NaN         NaN
          0.5  48.4641   0.91747  51.6021  0.893594  -0.595826    0.250475
      0.3 0.4  43.3879  0.930355  44.7448  0.935127  -0.292537  -0.0486025
          0.5  43.1652  0.915278  45.5227  0.888028  -0.505053    0.287115
  1.0 0.2 0.4  7.82282  0.922999  7.95204   0.93587  -0.155631   -0.131551
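What is going on? Whether I query 'T == 1', 'a == 1', or anything else, it fails with this error. How can I select rows based on partial values of the MultiIndex? I am on pandas 0.17.0 and would rather not upgrade, because conda does not support 0.18 yet.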
I started from this dataframe:
a c k y T S t
0 5 0.4 0.2 0.9 1 1.98521 0.90772
1 0.1 0.4 0.2 1 1 49.9547 0.935831
2 5 0.4 0.2 1 1 2.00426 0.905521
3 5 0.4 0.2 1 5 0.54526 4.02065
4 0.1 0.4 0.2 1 5 16.4644 4.26893
Then I wanted the S and t values for each y side by side as columns:
df.set_index(['T', 'a', 'k', 'c', 'y'], inplace=True)
df = df.stack().unstack(4)
df2 = df.unstack().swaplevel(0, 1, axis=1)
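To make the reshape reproducible, here is a minimal sketch that runs the same three steps on just the five sample rows shown above. It assigns the result of set_index instead of using inplace=True, and the variable names (raw, indexed, wide) are only for illustration. The real frame has more rows and extra columns, so this shows the shape of the result rather than my actual data.

```python
import pandas as pd

# The five sample rows from the starting dataframe.
raw = pd.DataFrame({
    'a': [5, 0.1, 5, 5, 0.1],
    'c': [0.4, 0.4, 0.4, 0.4, 0.4],
    'k': [0.2, 0.2, 0.2, 0.2, 0.2],
    'y': [0.9, 1, 1, 1, 1],
    'T': [1, 1, 1, 5, 5],
    'S': [1.98521, 49.9547, 2.00426, 0.54526, 16.4644],
    't': [0.90772, 0.935831, 0.905521, 4.02065, 4.26893],
})

# Step 1: move the parameters into the index, leaving S and t as columns.
indexed = raw.set_index(['T', 'a', 'k', 'c', 'y'])

# Step 2: stack S/t into the index, then pivot level 4 (y) into columns.
wide = indexed.stack().unstack(4)

# Step 3: pivot S/t back into columns and put them above y.
df2 = wide.unstack().swaplevel(0, 1, axis=1)

print(df2)
```

Only one sample row has y = 0.9, so most of the 0.9 columns come out as NaN here.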
That is how df2 ended up looking as shown at the top (plus some extra columns). Alternatively, here is df2.reset_index():
In [36]: df2.reset_index()
Out[36]:
T a k c S t S t s-diff \
y 0.9 0.9 1.0 1.0
0 1 0.1 0.2 0.4 NaN NaN 49.9547 0.935831 NaN
1 1 0.1 0.2 0.5 48.4641 0.91747 51.6021 0.893594 -3.138
2 1 0.1 0.3 0.4 43.3879 0.930355 44.7448 0.935127 -1.35695
3 1 0.1 0.3 0.5 43.1652 0.915278 45.5227 0.888028 -2.35748
4 1 1.0 0.2 0.4 7.82282 0.922999 7.95204 0.93587 -0.129213
5 1 1.0 0.2 0.5 7.82921 0.910954 7.95381 0.908296 -0.124604
6 1 1.0 0.3 0.4 7.3131 0.923834 7.41049 0.929179 -0.0973891
7 1 1.0 0.3 0.5 7.31753 0.911348 7.45975 0.907871 -0.142221
8 1 5.0 0.2 0.4 1.98521 0.90772 2.00426 0.905521 -0.0190501
9 1 5.0 0.2 0.5 2.11855 0.9082 2.02325 0.891608 0.0953023
10 1 5.0 0.3 0.4 1.87945 0.908802 1.90995 0.937631 -0.0305015
11 1 5.0 0.3 0.5 1.88057 0.89712 1.91215 0.868879 -0.0315846
12 5 0.1 0.2 0.4 16.4244 3.52755 16.4644 4.26893 -0.0400153
13 5 0.1 0.3 0.4 NaN NaN 16.0271 3.61533 NaN
14 5 0.1 0.3 0.5 15.2237 3.80595 NaN NaN NaN
15 5 1.0 0.2 0.4 2.35366 3.80512 2.3596 3.79693 -0.0059349
16 5 1.0 0.2 0.5 2.34645 3.8176 2.51691 3.04467 -0.170462
17 5 1.0 0.3 0.4 2.19598 3.78134 2.21581 3.74906 -0.019823
18 5 1.0 0.3 0.5 2.34471 3.26898 2.34466 3.26885 4.97027e-05
19 5 5.0 0.2 0.4 0.541352 3.79937 0.54526 4.02065 -0.00390801
20 5 5.0 0.2 0.5 0.566614 3.87528 0.567599 3.8671 -0.000984482
21 5 5.0 0.3 0.4 0.520307 4.2113 0.51744 3.71401 0.00286746
22 5 5.0 0.3 0.5 0.554546 3.47341 0.55858 3.44815 -0.00403361
23 10 0.1 0.3 0.5 9.3711 6.70258 NaN NaN NaN
24 10 1.0 0.2 0.5 1.35114 6.84992 1.35207 6.84496 -0.000936299
t-diff S-elas t-elas
y
0 NaN NaN NaN
1 0.0238751 -0.595826 0.250475
2 -0.00477195 -0.292537 -0.0486025
3 0.0272504 -0.505053 0.287115
4 -0.0128704 -0.155631 -0.131551
5 0.00265812 -0.150001 0.027761
6 -0.00534445 -0.125675 -0.0547997
7 0.00347693 -0.182862 0.0363132
8 0.00219903 -0.0907268 0.0230424
9 0.0165924 0.437187 0.175161
10 -0.028829 -0.152934 -0.296654
11 0.0282413 -0.158226 0.303842
12 -0.741378 -0.0231171 -1.80674
13 NaN NaN NaN
14 NaN NaN NaN
15 0.00818596 -0.0239247 0.0204594
16 0.772931 -0.665956 2.14006
17 0.0322827 -0.0853706 0.0814527
18 0.000129179 0.000201381 0.000375417
19 -0.22128 -0.0683336 -0.537635
20 0.0081767 -0.0164918 0.0200658
21 0.497293 0.0525 1.1922
22 0.0252581 -0.0688499 0.0693348
23 NaN NaN NaN
24 0.0049623 -0.00658096 0.0068846
Call this dataframe test. Then test.set_index(['T', 'a', 'k', 'c']).query('a == 1') gives me the same error as above.
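For comparison, here is a sketch of two ways to pick rows by the value of one index level without going through query: a boolean mask built from get_level_values, and xs with drop_level=False. The frame is a cut-down stand-in with three rows taken from the table above and a single plain S column instead of the two-level columns, so whether either route avoids the error on my full frame under 0.17.0 is untested here.

```python
import pandas as pd

# Three (T, a, k, c) rows from the table above, with S for y = 1.0 only.
index = pd.MultiIndex.from_tuples(
    [(1, 0.1, 0.2, 0.4), (1, 1.0, 0.2, 0.4), (5, 1.0, 0.3, 0.5)],
    names=['T', 'a', 'k', 'c'],
)
demo = pd.DataFrame({'S': [49.9547, 7.95204, 2.34466]}, index=index)

# Option 1: boolean mask on the values of level 'a'.
by_mask = demo[demo.index.get_level_values('a') == 1.0]

# Option 2: cross-section on level 'a', keeping that level in the result.
by_xs = demo.xs(1.0, level='a', drop_level=False)

print(by_mask)
print(by_xs)
```

Both selections keep the full four-level index, so the result can be filtered further on T, k, or c in the same way.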