I have a DataFrame of the following form:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th></th>
<th>Panama</th></tr>
<tr>
<th>Contract</th>
<th>Date</th>
<th></th></tr></thead>
<tbody>
<tr>
<th rowspan="22" valign="top">201501</th>
<th>2014-04-29</th>
<td>1416.0</td></tr>
<tr>
<th>2014-04-30</th>
<td>1431.1</td></tr>
<tr>
<th>2014-05-01</th>
<td>1430.6</td></tr>
<tr>
<th>2014-05-02</th>
<td>1443.9</td></tr>
<tr>
<th>2014-05-05</th>
<td>1451.6</td></tr>
<tr>
<th>2014-05-06</th>
<td>1461.4</td></tr>
<tr>
<th>2014-05-07</th>
<td>1456.0</td></tr>
<tr>
<th>2014-05-08</th>
<td>1441.1</td></tr>
<tr>
<th>2014-05-09</th>
<td>1437.8</td></tr>
<tr>
<th>2014-05-12</th>
<td>1445.2</td></tr>
<tr>
<th>2014-05-13</th>
<td>1458.2</td></tr>
<tr>
<th>2014-05-14</th>
<td>1487.6</td></tr>
<tr>
<th>2014-05-15</th>
<td>1477.6</td></tr>
<tr>
<th>2014-05-16</th>
<td>1467.9</td></tr>
<tr>
<th>2014-05-19</th>
<td>1484.9</td></tr>
<tr>
<th>2014-05-20</th>
<td>1470.5</td></tr>
<tr>
<th>2014-05-21</th>
<td>1476.9</td></tr>
<tr>
<th>2014-05-22</th>
<td>1490.0</td></tr>
<tr>
<th>2014-05-23</th>
<td>1473.3</td></tr>
<tr>
<th>2014-05-27</th>
<td>1462.5</td></tr>
<tr>
<th>2014-05-28</th>
<td>1456.3</td></tr>
<tr>
<th>2014-05-29</th>
<td>1460.5</td></tr>
<tr>
<th rowspan="271" valign="top">201507</th>
<th>2014-05-30</th>
<td>1463.5</td></tr>
<tr>
<th>2014-06-02</th>
<td>1447.5</td></tr>
<tr>
<th>2014-06-03</th>
<td>1444.4</td></tr>
<tr>
<th>2014-06-04</th>
<td>1444.7</td></tr>
<tr>
<th>2014-06-05</th>
<td>1455.9</td></tr>
<tr>
<th>2014-06-06</th>
<td>1464.0</td></tr>
<tr>
<th>2014-06-09</th>
<td>1465.5</td></tr>
<tr>
<th>2014-06-10</th>
<td>1493.5</td></tr>
<tr>
<th>2014-06-11</th>
<td>1492.3</td></tr>
<tr>
<th>2014-06-12</th>
<td>1452.6</td></tr>
<tr>
<th>2014-06-13</th>
<td>1446.3</td></tr>
<tr>
<th>2014-06-16</th>
<td>1450.7</td></tr>
<tr>
<th>2014-06-17</th>
<td>1454.9</td></tr>
<tr>
<th>2014-06-18</th>
<td>1462.6</td></tr>
<tr>
<th>2014-06-19</th>
<td>1486.2</td></tr>
<tr>
<th>2014-06-20</th>
<td>1469.1</td></tr>
<tr>
<th>2014-06-23</th>
<td>1468.5</td></tr>
<tr>
<th>2014-06-24</th>
<td>1484.2</td></tr>
<tr>
<th>2014-06-25</th>
<td>1485.1</td></tr>
<tr>
<th>2014-06-26</th>
<td>1482.2</td></tr>
<tr>
<th>2014-06-27</th>
<td>1491.5</td></tr>
</tbody>
</table>
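When I evaluate
p<1
I get False for every value, but when I take a subset with
p[p<1]
it still returns rows, with NaN as the column entries. What logic does pandas follow here?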
Answer 0 (score: 0)
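If you use boolean indexing with the boolean DataFrame
p < 1
the values where the mask is True are kept and every other entry becomes NaN: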
print (p)
Panama
Contract Date
201501 2014-04-29 -1416.0
2014-04-30 1431.1
2014-05-01 1430.6
2014-05-22 1490.0
2014-05-23 1473.3
2014-05-27 1462.5
2014-05-28 1456.3
201507 2014-06-23 1468.5
2014-06-24 1484.2
2014-06-25 1485.1
2014-06-26 1482.2
2014-06-27 -1491.5
print (p[p<1])
Panama
Contract Date
201501 2014-04-29 -1416.0
2014-04-30 NaN
2014-05-01 NaN
2014-05-22 NaN
2014-05-23 NaN
2014-05-27 NaN
2014-05-28 NaN
201507 2014-06-23 NaN
2014-06-24 NaN
2014-06-25 NaN
2014-06-26 NaN
2014-06-27 -1491.5
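The masking behaviour shown above can be reproduced on a small frame. The following is a minimal sketch, assuming a hypothetical four-row subset of the sample data (the integer Contract labels and string dates are assumptions made for illustration). It suggests that indexing with a boolean DataFrame keeps the original shape and gives the same result as `DataFrame.where`.

```python
import pandas as pd

# Hypothetical reduced sample: a (Contract, Date) MultiIndex and one column.
index = pd.MultiIndex.from_tuples(
    [
        (201501, "2014-04-29"),
        (201501, "2014-04-30"),
        (201507, "2014-06-26"),
        (201507, "2014-06-27"),
    ],
    names=["Contract", "Date"],
)
p = pd.DataFrame({"Panama": [-1416.0, 1431.1, 1482.2, -1491.5]}, index=index)

# Indexing with a boolean DataFrame masks element-wise: the shape is kept
# and cells where the mask is False are replaced by NaN.
masked = p[p < 1]

# The explicit where() call produces the same frame.
same_as_where = masked.equals(p.where(p < 1))

print(masked)
print(same_as_where)
```

Because the mask has the same two-dimensional shape as `p`, pandas has no way to drop individual cells, so it fills them with NaN instead of removing rows.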
If you need to remove the NaN rows, index with a boolean Series instead:
print (p[p.Panama < 1])
Panama
Contract Date
201501 2014-04-29 -1416.0
201507 2014-06-27 -1491.5
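As a rough illustration of the row filter above, here is a sketch on a hypothetical four-row subset of the data (index labels are assumptions). It compares filtering with a boolean Series against masking element-wise and then calling `dropna()`.

```python
import pandas as pd

# Hypothetical reduced sample with the same layout as the frame above.
index = pd.MultiIndex.from_tuples(
    [
        (201501, "2014-04-29"),
        (201501, "2014-04-30"),
        (201507, "2014-06-26"),
        (201507, "2014-06-27"),
    ],
    names=["Contract", "Date"],
)
p = pd.DataFrame({"Panama": [-1416.0, 1431.1, 1482.2, -1491.5]}, index=index)

# A boolean Series selects whole rows, so no NaN placeholders appear.
rows = p[p["Panama"] < 1]

# Alternative: mask element-wise first, then drop the rows left as NaN.
rows_via_dropna = p[p < 1].dropna()

print(rows)
print(rows.equals(rows_via_dropna))
```

With a single column the two approaches agree. With several columns they can differ, because `dropna()` by default removes any row that contains at least one NaN.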
The difference between the two masks is:
#boolean DataFrame
print (p < 1)
Panama
Contract Date
201501 2014-04-29 True
2014-04-30 False
2014-05-01 False
2014-05-22 False
2014-05-23 False
2014-05-27 False
2014-05-28 False
201507 2014-06-23 False
2014-06-24 False
2014-06-25 False
2014-06-26 False
2014-06-27 True
#boolean Series
print (p.Panama < 1)
Contract Date
201501 2014-04-29 True
2014-04-30 False
2014-05-01 False
2014-05-22 False
2014-05-23 False
2014-05-27 False
2014-05-28 False
201507 2014-06-23 False
2014-06-24 False
2014-06-25 False
2014-06-26 False
2014-06-27 True
Name: Panama, dtype: bool
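The two masks printed above differ in dimensionality, which is what drives the different results. A small sketch, again on a hypothetical four-row subset, shows the types involved and one possible way to collapse a DataFrame mask into a row mask with `any(axis=1)`.

```python
import pandas as pd

# Hypothetical reduced sample with the same layout as the frame above.
index = pd.MultiIndex.from_tuples(
    [
        (201501, "2014-04-29"),
        (201501, "2014-04-30"),
        (201507, "2014-06-26"),
        (201507, "2014-06-27"),
    ],
    names=["Contract", "Date"],
)
p = pd.DataFrame({"Panama": [-1416.0, 1431.1, 1482.2, -1491.5]}, index=index)

frame_mask = p < 1             # boolean DataFrame, masks individual cells
series_mask = p["Panama"] < 1  # boolean Series, selects whole rows

print(type(frame_mask).__name__, frame_mask.ndim)
print(type(series_mask).__name__, series_mask.ndim)

# Collapse the 2-D mask to one boolean per row: True if any cell matches.
row_mask = frame_mask.any(axis=1)

# Filtering with the collapsed mask selects the same rows as the Series mask.
print(p[row_mask].equals(p[series_mask]))
```

Using `any(axis=1)` keeps a row when at least one column satisfies the condition; `all(axis=1)` would be the stricter choice for frames with several columns.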