I have a pandas.DataFrame that contains numeric, date, and text values. It looks like this:
Strike StrikeCell Expiration ExpirationCell CellContents
0 60.0 \n <div class="cell row-header strike itm" ... 2016-07-15 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="60.0" m...
1 60.0 \n <div class="cell row-header strike itm" ... 2017-01-20 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="60.0" m...
2 60.0 \n <div class="cell row-header strike itm" ... 2018-01-19 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="60.0"
13 70.0 \n <div class="cell row-header strike itm" ... 2017-01-20 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="70.0" m...
15 70.0 \n <div class="cell row-header strike itm" ... 2018-01-19 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="70.0" m...
17 70.0 \n <div class="cell row-header strike itm" ... 2016-10-21 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="70.0" m...
...
562 260.0 \n <div class="cell row-header strike otm" ... 2017-01-20 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="260.0" ...
564 270.0 \n <div class="cell row-header strike otm" ... 2017-01-20 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="270.0" ...
565 280.0 \n <div class="cell row-header strike otm" ... 2017-01-20 \n <div class="cell col-header expiration">... \n <div class="cell option" strike="280.0" ...
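My goal is to have StrikeCell down the first column (in ascending order), ExpirationCell across the columns (in ascending order), and CellContents as the values inside the table. Basically, I am building a large pivot table whose contents are HTML-formatted.
The following works: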
df.pivot(index='Strike', columns='Expiration', values='CellContents')
Strike is sorted correctly, and Expiration is sorted correctly.
However, if I try to use the string columns StrikeCell and ExpirationCell instead, like this:
df.pivot(index='StrikeCell', columns='ExpirationCell', values='CellContents')
the sort order is lost.
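A minimal sketch of why this happens, assuming pivot orders its row and column labels by their own values, so HTML strings end up in lexicographic order rather than numeric or chronological order. The toy frame and its shortened HTML strings are hypothetical stand-ins for the real cell markup, which is elided in the sample above.

```python
import pandas as pd

# Hypothetical toy data; the short <div> strings stand in for the real markup.
df = pd.DataFrame({
    "Strike": [60.0, 60.0, 260.0, 260.0],
    "StrikeCell": ["<div>60.0</div>", "<div>60.0</div>",
                   "<div>260.0</div>", "<div>260.0</div>"],
    "Expiration": pd.to_datetime(["2016-07-15", "2017-01-20",
                                  "2016-07-15", "2017-01-20"]),
    "ExpirationCell": ["<div>Jul 15, 2016</div>", "<div>Jan 20, 2017</div>",
                       "<div>Jul 15, 2016</div>", "<div>Jan 20, 2017</div>"],
    "CellContents": ["a", "b", "c", "d"],
})

# Labels are numbers and timestamps, so they sort the way you expect.
by_value = df.pivot(index="Strike", columns="Expiration", values="CellContents")

# Labels are strings, so "<div>260.0..." sorts before "<div>60.0..."
# and "Jan 20, 2017" sorts before "Jul 15, 2016".
by_html = df.pivot(index="StrikeCell", columns="ExpirationCell", values="CellContents")

print(list(by_value.index))
print(list(by_html.index))
print(list(by_html.columns))
```

The string labels carry no information about the underlying strike or date, so the intended order has to be supplied from the Strike and Expiration columns.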
So the question is: how do I get the ascending order by Strike and Expiration back when using StrikeCell as index and ExpirationCell as columns?
I am using pandas 0.18.1.
Answer 0 (score: 1)
I believe this will work for you.
First, let's work out the desired order of ExpirationCell and StrikeCell.
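# Sort by the underlying value, then keep one HTML label per strike/expiration;
# repeated labels would otherwise yield repeated rows and columns after reindex.
StrikeCell_ordered = df[['Strike', 'StrikeCell']].sort_values(by='Strike')['StrikeCell'].drop_duplicates()
ExpirationCell_ordered = df[['Expiration', 'ExpirationCell']].sort_values(by='Expiration')['ExpirationCell'].drop_duplicates()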
Then pivot and apply reindex:
pivoted_df = df.pivot(index='StrikeCell', columns='ExpirationCell', values='CellContents')
result = pivoted_df.reindex(index=StrikeCell_ordered, columns=ExpirationCell_ordered)
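To try the approach end to end, here is a small self-contained sketch. It assumes each Strike maps to exactly one StrikeCell string and each Expiration to exactly one ExpirationCell string. The toy data and its shortened HTML strings are hypothetical, and the labels are de-duplicated before being passed to reindex.

```python
import pandas as pd

# Hypothetical toy data, deliberately out of order.
df = pd.DataFrame({
    "Strike": [260.0, 60.0, 260.0, 60.0],
    "StrikeCell": ["<div>260.0</div>", "<div>60.0</div>",
                   "<div>260.0</div>", "<div>60.0</div>"],
    "Expiration": pd.to_datetime(["2017-01-20", "2016-07-15",
                                  "2016-07-15", "2017-01-20"]),
    "ExpirationCell": ["<div>Jan 20, 2017</div>", "<div>Jul 15, 2016</div>",
                       "<div>Jul 15, 2016</div>", "<div>Jan 20, 2017</div>"],
    "CellContents": ["d", "a", "c", "b"],
})

# One HTML label per distinct value, ordered by the underlying value.
strike_order = df.sort_values("Strike")["StrikeCell"].unique()
expiration_order = df.sort_values("Expiration")["ExpirationCell"].unique()

# Pivot on the HTML labels, then put rows and columns back in value order.
pivoted = df.pivot(index="StrikeCell", columns="ExpirationCell", values="CellContents")
result = pivoted.reindex(index=strike_order, columns=expiration_order)

print(result)
```

Because reindex only rearranges existing labels, the cell contents are untouched; only the row and column order changes.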