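I am running an event study on a set of stocks. It produces a pandas DataFrame whose columns are ticker symbols (SPY, GOOG, AAPL, etc.) and whose index is timestamps. Each cell is either NaN or 1. I want to generate an orders DataFrame from this events DataFrame. Since I want to create one order for every cell that equals 1, I thought applymap would be appropriate. However, applymap seems to strip the index and column information from each cell. I tried the code below: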
def appendOrder(orders, value):
    if value == 1:
        index = ["Year", "Month", "Day", "Stock", "OrderType", "Amount"]
        s = pd.Series(index=index)
        # Fails here: applymap passes a bare scalar, which has no .index
        s["Stock"] = value.index

def createOrders(events):
    columns = ["Year", "Month", "Day", "Stock", "OrderType", "Amount"]
    orders = pd.DataFrame(columns=columns)
    events.applymap(lambda x: appendOrder(orders, x))
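The code above breaks inside appendOrder because value has no index.
Is there a way to retain the index and column information when using applymap on a DataFrame?
Edit
Here is a snippet of the events DataFrame: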
                    SPY  GOOG  AAPL  XOM
2013-10-1-16:00:00  NaN     1   NaN    1
2013-10-2-16:00:00  NaN   NaN   NaN  NaN
2013-10-3-16:00:00  NaN   NaN   NaN  NaN
2013-10-4-16:00:00    1   NaN   NaN  NaN
2013-10-5-16:00:00  NaN   NaN   NaN  NaN
2013-10-6-16:00:00    1   NaN     1  NaN
2013-10-7-16:00:00  NaN   NaN   NaN  NaN
2013-10-8-16:00:00  NaN     1   NaN  NaN
I want to convert the events DataFrame above into the orders DataFrame below:
   Year  Month  Day  Stock  OrderType  Amount
0  2013     10    1   GOOG        Buy     100
1  2013     10    1    XOM        Buy     100
2  2013     10    4    SPY        Buy     100
3  2013     10    6    SPY        Buy     100
4  2013     10    6   AAPL        Buy     100
5  2013     10    8   GOOG        Buy     100
I hope this makes it clearer.
Answer 0 (score: 0)
The pandas operation underlying this is called stack:
df.stack()
Out[25]:
2013-10-1-16:00:00  GOOG    1
                    XOM     1
2013-10-4-16:00:00  SPY     1
2013-10-6-16:00:00  SPY     1
                    AAPL    1
2013-10-8-16:00:00  GOOG    1
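From the stacked frame above, massaging the data into the shape you want is straightforward. You can reset the index, split the timestamp into year, month, and day columns, and then work with the non-NaN values, which now sit in a single column.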
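The steps above could be sketched roughly as follows. This is a minimal illustration, not the answerer's code. It assumes the events index is a DatetimeIndex (or something `pd.DatetimeIndex` can parse), and the function name `create_orders`, its `order_type` and `amount` parameters, and the sample data are hypothetical. Whether `stack()` drops NaN entries by default depends on the pandas version, so the sketch filters for cells equal to 1 explicitly rather than relying on that behavior.

```python
import numpy as np
import pandas as pd


def create_orders(events, order_type="Buy", amount=100):
    """Build one order row per event cell equal to 1 (hypothetical helper)."""
    # stack() moves the column labels into a second index level, giving a
    # Series indexed by (timestamp, ticker).
    stacked = events.stack()
    # Keep only the event cells. NaN == 1 is False, so this also removes
    # NaN entries if this pandas version's stack() kept them.
    stacked = stacked[stacked == 1]

    # Assumption: the first index level holds datetime-like values.
    timestamps = pd.DatetimeIndex(stacked.index.get_level_values(0))
    stocks = stacked.index.get_level_values(1)

    return pd.DataFrame({
        "Year": timestamps.year,
        "Month": timestamps.month,
        "Day": timestamps.day,
        "Stock": stocks,
        "OrderType": order_type,
        "Amount": amount,
    })


# Sample data modeled on the events snippet in the question.
dates = pd.date_range("2013-10-01 16:00", periods=8, freq="D")
events = pd.DataFrame(np.nan, index=dates,
                      columns=["SPY", "GOOG", "AAPL", "XOM"])
events.loc[dates[0], ["GOOG", "XOM"]] = 1
events.loc[dates[3], "SPY"] = 1
events.loc[dates[5], ["SPY", "AAPL"]] = 1
events.loc[dates[7], "GOOG"] = 1

orders = create_orders(events)
print(orders)
```

Because the timestamp and ticker travel along as index levels after stacking, there is no need for a per-cell callback like the one passed to applymap, which only ever sees the scalar value.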