Add a column to all items in a Pandas Panel at once?

Time: 2017-03-18 16:54:01

Tags: pandas dataframe panel

I have the following Panel named stocks:

Dimensions: 2 (items) x 1681 (major_axis) x 5 (minor_axis)
Items axis: AAPL to OPK
Major_axis axis: 2010-01-04 00:00:00 to 2016-09-07 00:00:00
Minor_axis axis: Open to Volume

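The Items axis holds the stock names, and the Minor_axis holds the attribute columns for each stock, e.g. 'Open', 'Close', 'Volume', etc.

I am trying to add a new column (attribute) named 'Log_Return' to all items (stocks) at once.

I have tried the following and variations of it, but none of them seem to change my panel: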

stocks[:]['Log_Return'] = np.log(stocks.loc[:, :, 'Close'] / stocks.loc[:, :, 'Close'].shift(1))
# This created an additional item instead of a column in minor_axis

stocks['AAPL':'OPK']['Log_Return'] = np.log(stocks.loc[:, :, 'Close'] / stocks.loc[:, :, 'Close'].shift(1))
# This didn't do anything; no errors, but no changes were made to my panel either

# AAPL and OPK are the only stocks in the items axis, so 'AAPL':'OPK' is
# equivalent to the ':' on the right-hand side of the equation.

I also tried iterating:

for i in stocks:
    stocks.i['Log_Return']= np.log( stocks.loc[i,:, 'Close'] /stocks.loc[i,: ,'Close'])

and got this error: 'Panel' object has no attribute 'i'

stocks.AAPL
              Open    High     Low   Close       Volume  
Date                                                          
2010-01-04   30.49   30.64   30.34   30.57  123432050.0      
2010-01-05   30.66   30.80   30.46   30.63  150476004.0      
2010-01-06   30.63   30.75   30.11   30.14  138039594.0      
2010-01-07   30.25   30.29   29.86   30.08  119282324.0      
2010-01-08   30.04   30.29   29.87   30.28  111969081.0      
2010-01-11   30.40   30.43   29.78   30.02  115557365.0      


stocks.OPK
             Open   High    Low  Close     Volume
Date                                             
2010-01-04   1.80   1.97   1.76   1.95   234455.0
2010-01-05   1.64   1.95   1.64   1.93   135712.0
2010-01-06   1.90   1.92   1.77   1.79   546586.0
2010-01-07   1.79   1.94   1.76   1.92   138622.0
2010-01-08   1.92   1.94   1.86   1.89    62425.0
2010-01-11   1.90   1.95   1.89   1.91   130195.0

I feel like I'm making a simple mistake, but I can't think straight today.

1 Answer:

Answer 0: (score: 2)

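One solution is to iterate over the items of the Panel object. In your case, iterate over stocks.items and index the panel with stocks[i] rather than stocks.i. Here is a general example.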

# Note: pd.Panel only exists in older pandas releases (it was removed in 0.25).
import pandas as pd
import numpy as np

p = pd.Panel(np.random.randn(2, 5, 2),
             items=['Item1', 'Item2'],
             major_axis=pd.date_range('1/1/2000', periods=5),
             minor_axis=['A', 'B'])

# Add column 'C' to the DataFrame behind each item
for i in p.items:
    p[i]['C'] = p[i]['A'] + p[i]['B']
print(p['Item1'])
print(p['Item2'])

# Reindex so the new column also shows up on the panel's minor axis
p = p.reindex_axis(['A', 'B', 'C'], 'minor_axis')
print(p.minor_axis)

This results in:

                   A         B         C
2000-01-01 -0.442373  0.842567  0.400194
2000-01-02  0.668583  1.809871  2.478454
2000-01-03  0.979304  1.022991  2.002295
2000-01-04  0.910955  0.282959  1.193914
2000-01-05  1.265542 -1.626789 -0.361247
                   A         B         C
2000-01-01 -0.635350 -0.138817 -0.774166
2000-01-02 -0.573246  0.731871  0.158625
2000-01-03 -0.027341  1.033315  1.005974
2000-01-04 -1.152284  0.210650 -0.941634
2000-01-05 -0.504819  0.682751  0.177933

Index([u'A', u'B', u'C'], dtype='object')
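
For readers on current pandas: Panel was deprecated in 0.20 and removed in 0.25, so the code above only runs on older versions. A rough sketch of the same idea, assuming the data is held as a single DataFrame with two-level (ticker, field) columns instead of a Panel, might look like the following. The sample prices are copied from the first rows in the question; the layout and the names frames, close and log_ret are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: one DataFrame per ticker,
# using the first few 'Close' values shown in the question.
dates = pd.date_range("2010-01-04", periods=4, freq="B")
frames = {
    "AAPL": pd.DataFrame({"Close": [30.57, 30.63, 30.14, 30.08]}, index=dates),
    "OPK": pd.DataFrame({"Close": [1.95, 1.93, 1.79, 1.92]}, index=dates),
}

# Concatenating a dict along axis=1 gives MultiIndex columns: (ticker, field).
stocks = pd.concat(frames, axis=1)

# Pull 'Close' for every ticker at once: one column per ticker.
close = stocks.xs("Close", axis=1, level=1)

# Log return for all tickers in a single vectorized step.
log_ret = np.log(close / close.shift(1))

# Relabel the result as (ticker, 'Log_Return') and attach it to the frame.
log_ret.columns = pd.MultiIndex.from_product([log_ret.columns, ["Log_Return"]])
stocks = pd.concat([stocks, log_ret], axis=1).sort_index(axis=1)

print(stocks["AAPL"])
print(stocks["OPK"])
```

With this layout, stocks["AAPL"] plays the role that stocks.AAPL played with the Panel, and the new column appears under every ticker without an explicit loop. The first row of Log_Return is NaN because shift(1) has no previous close to compare against.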