Question

I'm currently having trouble adding up rows in the following DataFrame, which I built from the stock returns of six companies:

import pandas as pd

def importdata(data):
    returns = pd.read_excel(data)  # Imports the data from Excel
    returns_with_dates = returns.set_index('Dates')  # Sets the Dates as the df index
    return returns_with_dates

Output:

Out[345]: 
        Company 1  Company 2   Company 3  Company 4  Company 5  Company 6
Dates                                                                           
1997-01-02  31.087620   3.094705   24.058686  31.694404  37.162890  13.462241   
1997-01-03  31.896592   3.109631   22.423629  32.064378  37.537013  13.511706   
1997-01-06  31.723241   3.184358   18.803148  32.681000  37.038183  13.684925   
1997-01-07  31.781024   3.199380   19.503886  33.544272  37.038183  13.660193   
1997-01-08  31.607673   3.169431   19.387096  32.927650  37.537013  13.585995   
1997-01-09  31.492106   3.199380   19.737465  33.420948  37.038183  13.759214   
1997-01-10  32.589996   3.184358   19.270307  34.284219  37.661721  13.858235   
1997-01-13  32.416645   3.199380   19.153517  35.147491  38.035844  13.660193   
1997-01-14  32.301077   3.184358   19.503886  35.517465  39.407629  13.783946   
1997-01-15  32.127726   3.199380   19.387096  35.887438  38.409967  13.759214   
1997-01-16  32.532212   3.229232   19.737465  36.257412  39.282921  13.635460   
1997-01-17  33.167833   3.259180   20.087835  37.490657  39.033505  13.858235   
1997-01-20  33.456751   3.229232   20.438204  35.640789  39.657044  14.377892   
1997-01-21  33.225616   3.244158   20.671783  36.010763  40.779413  14.179940   
1997-01-22  33.110049   3.289033   21.489312  36.010763  40.654705  14.254138   
1997-01-23  32.705563   3.199380   20.905363  35.394140  40.904121  14.229405   
1997-01-24  32.127726   3.139579   20.204624  35.764114  40.405290  13.957165   
1997-01-27  32.127726   3.094705   20.204624  35.270816  40.779413  13.882968   
1997-01-28  31.781024   3.079778   20.788573  34.407544  41.153536  13.684925   
1997-01-29  32.185510   3.094705   21.138942  34.654193  41.278244  13.858235   
1997-01-30  32.647779   3.094705   21.022153  34.407544  41.652367  13.981898   
1997-01-31  32.532212   3.064757   20.204624  34.037570  42.275905  13.858235

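For countless hours I have been trying to sum them so that the rows 1997-01-02 to 1997-01-08, 1997-01-09 to 1997-01-15, and so on are added up, i.e. the first five rows are summed, then the following five. In addition, I want the date of the fifth element to be the index, so when summing the rows from 1997-01-02 to 1997-01-08, 1997-01-08 should be the index of the summed row. It is worth mentioning that I have been using five rows as an example, but ideally I want to sum every n rows, then the following n rows, while keeping the dates in the way described above. I have found a way to do it in array form, shown in the code below, but then I cannot keep the dates.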

import numpy as np

def nday_sums(data, n):  # function name assumed; the original signature was not shown
    returns = pd.read_excel(data)  # Imports the data from Excel
    returns_with_dates = returns.set_index('Dates')  # Sets the Dates as the df index

    returns_mat = returns_with_dates.to_numpy()  # was as_matrix(), removed in pandas 1.0
    ndays = len(returns_mat) // n  # Number of complete n-day blocks in our time period
    asset_number = returns_mat.shape[1]  # Number of companies (columns)

    nday_returns = np.empty((ndays, asset_number))  # Creates an empty array to fill
    # and accommodate the n-day log-returns

    for i in range(1, asset_number + 1):
        for j in range(1, ndays + 1):
            nday_returns[j-1, i-1] = np.sum(returns_mat[(n*j)-n:n*j, i-1])

    return nday_returns

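Is there a way to do this in a DataFrame context while keeping the dates as described above? I have been trying to do this without any success, and it is really getting on my nerves! For some reason everyone finds pandas very useful and easy to use, but I happen to find it quite the opposite. Any kind of help would be much appreciated. Thanks in advance.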

Answer 1

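Use groupby

To include the date index as requested, the grouping below needs one more step: as written, it sums each block of five rows but labels the blocks 0, 1, 2, ... instead of using the dates.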

df.groupby(np.arange(len(df)) // 5).sum()
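One way to carry the dates along with the groupby above is to take the last date of each block and use it as the new index. The following is only a sketch under assumptions: the helper name sum_every_n_rows and the small demo frame are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

def sum_every_n_rows(df, n):
    """Sum consecutive blocks of n rows, labelling each block by its last date."""
    blocks = np.arange(len(df)) // n      # block label per row: 0,0,...,1,1,...
    summed = df.groupby(blocks).sum()     # block sums, indexed 0, 1, 2, ...
    # Last date that falls in each block, in the same block order
    last_dates = df.index.to_series().groupby(blocks).last()
    summed.index = pd.DatetimeIndex(last_dates.values, name=df.index.name)
    return summed

# Made-up demo data (not the values from the question)
dates = pd.to_datetime(["1997-01-02", "1997-01-03", "1997-01-06",
                        "1997-01-07", "1997-01-08", "1997-01-09"])
demo = pd.DataFrame({"Company 1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
                    index=pd.Index(dates, name="Dates"))

result = sum_every_n_rows(demo, 5)
print(result)
```

Note that a trailing block with fewer than n rows is still summed and is labelled with its own last date; drop it afterwards if only complete blocks are wanted.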

Answer 2

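If you have the same number of missing dates in each period, you can resample by the number of days you want. Using resample keeps the dates in the index. You can also use the loffset parameter to shift the dates.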

df.resample('7D', loffset='6D').sum()

                 Company 1  Company 2   Company 3   Company 4   Company 5  \
Dates                                                                   
1997-01-08  158.096150  15.757505  104.176445  162.911704  186.313282   
1997-01-15  160.927550  15.966856   97.052271  174.257561  190.553344   
1997-01-22  165.492461  16.250835  102.424599  181.410384  199.407588   
1997-01-29  160.927549  15.608147  103.242126  175.490807  204.520604   
1997-02-05   65.179991   6.159462   41.226777   68.445114   83.928272   

            Company 6  
Dates                  
1997-01-08  67.905060  
1997-01-15  68.820802  
1997-01-22  70.305665  
1997-01-29  69.612698  
1997-02-05  27.840133
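The loffset argument was deprecated in pandas 1.1 and is not accepted by pandas 2.x, so on a recent installation the same relabelling can be done by shifting the index after resampling. This is a sketch on made-up daily data, not the data from the question; the names demo and weekly are illustrative.

```python
import pandas as pd

# Illustrative data: one row per calendar day, values 1..14
idx = pd.date_range("1997-01-02", periods=14, freq="D", name="Dates")
demo = pd.DataFrame({"Company 1": range(1, 15)}, index=idx)

# 7-day bins starting at the first date; each bin is labelled by its first day
weekly = demo.resample("7D").sum()

# Shift the labels by 6 days so each bin is labelled by its last day
weekly.index = weekly.index + pd.Timedelta(days=6)
print(weekly)
```

Keep in mind that resample bins by calendar time, so each bin contains however many rows happen to fall inside its 7-day window. That matches "every five rows" only when each window holds exactly five trading days.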

Difficulty adding up elements in a pandas DataFrame
