Pandas complex calculation based on other columns

Time: 2017-06-02 13:45:45

Tags: python pandas

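I have successfully created new columns from arithmetic on other columns, but now I need to first select elements by matching on multiple columns and then do the math, and I'm having trouble with the syntax.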

ind_df_qtrly[10:16]
        ticker  comp_name  per_fisc_year  per_type  per_fisc_qtr  per_end_date_x  sales    tot_revnu  zacks_x_sector_desc      zacks_m_ind_desc  per_end_date_y  mkt_val
562170  AVX     AVX CORP   2007           Q         1             2006-06-30      366.408  366.408    Computer and Technology  ELECTRONICS       2008-12-31      1353.95
562215  AVX     AVX CORP   2007           Q         2             2006-09-30      374.648  374.648    Computer and Technology  ELECTRONICS       2008-12-31      1353.95
562260  AVX     AVX CORP   2007           Q         3             2006-12-31      378.088  378.088    Computer and Technology  ELECTRONICS       2008-12-31      1353.95
562305  AVX     AVX CORP   2007           Q         4             2007-03-31      379.351  379.351    Computer and Technology  ELECTRONICS       2008-12-31      1353.95
562350  AVX     AVX CORP   2008           Q         1             2007-06-30      383.158  383.158    Computer and Technology  ELECTRONICS       2008-12-31      1353.95
562395  AVX     AVX CORP   2008           Q         2             2007-09-30      400.706  400.706    Computer and Technology  ELECTRONICS       2008-12-31      1353.95

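I am trying to calculate a trailing 12-month value for the 'tot_revnu' column, starting with Q2 2008. That means for each 'ticker' I need to create a new column in the DF that sums the results of 4 quarters (2008-Q2, 2008-Q1, 2007-Q4, & 2007-Q3). I did the same kind of thing with annual data to calculate growth, but this one is proving very challenging.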
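To make the target concrete, here is a minimal sketch of the "match on several columns, then do the math" step for a single window. It rebuilds only the needed columns of the six AVX rows shown above; the names `sample`, `wanted`, `matched`, and `ttm_2008_q2` are made up for illustration, and the inner merge is just one way to match on both `per_fisc_year` and `per_fisc_qtr` at once.

```python
import pandas as pd

# Six AVX rows copied from the sample output above (only the columns needed here).
sample = pd.DataFrame({
    "ticker": ["AVX"] * 6,
    "per_fisc_year": [2007, 2007, 2007, 2007, 2008, 2008],
    "per_fisc_qtr": [1, 2, 3, 4, 1, 2],
    "tot_revnu": [366.408, 374.648, 378.088, 379.351, 383.158, 400.706],
})

# The four (fiscal year, quarter) pairs in the trailing 12 months ending 2008-Q2.
wanted = pd.DataFrame({
    "per_fisc_year": [2007, 2007, 2008, 2008],
    "per_fisc_qtr": [3, 4, 1, 2],
})

# An inner merge keeps only the rows that match on BOTH columns.
matched = sample.merge(wanted, on=["per_fisc_year", "per_fisc_qtr"], how="inner")

# Sum the matched quarters per ticker (this sample has a single ticker).
ttm_2008_q2 = matched.groupby("ticker")["tot_revnu"].sum()
print(ttm_2008_q2)
```

For AVX this adds 378.088 + 379.351 + 383.158 + 400.706, so the expected value is approximately 1541.303.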

I was able to do it with annual data like this:

        # Annual: create the pivot table (one row per company, one column per fiscal year)
        ind_table = pd.pivot_table(
            ind_df_annual,
            values='tot_revnu',
            index=['ticker', 'comp_name', 'zacks_x_sector_desc', 'zacks_m_ind_desc', 'mkt_val'],
            columns=['per_fisc_year'],
            aggfunc='sum',
        )

        # Annual: calculate Simple and FD growth and add them as columns to the pivot table
        ind_table[str(prev_year) + '-Ann Simple'] = (ind_table[prev_year] - ind_table[prev2_year]) / ind_table[prev2_year]
        ind_table[str(current_year) + '-Ann Simple'] = (ind_table[current_year] - ind_table[prev_year]) / ind_table[prev_year]
        ind_table[str(current_year) + '-Ann FD'] = (ind_table[str(current_year) + '-Ann Simple'] - ind_table[str(prev_year) + '-Ann Simple']) / ind_table[str(prev_year) + '-Ann Simple']

1 answer:

Answer 0 (score: 0)

You can use a rolling window. Something like this:

ind_df_qtrly['cumulative_tot_revnu'] = ind_df_qtrly['tot_revnu'].rolling(4).sum()

Documentation: pandas.DataFrame.rolling
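One caveat, assuming ind_df_qtrly stacks several tickers in one frame as the question suggests: a plain rolling(4) over the whole column lets the first three rows of each ticker pick up quarters from the ticker above it. A minimal sketch of a per-ticker variant follows. The "XYZ" ticker and its values are invented purely to show the group boundary, `ttm_tot_revnu` is a hypothetical column name, and the approach assumes every ticker has one row per quarter with no gaps.

```python
import pandas as pd

df = pd.DataFrame({
    # AVX values come from the question; XYZ is a made-up second ticker.
    "ticker": ["AVX"] * 6 + ["XYZ"] * 4,
    "per_fisc_year": [2007, 2007, 2007, 2007, 2008, 2008, 2007, 2007, 2007, 2007],
    "per_fisc_qtr": [1, 2, 3, 4, 1, 2, 1, 2, 3, 4],
    "tot_revnu": [366.408, 374.648, 378.088, 379.351, 383.158, 400.706,
                  10.0, 20.0, 30.0, 40.0],
})

# The window is positional, so rows must be in chronological order within each ticker.
df = df.sort_values(["ticker", "per_fisc_year", "per_fisc_qtr"])

# Apply the 4-quarter rolling sum inside each ticker; transform keeps the original index,
# so the result lines up with df row for row.
df["ttm_tot_revnu"] = (
    df.groupby("ticker")["tot_revnu"]
      .transform(lambda s: s.rolling(4).sum())
)

print(df[["ticker", "per_fisc_year", "per_fisc_qtr", "ttm_tot_revnu"]])
```

The first three quarters of each ticker come out as NaN because a full four-quarter window is not available yet. If quarters can be missing, a positional window of 4 no longer spans exactly 12 months, and a date-based window on the period end date would be worth considering instead.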