熊猫日期偏移量-分组依据,并显示下周和下个月的值

时间:2019-02-11 18:09:56

标签: python pandas pandas-groupby

我正在尝试将两两列添加到Pandas DataFrame中-一列代表接下来的几周值,另一列代表接下来的4周之和。

可以在下面看到现有DataFrame的示例。下面的DataFrame只是整个DataFrame的简短摘要,跨越了很多年。下面的DataFrame是使用以下函数得出的:<?php namespace App\Jobs; use App\Mail\testMail; use App\User; use Illuminate\Bus\Queueable; use Illuminate\Queue\SerializesModels; use Illuminate\Queue\InteractsWithQueue; use Illuminate\Contracts\Queue\ShouldQueue; use Illuminate\Foundation\Bus\Dispatchable; use Illuminate\Support\Facades\Mail; class SendRegistrationMailJob implements ShouldQueue { use Dispatchable, InteractsWithQueue, Queueable, SerializesModels; /** * @var User */ private $user; /** * Create a new job instance. * * @param User $user */ public function __construct(User $user) { $this->user = $user; } /** * Execute the job. * * @return void */ public function handle() { Mail::to($this->user->getAttribute('email'))->send(new testMail()); } }

df = df.groupby([pd.Grouper(key='date', freq='W'), pd.Grouper('company_name').agg({'returns': 'sum'})

所需的输出将返回下一周的值,以及从“日期”列开始的下4周的总和。所需输出的示例如下所示。

date         company_name    returns
2014-12-07	Amazon        -0.5
2014-12-14	Amazon        -0.1
2014-12-21	Amazon        0.5
2014-12-28	Amazon        0.3
2015-01-04	Amazon        0.1
2014-12-07	Facebook      0.5
2014-12-14	Facebook      0.5
2014-12-21	Facebook      0.5
2014-12-28	Facebook      -0.5
2015-01-04	Facebook      -0.5
2014-12-07	Google        0.1
2014-12-14	Google        0.1
2014-12-21	Google        0.1
2014-12-28	Google        0.1
2015-01-04	Google        0.1
2014-12-07	Intel         0.2
2014-12-14	Intel         0.2
2014-12-21	Intel         0.2
2014-12-28	Intel         0.2
2015-01-04	Intel         0.2

下面可以看到原始CSV的代码段。

date         company_name    returns  next_week_return  next_month_return
2014-12-07	Amazon        -0.5        -0.5              0.8
2014-12-14	Amazon        -0.1        0.5               0.8
2014-12-21	Amazon        0.5         0.3               0.8
2014-12-28	Amazon        0.3         0.1               0.8
2015-01-04	Amazon        0.1         0.1               ...           
2014-12-07	Facebook      0.5         0.5               0.0               
2014-12-14	Facebook      0.5         0.5               0.0
2014-12-21	Facebook      0.5         -0.5              0.0
2014-12-28	Facebook      -0.5        -0.5              0.0
2015-01-04	Facebook      -0.5        0.1               ...
2014-12-07	Google        0.1         0.1               0.4
2014-12-14	Google        0.1         0.1               0.4
2014-12-21	Google        0.1         0.1               0.4
2014-12-28	Google        0.1         0.1               0.4
2015-01-04	Google        0.1         0.1               ...
2014-12-07	Intel         0.2         0.2               0.8
2014-12-14	Intel         0.2         0.2               0.8
2014-12-21	Intel         0.2         0.2               0.8
2014-12-28	Intel         0.2         0.2               0.8
2015-01-04	Intel         0.2         0.2

从上面我希望在每一行中增加两列-date CompanyName return 07/12/2014 8x8 Inc -0.0038835 14/12/2014 8x8 Inc 0.036923354 21/12/2014 8x8 Inc 0.108854405 28/12/2014 8x8 Inc 0.042793145 04/01/2015 8x8 Inc -0.027219971 11/01/2015 8x8 Inc -0.038249882 18/01/2015 8x8 Inc 0.045946457 25/01/2015 8x8 Inc -0.107796707 01/02/2015 8x8 Inc -0.056725981 08/02/2015 8x8 Inc 0.024344572 15/02/2015 8x8 Inc 0.00756624 22/02/2015 8x8 Inc -0.04365263 01/03/2015 8x8 Inc -0.02794593 08/03/2015 8x8 Inc -0.039922714 15/03/2015 8x8 Inc 0.020848566 22/03/2015 8x8 Inc 0.116712617 29/03/2015 8x8 Inc 0.028952565 05/04/2015 8x8 Inc 0.053253322 12/04/2015 8x8 Inc -0.006787356 19/04/2015 8x8 Inc -0.00912207 26/04/2015 8x8 Inc 0.013652089 03/05/2015 8x8 Inc -0.021702736 10/05/2015 8x8 Inc -0.021004273 17/05/2015 8x8 Inc 0.012888286 24/05/2015 8x8 Inc -0.021177262 31/05/2015 8x8 Inc -0.027630051 07/12/2014 AB SA -1.015859196 14/12/2014 AB SA -0.01810143 21/12/2014 AB SA -0.073869849 28/12/2014 AB SA 0.000666445 04/01/2015 AB SA 0.051293294 11/01/2015 AB SA 0.004735605 18/01/2015 AB SA 0.014073727 25/01/2015 AB SA 0.097002705 01/02/2015 AB SA 0.00337648 08/02/2015 AB SA 0.018093743 15/02/2015 AB SA 0.019667392 22/02/2015 AB SA 0.024844339 01/03/2015 AB SA 0.015707129 08/03/2015 AB SA 0.109611209 15/03/2015 AB SA -0.039164849 22/03/2015 AB SA -0.002909093 29/03/2015 AB SA 0.007256926 05/04/2015 AB SA -0.025385791 12/04/2015 AB SA 0.019584469 19/04/2015 AB SA -0.01342302 26/04/2015 AB SA 0.073405725 03/05/2015 AB SA -0.018666287 10/05/2015 AB SA 0.019350984 17/05/2015 AB SA -0.030814439 24/05/2015 AB SA 0.027386256 31/05/2015 AB SA -0.033285978 07/12/2014 ACCO Brands Corp 0.432332004 14/12/2014 ACCO Brands Corp -0.064822249 21/12/2014 ACCO Brands Corp 0.010163837 28/12/2014 ACCO Brands Corp 0.022223137 04/01/2015 ACCO Brands Corp -0.034659702 11/01/2015 ACCO Brands Corp -0.026514522 18/01/2015 ACCO Brands Corp -0.018868484 25/01/2015 ACCO Brands Corp 0.013010237 01/02/2015 ACCO Brands Corp -0.071850737 08/02/2015 ACCO Brands Corp 0.00126183 15/02/2015 ACCO Brands Corp -0.016000601 22/02/2015 ACCO Brands Corp -0.01420295 01/03/2015 ACCO Brands Corp -0.010457612 08/03/2015 ACCO Brands Corp -0.006591982 15/03/2015 ACCO Brands Corp -0.008257798 22/03/2015 ACCO Brands Corp 0.039272062 29/03/2015 ACCO Brands Corp 0.035312622 05/04/2015 ACCO Brands Corp 0.012315427 12/04/2015 ACCO Brands Corp 0.037241541 19/04/2015 ACCO Brands Corp -0.025075941 26/04/2015 ACCO Brands Corp -0.010535083 03/05/2015 ACCO Brands Corp -0.044016885 10/05/2015 ACCO Brands Corp -0.013845407 17/05/2015 ACCO Brands Corp 0.005056901 24/05/2015 ACCO Brands Corp -0.024251348 31/05/2015 ACCO Brands Corp -0.051701374 07/12/2014 Acer Inc 3.829777429 07/12/2014 Acer Inc -3.46435286 14/12/2014 Acer Inc 0.042160811 14/12/2014 Acer Inc 0.021342273 21/12/2014 Acer Inc -0.056618894 21/12/2014 Acer Inc -0.046304568 28/12/2014 Acer Inc 0.033415997 28/12/2014 Acer Inc 0.062759689 04/01/2015 Acer Inc 0.002344667 04/01/2015 Acer Inc -0.004460974 11/01/2015 Acer Inc 0.082988363 11/01/2015 Acer Inc 0.093933758 18/01/2015 Acer Inc -0.033983853 18/01/2015 Acer Inc -0.042689409 25/01/2015 Acer Inc 0.017136282 25/01/2015 Acer Inc -0.012539349 01/02/2015 Acer Inc 0.002424244 01/02/2015 Acer Inc 0.010980502 08/02/2015 Acer Inc -0.014634408 08/02/2015 Acer Inc -0.015723594 15/02/2015 Acer Inc -0.014851758 15/02/2015 Acer Inc 0.025040432 22/02/2015 Acer Inc 0 22/02/2015 Acer Inc 0.022919261 01/03/2015 Acer Inc 0.024631787 01/03/2015 Acer Inc -0.007581537 08/03/2015 Acer Inc 0.05445132 08/03/2015 Acer Inc 0.027028672 15/03/2015 Acer Inc -0.023311079 15/03/2015 Acer Inc -0.022472856 22/03/2015 Acer Inc -0.002361276 22/03/2015 Acer Inc 0 29/03/2015 Acer Inc -0.021506205 29/03/2015 Acer Inc 0.012048339 05/04/2015 Acer Inc -0.021978907 05/04/2015 Acer Inc -0.028109292 12/04/2015 Acer Inc -0.004950505 12/04/2015 Acer Inc 0.02756683 19/04/2015 Acer Inc -0.007472015 19/04/2015 Acer Inc 0.003016594 26/04/2015 Acer Inc 0.009950331 26/04/2015 Acer Inc 0.006006024 03/05/2015 Acer Inc -0.004962789 03/05/2015 Acer Inc 0.002989539 10/05/2015 Acer Inc -0.040614719 10/05/2015 Acer Inc -0.087282784 17/05/2015 Acer Inc -0.064193158 17/05/2015 Acer Inc -0.072605718 24/05/2015 Acer Inc 0.008253142 24/05/2015 Acer Inc -0.032031208 31/05/2015 Acer Inc 0.005464494 31/05/2015 Acer Inc 0.057961788,其中显示了特定公司接下来几周的回报;另一个:next_week_return,它是接下来四个星期的收益之和。

任何人都能提供的帮助将不胜感激。

1 个答案:

答案 0 :(得分:1)

从输入CSV的示例代码段开始,一种解决方案是编写一个自定义函数以与df.apply()一起使用,该函数接受每个公司的子DataFrame,并为子DataFrame中的每个日期计算在指定的提前天数内,return的总和。

以下代码假定df保留了原始CSV中的示例数据。

# Convert string dates to pandas.Timestamp 
df['date'] = pd.to_datetime(df['date'])

# Within each CompanyName, sort by date, because we'll
# set the date column as a DatetimeIndex and will
# index-slice it with pandas date offsets, and this
# requires a sorted index.
df.sort_values(['CompanyName', 'date'], inplace=True)

# Set a MultiIndex to ensure that the calculated
# columns returned by the custom function align correctly
df.set_index(['CompanyName', 'date'], inplace=True)    

# Define a custom function to sum the values of `return` for 
# each CompanyName sub-DataFrame. The defaults of 1 and 1+28
# capture the month (defined to be 28 days) immediately following
# each date, excluding the date itself. To get just the 
# next week's values, use start=1, end=7.
def sum_return_over_next_i_to_j_days(df, first=1, last=1+28):
    day = pd.offsets.Day(1)
    df.reset_index(level=0, drop=True, inplace=True)
    rets = [df.loc[today + first*day : today + last*day, 'return'].sum(min_count=1) 
            for today in df.index]
    return pd.DataFrame(rets, 
                        index=df.index, 
                        columns=[f'sum_return_next_{first}-{last}_days'])

# Apply the above function to input CSV
df['next_week_return'] = df.groupby('CompanyName').apply(sum_return_over_next_i_to_j_days, 1, 7)
df['next_month_return'] = df.groupby('CompanyName').apply(sum_return_over_next_i_to_j_days, 1, 1+28)

df = df.reset_index()

# Print result
df.head(10)
  CompanyName       date    return  next_week_return  next_month_return
0     8x8 Inc 2014-07-12 -0.003883               NaN                NaN
1     8x8 Inc 2014-12-14  0.036923          0.108854           0.066976
2     8x8 Inc 2014-12-21  0.108854          0.042793           0.004068
3     8x8 Inc 2014-12-28  0.042793         -0.084672          -0.146522
4     8x8 Inc 2015-01-02 -0.056726         -0.027946          -0.089796
5     8x8 Inc 2015-01-03 -0.027946               NaN          -0.061850
6     8x8 Inc 2015-01-18  0.045946         -0.107797          -0.100230
7     8x8 Inc 2015-01-25 -0.107797               NaN          -0.036086
8     8x8 Inc 2015-02-15  0.007566         -0.043653          -0.044507
9     8x8 Inc 2015-02-22 -0.043653               NaN           0.115858

df.tail(10)
    CompanyName       date    return  next_week_return  next_month_return
120    Acer Inc 2015-08-02 -0.014634           0.08148           0.081480
121    Acer Inc 2015-08-02 -0.015724           0.08148           0.081480
122    Acer Inc 2015-08-03  0.054451               NaN                NaN
123    Acer Inc 2015-08-03  0.027029               NaN                NaN
124    Acer Inc 2015-10-05 -0.040615               NaN           0.176922
125    Acer Inc 2015-10-05 -0.087283               NaN           0.176922
126    Acer Inc 2015-11-01  0.082988               NaN                NaN
127    Acer Inc 2015-11-01  0.093934               NaN                NaN
128    Acer Inc 2015-12-04 -0.004951               NaN                NaN
129    Acer Inc 2015-12-04  0.027567               NaN                NaN