如何计算每天大熊猫的报价差价?

时间:2017-10-27 20:53:31

标签: python pandas data-analysis

我有这样的债券市场数据:

Id   row      Date       BuyPrice    SellPrice    Time
1    1      2017-10-30    94520       0          9:00:00
1    2      2017-10-30    94538       0          9:00:00
1    3      2017-10-30    94609       0          9:00:00
1    4      2017-10-30    94615       0          9:00:00
1    5      2017-10-30    94617       0          9:00:00
1    1      2017-09-20    99100       99159      9:00:10
1    2      2017-09-20    99102       99058      9:00:11
1    3      2017-09-20    99103       99057      9:00:12
1    4      2017-09-20    99104       99056      9:00:10
1    5      2017-09-20    99105       99055      9:00:10
1    1      2017-09-20    98100       99190      9:01:10
1    2      2017-09-20    98099       99091      9:01:10
1    3      2017-09-20    98098       99092      9:01:10
1    4      2017-09-20    98097       99093      9:01:10
1    5      2017-09-20    98096       99094      9:01:10
2    1      2010-11-01    99890       100000     10:00:02
2    2      2010-11-01    99899       100000     10:00:02
2    3      2010-11-01    99901       99899      9:00:02
2    4      2010-11-01    99920       99850      10:00:02
2    5      2010-11-01    99933       99848      10:00:23

第1步:

我想为每天的每个id计算第一行的点差(= SellPrice - BuyPrice),如果BuyPrice或SellPrice中存在零,则排除零(此类数据报告为nan),数据在此步骤应该是这样的:

id     row      Date         BuyPrice      SellPrice     Spread
1      1        2017-10-30   94520         0             NaN
1      1        2017-09-20   99100         99159         59
1      1        2017-09-20   98100         99190         190
2      1        2010-11-01   99890         100000        110

第2步:

现在我想计算每个id每天的Spread平均值,并给出日期的索引值

最后数据应该是这样的:

Id    Date        avg.spread(average of spread for each day)   index
1     2017-10-30   NaN                                           1
1     2017-09-20   124.5(=(59+190)/2)                            2
2     2010-11-01   110                                           1

1 个答案:

答案 0 :(得分:2)

我尽力了解你想要的东西,虽然你没有明确提到它,但我想你想groupby Idrow date g = df.assign(diff=df.SellPrice.sub(df.BuyPrice))\ .groupby(['Id', 'row', 'Date']).diff.mean() v = g.groupby(level=[0, 1]).cumcount().add(1).values df = g.reset_index().assign(index=v) df Id row Date diff index 0 1 1 2017-09-20 574.5 1 1 1 1 2017-10-30 NaN 2 2 1 2 2017-09-20 474.0 1 3 1 2 2017-10-30 NaN 2 4 1 3 2017-09-20 474.0 1 5 1 3 2017-10-30 NaN 2 6 1 4 2017-09-20 474.0 1 7 1 4 2017-10-30 NaN 2 8 1 5 2017-09-20 474.0 1 9 1 5 2017-10-30 NaN 2 10 2 1 2010-11-01 110.0 1 11 2 2 2010-11-01 101.0 1 12 2 3 2010-11-01 -2.0 1 13 2 4 2010-11-01 -70.0 1 14 2 5 2010-11-01 -85.0 1

{{1}}