Question

I have bond market data like this:

Id   row      Date       BuyPrice    SellPrice    Time
1    1      2017-10-30    94520       0          9:00:00
1    2      2017-10-30    94538       0          9:00:00
1    3      2017-10-30    94609       0          9:00:00
1    4      2017-10-30    94615       0          9:00:00
1    5      2017-10-30    94617       0          9:00:00
1    1      2017-09-20    99100       99159      9:00:10
1    2      2017-09-20    99102       99058      9:00:11
1    3      2017-09-20    99103       99057      9:00:12
1    4      2017-09-20    99104       99056      9:00:10
1    5      2017-09-20    99105       99055      9:00:10
1    1      2017-09-20    98100       99190      9:01:10
1    2      2017-09-20    98099       99091      9:01:10
1    3      2017-09-20    98098       99092      9:01:10
1    4      2017-09-20    98097       99093      9:01:10
1    5      2017-09-20    98096       99094      9:01:10
2    1      2010-11-01    99890       100000     10:00:02
2    2      2010-11-01    99899       100000     10:00:02
2    3      2010-11-01    99901       99899      9:00:02
2    4      2010-11-01    99920       99850      10:00:02
2    5      2010-11-01    99933       99848      10:00:23

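Step 1:

For each Id on each day, I want to compute the spread (= SellPrice - BuyPrice) of the first row (row = 1). If BuyPrice or SellPrice is zero, the zero should be excluded and the spread for that record reported as NaN. After this step the data should look like this: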

id     row      Date         BuyPrice      SellPrice     Spread
1      1        2017-10-30   94520         0             NaN
1      1        2017-09-20   99100         99159         59
1      1        2017-09-20   98100         99190         190
2      1        2010-11-01   99890         100000        110
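
A minimal sketch of Step 1, assuming the data sits in a pandas DataFrame named `df` with the column names shown above (only the `row == 1` records are reproduced here, and the variable names `first` and `prices` are illustrative). Zeros are turned into NaN before subtracting, so a missing quote yields a NaN spread. Note that with the sample data exactly as posted, the second 2017-09-20 record gives 99190 - 98100 = 1090 rather than 190.

```python
import numpy as np
import pandas as pd

# Only the first-row (row == 1) records from the sample data are reproduced here.
df = pd.DataFrame({
    "Id": [1, 1, 1, 2],
    "row": [1, 1, 1, 1],
    "Date": ["2017-10-30", "2017-09-20", "2017-09-20", "2010-11-01"],
    "BuyPrice": [94520, 99100, 98100, 99890],
    "SellPrice": [0, 99159, 99190, 100000],
})

# Keep only the first row of each quote block.
first = df[df["row"] == 1].copy()

# Treat a zero price as missing, so the spread becomes NaN instead of a bogus number.
prices = first[["BuyPrice", "SellPrice"]].replace(0, np.nan)
first["Spread"] = prices["SellPrice"] - prices["BuyPrice"]

print(first)
```

Replacing zeros up front keeps the subtraction itself trivial; any later `mean()` then skips the NaN values by default.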

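Step 2:

Next, I want to compute the average Spread for each Id on each day, and add an index that numbers the dates within each Id.

The final data should look like this: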

Id    Date        avg.spread(average of spread for each day)   index
1     2017-10-30   NaN                                           1
1     2017-09-20   124.5(=(59+190)/2)                            2
2     2010-11-01   110                                           1
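
A possible sketch of Step 2, assuming a Step 1 result is already available in a DataFrame (called `first` here, a hypothetical name) with a `Spread` column in which zero quotes have become NaN. Grouping with `sort=False` keeps the dates in order of first appearance, which is what the desired `index` column seems to follow (2017-10-30 before 2017-09-20). The spread values below are computed from the sample data as posted.

```python
import numpy as np
import pandas as pd

# Hypothetical Step 1 result: one record per first-row quote, zeros already NaN.
first = pd.DataFrame({
    "Id": [1, 1, 1, 2],
    "Date": ["2017-10-30", "2017-09-20", "2017-09-20", "2010-11-01"],
    "Spread": [np.nan, 59.0, 1090.0, 110.0],
})

# Average spread per Id and Date; sort=False preserves order of first appearance.
daily = (
    first.groupby(["Id", "Date"], sort=False)["Spread"]
    .mean()
    .reset_index(name="avg_spread")
)

# Number the dates within each Id, starting at 1.
daily["index"] = daily.groupby("Id").cumcount() + 1

print(daily)
```

If the index should instead follow chronological order, dropping `sort=False` (and converting `Date` with `pd.to_datetime`) would number the earliest date first.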

Answer 1

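I did my best to understand what you are after. Although you did not say so explicitly, I assume you want to group by Id, row, and Date and take the mean of the difference:

    # Assumption: zeros mark missing quotes and are replaced with NaN first
    # (this is needed to reproduce the NaN rows in the output below).
    cols = ['BuyPrice', 'SellPrice']
    df[cols] = df[cols].replace(0, float('nan'))

    # Mean spread per (Id, row, Date); ['diff'] avoids a clash with the groupby .diff method.
    g = df.assign(diff=df.SellPrice.sub(df.BuyPrice))\
          .groupby(['Id', 'row', 'Date'])['diff'].mean()
    # Number the dates within each (Id, row) group, starting at 1.
    v = g.groupby(level=[0, 1]).cumcount().add(1).values
    df = g.reset_index().assign(index=v)
    df

        Id  row        Date   diff  index
    0    1    1  2017-09-20  574.5      1
    1    1    1  2017-10-30    NaN      2
    2    1    2  2017-09-20  474.0      1
    3    1    2  2017-10-30    NaN      2
    4    1    3  2017-09-20  474.0      1
    5    1    3  2017-10-30    NaN      2
    6    1    4  2017-09-20  474.0      1
    7    1    4  2017-10-30    NaN      2
    8    1    5  2017-09-20  474.0      1
    9    1    5  2017-10-30    NaN      2
    10   2    1  2010-11-01  110.0      1
    11   2    2  2010-11-01  101.0      1
    12   2    3  2010-11-01   -2.0      1
    13   2    4  2010-11-01  -70.0      1
    14   2    5  2010-11-01  -85.0      1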


How do I calculate the daily bid-ask spread in pandas?
