Fill a column based on the previous row, plus one

Time: 2018-02-14 22:21:16

Tags: python pandas pandas-groupby

I'm struggling with a pandas problem. I have the following data.

+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
| symbol | side | status  | origQty | executedQty |     qty     | availableQty |   price    | boughtValue | soldValue  | dcaLevel |
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
| DGDBTC | BUY  | FILLED  |   0.125 |  0.12500000 |  0.12500000 |   0.12500000 | 0.02000700 |  0.00250088 |            |          |
| DGDBTC | BUY  | FILLED  |   0.125 |  0.12500000 |  0.12500000 |   0.25000000 | 0.01960100 |  0.00245013 |            |          |
| DGDBTC | SELL | FILLED  |    0.25 |  0.25000000 | -0.25000000 |   0.00000000 | 0.02005900 |             | 0.00501475 |          |
| DGDBTC | BUY  | FILLED  |   0.113 |  0.11300000 |  0.11300000 |   0.11300000 | 0.02203000 |  0.00248939 |            |          |
| DGDBTC | BUY  | FILLED  |   0.113 |  0.11300000 |  0.11300000 |   0.22600000 | 0.02160300 |  0.00244114 |            |          |
| DGDBTC | BUY  | EXPIRED |   0.226 |  0.00000000 |  0.00000000 |              | 0.02125500 |             |            |          |
| DGDBTC | BUY  | PARTIAL |   0.226 |  0.15800000 |  0.15800000 |   0.38400000 | 0.02126100 |             |            |          |
| DGDBTC | SELL | EXPIRED |   0.384 |  0.00000000 |  0.00000000 |              | 0.02196600 |             |            |          |
| DGDBTC | SELL | EXPIRED |   0.384 |  0.00000000 |  0.00000000 |              | 0.02214300 |             |            |          |
| DGDBTC | SELL | EXPIRED |   0.384 |  0.00000000 |  0.00000000 |              | 0.02189900 |             |            |          |
| DGDBTC | BUY  | FILLED  |   0.384 |  0.38400000 |  0.38400000 |   0.76800000 | 0.02082900 |  0.00799834 |            |          |
| DGDBTC | BUY  | FILLED  |   0.768 |  0.76800000 |  0.76800000 |   1.53600000 | 0.01984300 |  0.01523942 |            |          |
| DGDBTC | SELL | EXPIRED |   1.536 |  0.00000000 |  0.00000000 |              | 0.02074400 |             |            |          |
| DGDBTC | SELL | EXPIRED |   1.536 |  0.00000000 |  0.00000000 |              | 0.02094100 |             |            |          |
| DGDBTC | SELL | EXPIRED |   1.536 |  0.00000000 |  0.00000000 |              | 0.02076800 |             |            |          |
| DGDBTC | SELL | PARTIAL |   1.536 |  0.30300000 | -0.30300000 |   1.23300000 | 0.02065000 |             |            |          |
| DGDBTC | SELL | FILLED  |   1.233 |  1.23300000 | -1.23300000 |   0.00000000 | 0.02070000 |             | 0.02552310 |          |
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+

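This is a subset of the data for a single symbol group. For each symbol, I want to fill the last column with values that follow these rules: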

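  1. The first buy order (side = BUY) in each series gets the value zero (0).
  2. For each consecutive buy order, the value increases by one (1).
  3. When a sell order (side = SELL) is reached, it marks the start of a new series of buy orders.
  4. Rows with status EXPIRED are skipped.
  5. Example: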

    +--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
    | symbol | side | status  | origQty | executedQty |     qty     | availableQty |   price    | boughtValue | soldValue  | dcaLevel |
    +--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
    | DGDBTC | BUY  | FILLED  |   0.125 |  0.12500000 |  0.12500000 |   0.12500000 | 0.02000700 |  0.00250088 |            |        0 |
    | DGDBTC | BUY  | FILLED  |   0.125 |  0.12500000 |  0.12500000 |   0.25000000 | 0.01960100 |  0.00245013 |            |        1 |
    | DGDBTC | SELL | FILLED  |    0.25 |  0.25000000 | -0.25000000 |   0.00000000 | 0.02005900 |             | 0.00501475 |          |
    | DGDBTC | BUY  | FILLED  |   0.113 |  0.11300000 |  0.11300000 |   0.11300000 | 0.02203000 |  0.00248939 |            |        0 |
    | DGDBTC | BUY  | FILLED  |   0.113 |  0.11300000 |  0.11300000 |   0.22600000 | 0.02160300 |  0.00244114 |            |        1 |
    | DGDBTC | BUY  | EXPIRED |   0.226 |  0.00000000 |  0.00000000 |              | 0.02125500 |             |            |          |
    | DGDBTC | BUY  | PARTIAL |   0.226 |  0.15800000 |  0.15800000 |   0.38400000 | 0.02126100 |             |            |        2 |
    | DGDBTC | SELL | EXPIRED |   0.384 |  0.00000000 |  0.00000000 |              | 0.02196600 |             |            |          |
    | DGDBTC | SELL | EXPIRED |   0.384 |  0.00000000 |  0.00000000 |              | 0.02214300 |             |            |          |
    | DGDBTC | SELL | EXPIRED |   0.384 |  0.00000000 |  0.00000000 |              | 0.02189900 |             |            |          |
    | DGDBTC | BUY  | FILLED  |   0.384 |  0.38400000 |  0.38400000 |   0.76800000 | 0.02082900 |  0.00799834 |            |        3 |
    | DGDBTC | BUY  | FILLED  |   0.768 |  0.76800000 |  0.76800000 |   1.53600000 | 0.01984300 |  0.01523942 |            |        4 |
    | DGDBTC | SELL | EXPIRED |   1.536 |  0.00000000 |  0.00000000 |              | 0.02074400 |             |            |          |
    | DGDBTC | SELL | EXPIRED |   1.536 |  0.00000000 |  0.00000000 |              | 0.02094100 |             |            |          |
    | DGDBTC | SELL | EXPIRED |   1.536 |  0.00000000 |  0.00000000 |              | 0.02076800 |             |            |          |
    | DGDBTC | SELL | PARTIAL |   1.536 |  0.30300000 | -0.30300000 |   1.23300000 | 0.02065000 |             |            |          |
    | DGDBTC | SELL | FILLED  |   1.233 |  1.23300000 | -1.23300000 |   0.00000000 | 0.02070000 |             | 0.02552310 |          |
    +--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
    

I tried the following two approaches.

    merged_df['dcaLevel'] = merged_df[(merged_df['side'] == 'BUY') & (merged_df['status'].isin(['FILLED', 'PARTIAL']))].groupby(['symbol'])['dcaLevel'].apply(lambda x: x.shift(1) + 1)
    

This way raises an error.

    merged_df['dcaLevel'] = merged_df[(merged_df['side'] == 'BUY') & (merged_df['status'].isin(['FILLED', 'PARTIAL']))].groupby(['symbol'])['dcaLevel'].apply(lambda x: 0 if x.shift(1) else x.shift(1) + 1)
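
One likely source of an error in the expression above, shown as a minimal sketch: a conditional expression (`0 if ... else ...`) needs a single truth value, but `x.shift(1)` is a whole Series, and pandas refuses to reduce a Series to one boolean. The Series `s` below is made-up illustration data, not taken from `merged_df`.

```python
import pandas as pd

# Hypothetical stand-in for the group Series passed to the lambda.
s = pd.Series([1.0, 2.0, 3.0])

try:
    # Python asks s.shift(1) for one True/False answer; pandas raises instead.
    value = 0 if s.shift(1) else s.shift(1) + 1
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

An element-wise choice needs something like `numpy.where` or `Series.where` rather than Python's `if`/`else`.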
    

I also tried the following alternative.

    symbol_df = merged_df.loc[merged_df['symbol'] == 'DGDBTC']
    tmp_df = symbol_df[(symbol_df['side'] == 'BUY') & (symbol_df['status'].isin(['FILLED', 'PARTIAL']))]
    tmp_df['dcaLevel'] = np.where(tmp_df['availableQty'] < tmp_df['availableQty'].shift(1), 0, tmp_df['dcaLevel'].shift(1) + 1)
    

It works for some rows but not for others, and the first buy order in a series remains NaN.
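
A minimal sketch of why this probably happens, assuming `dcaLevel` is still entirely NaN when the line runs: `np.where` computes both branches in full before choosing between them, so `tmp_df['dcaLevel'].shift(1) + 1` is NaN on every row and never sees the zeros written by the other branch. The frame `tmp` is a hypothetical miniature holding only the `availableQty` values of the non-expired BUY rows from the table.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of tmp_df: BUY rows with status FILLED or PARTIAL.
tmp = pd.DataFrame({
    "availableQty": [0.125, 0.250, 0.113, 0.226, 0.384, 0.768, 1.536],
    "dcaLevel": np.nan,  # still empty, as in the question
})

# Evaluated once, up front: NaN shifted and incremented is still NaN.
prev_level_plus_one = tmp["dcaLevel"].shift(1) + 1

# True only where availableQty dropped compared with the previous row.
is_reset = tmp["availableQty"] < tmp["availableQty"].shift(1)

result = np.where(is_reset, 0, prev_level_plus_one)
print(result)
```

Only the row where `availableQty` drops receives a 0. Every other row, including the very first one (whose comparison against a missing previous value is False), stays NaN because no running count is carried from row to row.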

I wrote the following code, but I'm sure this can be done more easily with pandas.

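    import numpy as np

    merged_df['dcaLevel'] = np.nan
    # Only FILLED and PARTIAL orders count; EXPIRED rows are skipped.
    grouped = merged_df[merged_df['status'].isin(['FILLED', 'PARTIAL'])].groupby(['symbol'])
    col_idx = merged_df.columns.get_loc('dcaLevel')
    for name, group in grouped:
        first = True
        for index, row in group.iterrows():
            if row['side'] == 'SELL':
                # A sell order ends the current series of buy orders.
                first = True
                dca_level = np.nan
            else:
                if first:
                    # First buy order of a new series.
                    first = False
                    dca_level = 0
                else:
                    dca_level = dca_level + 1
                # index is a row label; passing it to iloc relies on the default RangeIndex.
                merged_df.iloc[index, col_idx] = dca_level
    merged_df[merged_df['symbol'] == 'DGDBTC']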
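
The loop above can probably be expressed with groupby operations. The following is a sketch under stated assumptions, not a definitive solution: every non-expired SELL row bumps a per-symbol series counter (`cumsum`), and the non-expired BUY rows are then numbered within each (symbol, series) pair (`cumcount`). The function name `add_dca_level` and the sample frame are hypothetical; the sample reuses only the `side` and `status` columns of the table, and the row index is assumed to be unique.

```python
import pandas as pd


def add_dca_level(df):
    """Return a copy of df with dcaLevel filled for non-expired BUY rows."""
    out = df.copy()
    active = out["status"].isin(["FILLED", "PARTIAL"])  # EXPIRED rows are ignored
    is_buy = active & (out["side"] == "BUY")
    is_sell = active & (out["side"] == "SELL")

    # Each non-expired SELL starts a new series within its symbol.
    series_id = is_sell.astype(int).groupby(out["symbol"]).cumsum()

    # Number the BUY rows 0, 1, 2, ... inside each (symbol, series) pair.
    buys = out[is_buy]
    levels = buys.groupby([buys["symbol"], series_id[is_buy]]).cumcount()

    # Index alignment leaves NaN on every row that is not a counted BUY.
    out["dcaLevel"] = levels
    return out


# Hypothetical sample: side and status of the 17 DGDBTC rows in the table.
df = pd.DataFrame({
    "symbol": ["DGDBTC"] * 17,
    "side": ["BUY", "BUY", "SELL", "BUY", "BUY", "BUY", "BUY", "SELL", "SELL",
             "SELL", "BUY", "BUY", "SELL", "SELL", "SELL", "SELL", "SELL"],
    "status": ["FILLED", "FILLED", "FILLED", "FILLED", "FILLED", "EXPIRED",
               "PARTIAL", "EXPIRED", "EXPIRED", "EXPIRED", "FILLED", "FILLED",
               "EXPIRED", "EXPIRED", "EXPIRED", "PARTIAL", "FILLED"],
})

result = add_dca_level(df)
print(result)
```

On this sample the `dcaLevel` column matches the example table: 0, 1 before the first sale, then 0, 1, 2, 3, 4 for the second series, with NaN on all SELL and EXPIRED rows.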
    

I hope someone can help me solve this.

0 answers:

No answers