Fill groups of rows with a unique reference in pandas

Time: 2019-10-29 14:03:59

Tags: pandas pandas-groupby

My invoice data looks like this:

+----------------+-----------+-------------+-----+-------+
|       ID       |   Date    | Description | QTY | Price |
+----------------+-----------+-------------+-----+-------+
| 1XpP1          | 08-Feb-19 | A           |   1 |     8 |
| Total [INV001] |           |             |   8 |     8 |
| 1XpQ1          | 08-Feb-19 | A           |   1 |    10 |
| 1XpQ1          | 08-Feb-19 | B           |   1 |    10 |
| Total [INV002] |           |             |   2 |    20 |
| 1XpP1          | 08-Feb-19 | A           |   1 |    12 |
| 1XpP1          | 08-Feb-19 | B           |   1 |    12 |
| 1XpP1          | 08-Feb-19 | C           |   1 |    12 |
| 1XpP1          | 08-Feb-19 | D           |   1 |    12 |
| Total [INV003] |           |             |   4 |    48 |
+----------------+-----------+-------------+-----+-------+

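Note the Total row under each invoice, which contains the invoice number. I want to remove these rows entirely and instead put each invoice's Total reference in a separate column. My desired output is as follows.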

+-------+-----------+-------------+-----+-------+----------------+
|  ID   |   Date    | Description | QTY | Price |  ID Adjusted   |
+-------+-----------+-------------+-----+-------+----------------+
| 1XpP1 | 08-Feb-19 | A           |   1 |     8 | Total [INV001] |
| 1XpQ1 | 08-Feb-19 | A           |   1 |    10 | Total [INV002] |
| 1XpQ1 | 08-Feb-19 | B           |   1 |    10 | Total [INV002] |
| 1XpP1 | 08-Feb-19 | A           |   1 |    12 | Total [INV003] |
| 1XpP1 | 08-Feb-19 | B           |   1 |    12 | Total [INV003] |
| 1XpP1 | 08-Feb-19 | C           |   1 |    12 | Total [INV003] |
| 1XpP1 | 08-Feb-19 | D           |   1 |    12 | Total [INV003] |
+-------+-----------+-------------+-----+-------+----------------+

Please give me a starting point for solving this. I don't know how to group this data, because the number of rows per invoice varies.

1 Answer:

Answer 0: (score: 1)

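The idea is to combine Series.where with Series.str.endswith so that rows which do not match the mask become missing values, then back fill those values. After that, filter with the inverted mask using boolean indexing. Finally, you can add DataFrame.copy to avoid SettingWithCopyWarning (if you want to process the data later):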

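# Mask marking the "Total [INV...]" rows, whose ID ends with "]"
m = df['ID'].str.endswith(']')
# Keep the ID only on Total rows, then back fill it onto the item rows above
df['ID Adjusted'] = df['ID'].where(m).bfill()
# Drop the Total rows; copy() avoids SettingWithCopyWarning on later edits
df = df[~m].copy()
print(df)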
      ID       Date Description  QTY  Price     ID Adjusted
0  1XpP1  08-Feb-19           A    1      8  Total [INV001]
2  1XpQ1  08-Feb-19           A    1     10  Total [INV002]
3  1XpQ1  08-Feb-19           B    1     10  Total [INV002]
5  1XpP1  08-Feb-19           A    1     12  Total [INV003]
6  1XpP1  08-Feb-19           B    1     12  Total [INV003]
7  1XpP1  08-Feb-19           C    1     12  Total [INV003]
8  1XpP1  08-Feb-19           D    1     12  Total [INV003]
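
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same where/bfill approach. The DataFrame is re-typed by hand from the table in the question, with blank cells assumed to be empty strings. The helper name `attach_invoice_reference` and the extra `Invoice` column are hypothetical additions for illustration, and the mask uses `str.startswith("Total [")` rather than `endswith(']')`, which is only a variation on the answer's mask.

```python
import pandas as pd

# Sample data re-typed from the question's table (blank cells as empty strings).
df = pd.DataFrame({
    "ID": ["1XpP1", "Total [INV001]", "1XpQ1", "1XpQ1", "Total [INV002]",
           "1XpP1", "1XpP1", "1XpP1", "1XpP1", "Total [INV003]"],
    "Date": ["08-Feb-19", "", "08-Feb-19", "08-Feb-19", "",
             "08-Feb-19", "08-Feb-19", "08-Feb-19", "08-Feb-19", ""],
    "Description": ["A", "", "A", "B", "", "A", "B", "C", "D", ""],
    "QTY": [1, 8, 1, 1, 2, 1, 1, 1, 1, 4],
    "Price": [8, 8, 10, 10, 20, 12, 12, 12, 12, 48],
})


def attach_invoice_reference(frame):
    """Hypothetical helper: move each Total row's ID onto the item rows above it."""
    # True on the "Total [INV...]" rows, False on the item rows.
    is_total = frame["ID"].str.startswith("Total [")
    out = frame.copy()
    # Keep the ID only on Total rows (NaN elsewhere), then back fill upwards.
    out["ID Adjusted"] = out["ID"].where(is_total).bfill()
    # Drop the Total rows themselves; copy() so later assignments are safe.
    out = out[~is_total].copy()
    # Optional extra (an assumption, not in the answer): the bare invoice number.
    out["Invoice"] = out["ID Adjusted"].str.extract(r"\[(.+?)\]", expand=False)
    return out


result = attach_invoice_reference(df)
print(result)

# With the reference in its own column, ordinary grouping works per invoice.
totals = result.groupby("Invoice")["Price"].sum()
print(totals)
```

Because every item row now carries its invoice reference, the number of rows per invoice no longer matters: grouping on `ID Adjusted` (or on the assumed `Invoice` column) handles invoices of any length.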