Checking whether multiple items exist in an invoice/bill in Python / Pandas

Time: 2019-06-02 19:36:45

Tags: python pandas

Below is a sample of my bill data.


+---------+----------+-----+--------+-------+-----+------+-----------+
| Bill No | totalamt | Loc | Item # | price | qty | type | ProdTotal |
+---------+----------+-----+--------+-------+-----+------+-----------+
|       1 |    10300 | S01 |    260 |  1500 |   3 | M    |      4500 |
|       1 |    10300 | S01 |    261 |  1500 |   2 | M    |      3000 |
|       1 |    10300 | S01 |     96 |   700 |   4 | 1    |      2800 |
|       2 |      540 | S02 |    260 |   140 |   1 | M    |       140 |
|       2 |      540 | S02 |    999 |    10 |   1 | 1    |        10 |
|       2 |      540 | S02 |    111 |   190 |   2 | M    |       380 |
|       2 |      540 | S02 |    888 |    10 |   1 | 1    |        10 |
|       3 |      150 | S02 |    222 |   140 |   1 | 1    |       140 |
|       3 |      150 | S02 |    999 |    10 |   1 | 1    |        10 |
|       4 |     4000 | S01 |   1054 |  1200 |   1 | M    |      1200 |
|       4 |     4000 | S01 |     96 |   700 |   1 | 1    |       700 |
|       4 |     4000 | S01 |     96 |   700 |   3 | 1    |      2100 |
|       5 |     3300 | S01 |    640 |  1200 |   1 | 1    |      1200 |
|       5 |     3300 | S01 |     96 |   700 |   3 | 1    |      2100 |
+---------+----------+-----+--------+-------+-----+------+-----------+

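I need to check whether type "M" (which stands for main item) appears in the type column of any row belonging to a given bill. If it does, an additional Description column should contain with M; if it does not, Description should be No M.

I also need to count the occurrences of M and compute the total qty of M.

My desired result is as follows: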


Column header details

  1. Description: if any row of a particular Bill No has M in the type column, this should contain with M; otherwise No M
  2. # of M: the number of occurrences of M within a single Bill No
  3. Total M: the sum of all qty values for the M rows

1 answer:

Answer 0: (score: 1)

This should work.

import pandas as pd

# df is the bill DataFrame from the question

# names of the new columns we are going to create
newcols = ['Description', '# of M', 'Total M']

# function which builds the new columns for one bill (one group)
def addcols(x):
    is_m = x['type'] == 'M'          # True on the rows whose type is M
    nm = is_m.sum()                  # counts the M
    summ = x.loc[is_m, 'qty'].sum()  # sums the 'qty' of the M
    lab = 'with M' if nm > 0 else 'No M'
    # repeat the same values on every row of the bill and keep the original
    # row labels, so the result lines up with df even if df is not sorted
    return pd.DataFrame([[lab, nm, summ]] * len(x), columns=newcols, index=x.index)

descdf = df.groupby('Bill No', group_keys=False).apply(addcols)
finaldf = pd.concat([df, descdf], axis=1)

finaldf is:

    Bill No  totalamt  Loc  Item #  price  qty type  ProdTotal Description  # of M  Total M
0         1     10300  S01     260   1500    3    M       4500      with M       2        5
1         1     10300  S01     261   1500    2    M       3000      with M       2        5
2         1     10300  S01      96    700    4    1       2800      with M       2        5
3         2       540  S02     260    140    1    M        140      with M       2        3
4         2       540  S02     999     10    1    1         10      with M       2        3
5         2       540  S02     111    190    2    M        380      with M       2        3
6         2       540  S02     888     10    1    1         10      with M       2        3
7         3       150  S02     222    140    1    1        140        No M       0        0
8         3       150  S02     999     10    1    1         10        No M       0        0
9         4      4000  S01    1054   1200    1    M       1200      with M       1        1
10        4      4000  S01      96    700    1    1        700      with M       1        1
11        4      4000  S01      96    700    3    1       2100      with M       1        1
12        5      3300  S01     640   1200    1    1       1200        No M       0        0
13        5      3300  S01      96    700    3    1       2100        No M       0        0
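
As a possible alternative, the same three columns can probably be built without a custom function by using groupby(...).transform, which broadcasts each per-bill aggregate back onto that bill's rows. The snippet below is only a sketch: it rebuilds a reduced copy of the question's table (Bill No, qty and type only) and assumes the type column holds strings such as 'M' and '1'. The helper names is_m and bill are made up for illustration.

```python
import pandas as pd

# Reduced copy of the question's sample data (assumption: 'type' is stored as strings).
df = pd.DataFrame({
    'Bill No': [1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5],
    'qty':     [3, 2, 4, 1, 1, 2, 1, 1, 1, 1, 1, 3, 1, 3],
    'type':    ['M', 'M', '1', 'M', '1', 'M', '1', '1', '1', 'M', '1', '1', '1', '1'],
})

is_m = df['type'].eq('M')   # True on the rows whose type is M
bill = df['Bill No']        # grouping key

# Number of M rows per bill, repeated on every row of that bill.
df['# of M'] = is_m.astype(int).groupby(bill).transform('sum')

# Total qty of the M rows per bill (non-M rows contribute 0).
df['Total M'] = df['qty'].where(is_m, 0).groupby(bill).transform('sum')

# Label each bill according to whether it has at least one M row.
df['Description'] = df['# of M'].gt(0).map({True: 'with M', False: 'No M'})

print(df)
```

Because transform returns a result aligned to the original index, no reset_index or concat step is needed, and the rows do not have to be sorted by Bill No.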