大熊猫的求和:将两个DataFrame与变量合并

时间:2019-12-05 15:09:25

标签: python pandas dataframe merge cumsum

我正在尝试创建从库存数据框到订单数据框的库存分配,当库存数据框很简单时我就可以做到,但是当库存数据框每个SKU具有多个类别(类型)时,分配给我做得不好。 SKU相同,但是现在它们有两种类型,因此,一个订单可以用两种类型的SKU完成。 从带有类型的分配数据框中可以看到,当存在两种类型的股票时,订单行会重复

问题

import pandas as pd
import numpy as np

# initilize list
orders= {'order':[20001, 20002, 20003, 20004, 20005], 'SKU' :['A1', 'A2', 'A3','A4', 'A4'], 
           'Request':[20, 21, 19, 18, 20]} 
stock = {'SKU': ['A1', 'A2','A3', 'A4'], 
         'Available' : [10, 20, 5, 28],
         'Type' : ['Stock', 'Stock', 'Stock', 'Stock']} 
stock2 = {'SKU': ['A1', 'A2', 'A2', 'A3','A3', 'A4'], 
         'Type' : ['Stock', 'Stock', 'To make', 'Stock', 'To make', 'Stock'],
         'Available' : [10, 20, 5, 30, 5, 8]} 
# Create DataFrames
df_ped = pd.DataFrame(orders) 
df_st = pd.DataFrame(stock)
df_st2 = pd.DataFrame(stock2)

简单的库存分配

#Merge
df_ped_st = pd.merge(df_ped, df_st, how='left', left_on='SKU', right_on='SKU')

#assignment
df_ped_st['difference'] = df_ped_st['Available'] - df_ped_st.groupby('SKU')['Request'].cumsum()
df_ped_st['balance'] = df_ped_st['difference'].clip(lower=0).shift(0).fillna(0)
df_ped_st['assig'] = np.where((df_ped_st['Request'] + df_ped_st['difference']) > 
                                   df_ped_st['Request'], df_ped_st['Request'], 
                                   (df_ped_st['Request'] + df_ped_st['difference']).clip(lower=0))
#Print
df_ped_st

结果确定

    order   SKU Request     Available   Type    difference  balance assig
0   20001   A1  20          10          Stock   -10         0       10
1   20002   A2  21          20          Stock   -1          0       20
2   20003   A3  19           5          Stock   -14         0       5
3   20004   A4  18          28          Stock   10          10      18
4   20005   A4  20          28          Stock   -10         0       10

类型的库存分配

#merge
df_ped_st2 = pd.merge(df_ped, df_st2, how='left', left_on='SKU', right_on='SKU')

#assignment
df_ped_st2['difference'] = df_ped_st2['Available'] - df_ped_st2.groupby('SKU')['Request'].cumsum()
df_ped_st2['`balance'] = df_ped_st2['difference'].clip(lower=0).shift(0).fillna(0)
df_ped_st2['assig'] = np.where((df_ped_st2['Request'] + df_ped_st2['difference']) > 
                                   df_ped_st2['Request'], df_ped_st2['Request'], 
                                   (df_ped_st2['Request'] + df_ped_st2['difference']).clip(lower=0))
#Print
df_ped_st2

结果错误

Pedido      SKU Request   Type       Available   difference balance assign
0   20001   A1  20        Stock      10         -10         0       10
1   20002   A2  21        Stock      20         -1          0       20
2   20002   A2  21        To make     5         -37         0       0
3   20003   A3  19        Stock      30          11         11      19
4   20003   A3  19        To make     5         -33         0       0
5   20004   A4  18        Stock       8         -10         0       8
6   20005   A4  20        Stock       8         -30         0       0

预期结果

Pedido      SKU Request   Type       Available   difference balance assign
0   20001   A1  20        Stock      10         -10         0       10
1   20002   A2  21        Stock      20         -1          0       20
2   20002   A2  21        To make     5          4          4       1
3   20003   A3  19        Stock      30          11         11      19
4   20003   A3  19        To make     5          5          5       0
5   20004   A4  18        Stock       8         -10         0       8
6   20005   A4  20        Stock       8         -20         0       0

0 个答案:

没有答案