Pandas - Looping over columns for allocation optimization

Date: 2018-04-12 19:16:19

Tags: python pandas

I am writing a program that allocates volume to providers based on their rank. The data looks like this:

Origin  Dest    Provider    Vol_A   Vol_B   Vol_C   Capacity    rank
NYC     AMS     A           90      1300    2500    4000        1
NYC     AMS     B           150     600     1700    3000        2
NYC     BRI     A           105     700     100     2300        1
NYC     BRI     C           300     1300    200     2800        2

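The goal is to assign all of the volume to the rank-1 provider until its Capacity is reached. In the first example, NYC to AMS, provider A would be assigned 240 units of Vol_A, because 90 + 150 = 240. The desired output looks like this: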

Origin  Dest    Provider    Vol_A   Vol_B   Vol_C   Capacity    rank    assg_a  assg_b  assg_c
NYC     AMS     A           90      1300    2500    4000        1       240     1900    1980
NYC     AMS     B           150     600     1700    3000        2       0       0       2220
NYC     BRI     A           105     700     100     2300        1       405     1895    0
NYC     BRI     C           300     1300    200     2800        2       0       105     300

In the NYC-AMS example, provider A cannot take all of Vol_C, so some of it spills over to provider B.
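To make the rule concrete, here is a minimal sketch of one possible reading of it, not a definitive implementation. It assumes that each provider's Capacity is shared across Vol_A, Vol_B and Vol_C, consumed in that column order, which is what the NYC-BRI rows of the desired output suggest. The names `allocate_lane`, `VOL_COLS` and `OUT_COLS` are hypothetical and used only for illustration.

```python
import pandas as pd

# Sample data copied from the question
inputs = pd.DataFrame({
    "Origin":   ["NYC", "NYC", "NYC", "NYC"],
    "Dest":     ["AMS", "AMS", "BRI", "BRI"],
    "Provider": ["A", "B", "A", "C"],
    "Vol_A":    [90, 150, 105, 300],
    "Vol_B":    [1300, 600, 700, 1300],
    "Vol_C":    [2500, 1700, 100, 200],
    "Capacity": [4000, 3000, 2300, 2800],
    "rank":     [1, 2, 1, 2],
})

VOL_COLS = ["Vol_A", "Vol_B", "Vol_C"]     # hypothetical helper names
OUT_COLS = ["assg_a", "assg_b", "assg_c"]


def allocate_lane(lane):
    """Fill providers in rank order.

    Assumption: a provider's Capacity is shared across all volume columns.
    """
    lane = lane.sort_values("rank").copy()
    free = lane["Capacity"].tolist()       # capacity still free per provider
    for col, out in zip(VOL_COLS, OUT_COLS):
        remaining = lane[col].sum()        # lane total still to be placed
        assigned = []
        for i in range(len(free)):
            take = min(free[i], remaining)
            free[i] -= take
            remaining -= take
            assigned.append(take)
        lane[out] = assigned               # one value per provider row
    return lane


result = pd.concat(
    [allocate_lane(lane) for _, lane in inputs.groupby(["Origin", "Dest"], sort=False)]
)
print(result)
```

Under this assumption the NYC-BRI rows come out as in the desired output (405, 1895, 0 for A and 0, 105, 300 for C). For NYC-AMS the sketch gives 1860 and 2340 for Vol_C rather than 1980 and 2220, because only 1860 units of A's capacity are left after Vol_A and Vol_B. Any volume that exceeds the total capacity of a lane is simply left unassigned.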

My code is below:

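import pandas as pd

def assign():
    # `inputs` is a DataFrame defined outside this function
    vol_sim = pd.DataFrame(columns=['Origin','Dest','Provider','rank','Vol_E','Vol_C','Vol_U','Capacity','assg_e','assg_c','assg_u'])
    for key,lane in inputs.groupby(['Origin','Dest']):
        for col,out in zip(['Vol_E','Vol_C','Vol_U'],['assg_e','assg_c','assg_u']):
            to_assg = lane[col].sum()   # total volume to allocate on this lane
            assg = 0                    # volume allocated so far
            remain = to_assg            # volume still unallocated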
            for idx,row in lane.iterrows():
                if assg >= to_assg:
                    row[out] = 0
                    row_temp = pd.DataFrame(row[['Origin','Dest','Provider','rank',col,'Capacity',out]])
                    row_temp2 = row_temp.T
                    vol_sim = vol_sim.append(row_temp2)
                else:
                    if row['Capacity'] <= to_assg:
                        row[out] = row['Capacity']
                        assg = assg + row[out]
                        remain = remain - assg
                        row_temp = pd.DataFrame(row[['Origin','Dest','Provider','rank',col,'Capacity',out]])
                        row_temp2 = row_temp.T
                        vol_sim = vol_sim.append(row_temp2)                            
                    else:
                        row[out] = remain
                        assg = assg + row[out]
                        remain = remain - assg
                        row_temp = pd.DataFrame(row[['Origin','Dest','Provider','rank',col,'Capacity',out]])
                        row_temp2 = row_temp.T
                        vol_sim = vol_sim.append(row_temp2)
    return vol_sim

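The result I get (shown below) seems to contain a duplicate set of rows for each iteration over the columns. I would like my result to be in the same format as my desired output, without the NaNs. I think I could do this by grouping the data afterwards, but I would rather do it inside the function itself. Any ideas?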

Origin  Dest    Provider    Vol_A   Vol_B   Vol_C   Capacity    rank    assg_a  assg_b  assg_c
NYC     AMS     A           90      1300    2500    4000        1       120     nan     nan
NYC     AMS     B           150     600     1700    3000        2       0       nan     nan
NYC     AMS     A           90      1300    2500    4000        1       nan     1900    nan
NYC     AMS     B           150     600     1700    3000        2       nan     0       nan
NYC     AMS     A           90      1300    2500    4000        1       nan     nan     1980
NYC     AMS     B           150     600     1700    3000        2       nan     nan     2220
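For reference, the "group the data afterwards" option mentioned above could look roughly like the sketch below. It relies on `GroupBy.first()`, which returns the first non-null value of each column within a group, so the three partial rows per provider collapse into one. The frame `dup` is rebuilt by hand from the output above, with the volume, Capacity and rank columns left out for brevity; the names `dup` and `collapsed` are hypothetical.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Long-format result copied from the output above (assignment columns only)
dup = pd.DataFrame({
    "Origin":   ["NYC"] * 6,
    "Dest":     ["AMS"] * 6,
    "Provider": ["A", "B", "A", "B", "A", "B"],
    "assg_a":   [120, 0, nan, nan, nan, nan],
    "assg_b":   [nan, nan, 1900, 0, nan, nan],
    "assg_c":   [nan, nan, nan, nan, 1980, 2220],
})

# One row per (Origin, Dest, Provider); first() skips the NaN placeholders
collapsed = (
    dup.groupby(["Origin", "Dest", "Provider"], sort=False)
       .first()
       .reset_index()
)
print(collapsed)
```

This only tidies the output after the fact. To avoid the duplicates inside the function, the likely change is to write each `out` column onto the lane's existing rows instead of appending a new row for every column, but that is an assumption about the intended design rather than a confirmed fix.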

0 answers:

No answers yet