Pandas: new column from a (possibly) recursive function

Time: 2018-07-03 23:52:01

Tags: python pandas

I have some data like this:

import pandas as pd

labels = ["sku", "buildfrom", "factor", "quantity"]
records = [("pipe5", "pipe10", 2, 1),
           ("pipe10", "pipe20", 2, 4),
           ("pipe20", "pipe20", 1, 3)]

df = pd.DataFrame.from_records(records, columns=labels)

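In this data, pipe5 is a product with an available quantity of 1, but pipe5 can also be made from pipe10, as the buildfrom column shows. A value of 2 in the factor column means that one unit of the buildfrom item yields 2 units of the sku item. I want to create a column named "can_make_qty" and fill it with the total quantity of each sku that we could have available. In this case, the "can_make_qty" values would be: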

  • For pipe20: 3 (because buildfrom == sku and the factor is 1)
  • For pipe10: 10 (4 from its own stock + 2 * 3 from pipe20)
  • For pipe5: 21 (1 from its own stock + 2 * 10 from pipe10)
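
The expected values above follow a simple recurrence: the total for a sku is its own quantity plus factor times the total of its buildfrom item, stopping when buildfrom equals sku. Here is a minimal sketch that checks that arithmetic with plain dictionaries. The names `stock`, `source`, and `can_make` are made up for illustration and are not part of the original data or code.

```python
# Hypothetical lookup tables copied by hand from the three example rows
stock = {"pipe5": 1, "pipe10": 4, "pipe20": 3}
source = {"pipe5": ("pipe10", 2), "pipe10": ("pipe20", 2), "pipe20": ("pipe20", 1)}

def can_make(sku):
    """Own stock plus whatever can be converted from the buildfrom item."""
    buildfrom, factor = source[sku]
    if buildfrom == sku:
        # The item is not built from anything else: only its own stock counts
        return stock[sku]
    return stock[sku] + factor * can_make(buildfrom)

expected = {sku: can_make(sku) for sku in stock}
print(expected)
```

This only restates the example by hand; it assumes every buildfrom value also appears as a sku and that the chain eventually reaches an item built from itself.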

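I can see that some kind of recursive logic applies here, but I can't figure out how to code it as a function so that the result ends up in "can_make_qty".

Any help would be greatly appreciated.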

1 Answer:

Answer 0: (score: 0)

Well, this is as far as I got with a recursive solution; it works for the case above:

def canmake(row):
    # Start with the stock of the item itself
    final = [row['quantity']]

    def recsearch(item, factor):
        # buildfrom differs from the item, so look up the source item.
        # It has to be searched for in the *whole* DataFrame (the global df).
        source = df[df['sku'] == item].iloc[0]

        # Convert the source's stock into units of row['sku'] using the
        # factor accumulated along the chain so far
        final.append(source['quantity'] * factor)

        # If the source has the same name for its sku and buildfrom, we don't
        # need to go any further; otherwise continue with the combined factor
        if source['buildfrom'] != source['sku']:
            recsearch(source['buildfrom'], factor * source['factor'])

    if row['buildfrom'] != row['sku']:
        recsearch(row['buildfrom'], row['factor'])

    # Sum the contributions so the column holds one number per row
    return sum(final)

df['can_make_qty'] = df.apply(canmake, axis=1)
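
A possible variation, not part of the answer above: index the frame by sku and cache each result, so that every item is resolved only once. This is a sketch that assumes each sku appears exactly once and that every buildfrom value is also a sku. The names `by_sku`, `cache`, and `total_for` are hypothetical.

```python
import pandas as pd

labels = ["sku", "buildfrom", "factor", "quantity"]
records = [("pipe5", "pipe10", 2, 1),
           ("pipe10", "pipe20", 2, 4),
           ("pipe20", "pipe20", 1, 3)]
df = pd.DataFrame.from_records(records, columns=labels)

# Assumption: sku values are unique, so .loc[sku] returns a single row
by_sku = df.set_index("sku")
cache = {}  # hypothetical memo: sku -> total quantity that can be made

def total_for(sku):
    if sku not in cache:
        row = by_sku.loc[sku]
        total = row["quantity"]
        if row["buildfrom"] != sku:
            # Add the source item's total, converted by this row's factor
            total += row["factor"] * total_for(row["buildfrom"])
        cache[sku] = int(total)
    return cache[sku]

df["can_make_qty"] = df["sku"].map(total_for)
print(df)
```

Note that a circular chain (A built from B, B built from A) would make this recurse until Python raises a RecursionError, so real data may need a check for that.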