Need help applying this function to a Pandas DataFrame column

Time: 2018-11-29 02:59:04

Tags: python pandas dataframe anaconda

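I'm trying to work out figures for the product styles in my allinv_styles DataFrame by looking up a SKU's parent ASIN and counting the rows that meet certain conditions, but I don't know what I'm doing and would really appreciate your help.

I get the error message: "ValueError: can only convert an array of size 1 to a Python scalar."

I have two DataFrames, adgroups_df and allinv_styles.

adgroups_df has a column named "Ad Group" that contains the product SKUs.

A SKU is specific to a product's style and size, for example black, small. A parent ASIN can have many SKUs and styles. I'm trying to write a function that calculates the out-of-stock (OOS) percentage for the style that an ad group represents.

My thought process is:

  • Find the parent of the ad group
  • Identify the ad group's style
  • Look up the parent for that row
  • Count the rows in that parent with that style
  • Count how many of those rows have stock < 0
  • Calculate the OOS %
  • Return the percentage
  • Create a new column by applying the function to the Ad Group column

Here is my spaghetti code: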

def calc_style_OOS(adgroups):
    for sku in adgroups:
        # find parent asin of ad group sku
        parentasin = allinv_styles.loc[(allinv_styles['sku'] == sku)]['(Parent) ASIN'].item()

        # I tried to print here to debug...
        print(parentasin)

        # find style of sku
        style = allinv_styles.loc[(allinv_styles['sku'] == sku)]['style'].item()

        # how many variations does this style have?
        total_variations = len(allinv_styles.loc[(allinv_styles['(Parent) ASIN'] == parentasin) &
                  (allinv_styles['style'] == style)])

        # how many of these rows have 0 stock?
        oos_variations = len(allinv_styles.loc[(allinv_styles['(Parent) ASIN'] == parentasin) &
                  (allinv_styles['style'] == style) &
                  (allinv_styles['afn-fulfillable-quantity'] < 0)])

        # calculate oos %

        if total_variations == 0:
            return 0
        else: 
            oos = oos_variations/total_variations
            return oos

adgroups_df['OOS %'] = adgroups_df['Ad Group'].apply(calc_style_OOS)

Full error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-37-7ba9d94d5581> in <module>()
----> 1 adgroups_df['OOS %'] = adgroups_df['Ad Group'].apply(calc_style_OOS)

~\Anaconda3\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   2549             else:
   2550                 values = self.asobject
-> 2551                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   2552 
   2553         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

<ipython-input-36-ac54497ca2ef> in calc_style_OOS(adgroups)
     14     for sku in adgroups:
     15         # find parent asin of ad group sku
---> 16         parentasin = allinv_styles.loc[(allinv_styles['sku'] == sku)]['(Parent) ASIN'].item()
     17         # I tried to print here to debug...
     18         print(parentasin)

~\Anaconda3\lib\site-packages\pandas\core\base.py in item(self)
    717         """
    718         try:
--> 719             return self.values.item()
    720         except IndexError:
    721             # copy numpy's message here because Py26 raises an IndexError

ValueError: can only convert an array of size 1 to a Python scalar

2 answers:

Answer 0 (score: 1)

If I understand the problem correctly, change this:

def calc_style_OOS(adgroups):
    for sku in adgroups:

to this:

def calc_style_OOS(sku):

Series.apply already calls the function once per element, so you don't need the loop inside calc_style_OOS.
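To make the cause of the error concrete: because apply hands the function one SKU string at a time, `for sku in adgroups` ends up iterating over the characters of that string. A single character matches no row, and `.item()` on an empty selection raises the ValueError from the question. A minimal sketch, assuming a made-up two-row inventory table (the SKU and ASIN values and the helper name `lookup_parent` are hypothetical):

```python
import pandas as pd

# Hypothetical inventory table; only the column names come from the question.
allinv_styles = pd.DataFrame({
    "sku": ["ABC-BLK-S", "ABC-BLK-M"],
    "(Parent) ASIN": ["B000PARENT", "B000PARENT"],
})

def lookup_parent(sku):
    # Select the rows for this SKU; .item() requires exactly one match.
    matches = allinv_styles.loc[allinv_styles["sku"] == sku, "(Parent) ASIN"]
    return matches.item()

# Passing the whole SKU string finds exactly one row.
parent = lookup_parent("ABC-BLK-S")

# Looping over the string yields single characters such as "A",
# which match no rows, so .item() raises ValueError.
try:
    for ch in "ABC-BLK-S":
        lookup_parent(ch)
    loop_error = None
except ValueError as exc:
    loop_error = str(exc)
```

Note that `.item()` would raise the same error for a SKU that is missing from allinv_styles or that appears there more than once, so removing the loop may not be the only fix your data needs.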

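If you want to use allinv_styles inside calc_style_OOS, pass it as an argument through apply (calc_style_OOS then needs a second parameter to receive it):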

adgroups_df['OOS %'] = adgroups_df['Ad Group'].apply(calc_style_OOS, args=(allinv_styles,))
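As a rough illustration of how `args` works with Series.apply, here is a small sketch. The function `find_style`, its `inventory` parameter, and the sample data are hypothetical and only stand in for calc_style_OOS:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
allinv_styles = pd.DataFrame({
    "sku": ["ABC-BLK-S", "ABC-BLK-M"],
    "style": ["Black", "Black"],
})
adgroups_df = pd.DataFrame({"Ad Group": ["ABC-BLK-S", "ABC-BLK-M"]})

def find_style(sku, inventory):
    # `sku` is one element of the Series; `inventory` receives the extra
    # positional argument supplied through args=.
    return inventory.loc[inventory["sku"] == sku, "style"].iloc[0]

adgroups_df["style"] = adgroups_df["Ad Group"].apply(find_style, args=(allinv_styles,))
```

Passing the DataFrame explicitly keeps the function independent of a global variable, which makes it easier to test on a small sample.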

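However, I think you should create four temporary columns for (Parent) ASIN, style, total_variations, and oos_variations instead of computing each of them inside a custom apply function.

Example (untested)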

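# Map (Parent) ASIN (the SKUs are in the 'Ad Group' column of adgroups_df)
adgroups_df['(Parent) ASIN'] = adgroups_df['Ad Group'].map(dict(zip(allinv_styles['sku'], allinv_styles['(Parent) ASIN'])))

# Map style (use ['style']: the attribute allinv_styles.style is the pandas Styler, not the column)
adgroups_df['style'] = adgroups_df['Ad Group'].map(dict(zip(allinv_styles['sku'], allinv_styles['style'])))

# Get variation counts
group_cols = ['(Parent) ASIN', 'style']
total_variations = allinv_styles.groupby(group_cols).size()
# Threshold copied from the question; use <= 0 if zero stock should count as out of stock
out_of_stock = allinv_styles['afn-fulfillable-quantity'] < 0
oos_variations = allinv_styles[out_of_stock].groupby(group_cols).size()

# Calculate %, map back to adgroups_df
# Groups with no out-of-stock rows are absent from oos_variations and come out as NaN, so fill them with 0
oos_percents = (oos_variations / total_variations).fillna(0).rename('OOS %')
adgroups_df = adgroups_df.join(oos_percents, on=group_cols)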
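For something you can run end to end, here is a variant of the same idea as a sketch. It swaps the two size() counts for the mean of a boolean flag per group, and uses merge instead of map and join. The sample rows are made up, and treating a quantity of 0 or less as out of stock is my assumption:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
allinv_styles = pd.DataFrame({
    "sku": ["A-BLK-S", "A-BLK-M", "A-RED-S", "A-RED-M"],
    "(Parent) ASIN": ["P1", "P1", "P1", "P1"],
    "style": ["Black", "Black", "Red", "Red"],
    "afn-fulfillable-quantity": [0, 5, 3, 2],
})
adgroups_df = pd.DataFrame({"Ad Group": ["A-BLK-S", "A-RED-M", "MISSING"]})

group_cols = ["(Parent) ASIN", "style"]

# Flag out-of-stock rows (assumption: quantity <= 0); the mean of a boolean
# column within a group is the share of flagged rows in that group.
oos_share = (
    allinv_styles.assign(is_oos=allinv_styles["afn-fulfillable-quantity"] <= 0)
    .groupby(group_cols)["is_oos"]
    .mean()
    .rename("OOS %")
    .reset_index()
)

# Attach parent ASIN and style to each ad group, then the per-style share.
sku_info = allinv_styles[["sku"] + group_cols].drop_duplicates("sku")
result = (
    adgroups_df
    .merge(sku_info, how="left", left_on="Ad Group", right_on="sku")
    .merge(oos_share, how="left", on=group_cols)
    .drop(columns="sku")
)
```

Ad groups whose SKU is not in allinv_styles end up with NaN in "OOS %" rather than raising an error, so you can decide afterwards how to treat them.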

Answer 1 (score: 0)

def calc_style_OOS(adgroup):

    # edge case: ad group not in df
    if len(allinv_styles[allinv_styles['sku'].isin([adgroup])]) == 0:
        return 'No data'

    else:

        # find parent asin of ad group sku
        parentasin = allinv_styles[['sku','(Parent) ASIN']].drop_duplicates().set_index('sku')['(Parent) ASIN'][adgroup]
        #print(parentasin)

        # find style of sku
        style = allinv_styles[['sku', 'style']].drop_duplicates().set_index('sku')['style'][adgroup]

        # how many variations does this style have?
        total_variations = len(allinv_styles.loc[(allinv_styles['(Parent) ASIN'] == parentasin) &
                                                 (allinv_styles['style'] == style)])

        # how many of these rows have 0 stock?
        oos_variations = len(allinv_styles.loc[(allinv_styles['(Parent) ASIN'] == parentasin) &
                                               (allinv_styles['style'] == style) &
                                               (allinv_styles['afn-fulfillable-quantity'] < 1)])

        # calculate oos %
        if total_variations == 0:
            return 0
        else:
            return oos_variations/total_variations
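One thing to be aware of with the 'No data' return value: mixing a string with floats gives the resulting column the object dtype, which gets in the way of numeric operations later. A small sketch of the difference, using made-up ratio values, with NaN as one possible alternative marker (my suggestion, not part of the code above):

```python
import pandas as pd

# Hypothetical results of applying the function to three ad groups.
ratios_with_text = pd.Series([0.5, "No data", 0.0])
ratios_with_nan = pd.Series([0.5, float("nan"), 0.0])

# A string among the floats forces the generic object dtype,
# while NaN keeps the column numeric.
text_dtype = ratios_with_text.dtype
nan_dtype = ratios_with_nan.dtype

# Numeric reductions skip NaN by default.
mean_oos = ratios_with_nan.mean()
```

If you prefer to keep the text marker for display, you could convert it only when presenting the results.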