Python - Find the average of a column for rows with a matching string in another column

Time: 2018-02-18 11:40:20

Tags: python python-3.x pandas dataframe

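I am trying to count the products in a dataframe whose names contain a word from wordlist, and then find the average price of those products. This is what I tried: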

for word in wordlist:
    total_count += dframe.Product.str.contains(word, case=False).sum()
    total_price += dframe[dframe['Product'].str.contains(word)]['Price']
    print(dframe[dframe['Product'].str.contains(word)]['Price'])
average_price = total_price / total_count

average_price comes back as Series([], Name: Price, dtype: float64) rather than the float I expected.

What am I doing wrong?

Thanks!

2 answers:

Answer 0 (score: 2)

You need to sum Price for each condition so that you add a scalar value rather than a Series:

total_count, total_price = 0, 0
for word in wordlist:
    total_count += dframe.Product.str.contains(word, case=False).sum()
    total_price += dframe.loc[dframe['Product'].str.contains(word, case=False), 'Price'].sum()
average_price = total_price / total_count
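
To illustrate why the .sum() matters, here is a minimal sketch with made-up data (the small dframe below is an assumption, not the asker's data). Selecting rows with a boolean mask returns a Series, and adding a Series to a number yields another Series, not a float:

```python
import pandas as pd

# Hypothetical data, only for illustration.
dframe = pd.DataFrame({
    'Product': ['a1', 'b1', 'b2'],
    'Price': [1.0, 2.0, 4.0],
})

mask = dframe['Product'].str.contains('b', case=False)

# Boolean indexing returns a Series, even when only a few rows match.
selected = dframe.loc[mask, 'Price']

# Adding a Series to a number gives another Series ...
as_series = 0 + selected

# ... whereas .sum() reduces it to a single number first.
as_scalar = 0 + selected.sum()

print(type(as_series).__name__)
print(as_scalar)
```

Dividing a Series by the count later on then also gives a Series, which is what the question observed.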

Or cache the mask in a variable for better readability and performance:

total_count, total_price = 0, 0
for word in wordlist:
    mask = dframe.Product.str.contains(word, case=False)
    total_count += mask.sum()
    total_price += dframe.loc[mask, 'Price'].sum()

average_price = total_price / total_count
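
One thing to keep in mind with a loop like this: a product whose name contains more than one of the words is counted once per word, so it weighs more in the average. A small sketch with made-up data (both the dframe and the wordlist here are assumptions), compared with testing all words in a single pass using a pattern joined with |:

```python
import pandas as pd

# Hypothetical data: 'bc' contains both search words.
dframe = pd.DataFrame({
    'Product': ['bc', 'b', 'x'],
    'Price': [10, 20, 30],
})
wordlist = ['b', 'c']

# Loop version: 'bc' matches 'b' and 'c', so it is counted twice.
loop_count, loop_price = 0, 0
for word in wordlist:
    mask = dframe['Product'].str.contains(word, case=False)
    loop_count += mask.sum()
    loop_price += dframe.loc[mask, 'Price'].sum()
loop_average = loop_price / loop_count

# Single-pattern version: each row is counted at most once.
mask = dframe['Product'].str.contains('|'.join(wordlist), case=False)
single_pass_average = dframe.loc[mask, 'Price'].mean()

print(loop_average)         # (10 + 20 + 10) / 3
print(single_pass_average)  # (10 + 20) / 2
```

Which of the two numbers is the right one depends on whether multiple matches should count multiple times.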

The solution can be simplified with a regular expression of the form word1|word2|word3, where | means or:

mask = dframe.Product.str.contains('|'.join(wordlist), case=False)
total_count = mask.sum()
total_price = dframe.loc[mask, 'Price'].sum()

average_price = total_price / total_count

Or, more directly, take the mean of the matching prices:

mask = dframe.Product.str.contains('|'.join(wordlist), case=False)
average_price = dframe.loc[mask, 'Price'].mean()

Sample:

import pandas as pd

dframe = pd.DataFrame({
    'Product': ['a1','a2','a3','c1','c1','b','b2','c3','d2'],
    'Price': [1,3,5,6,3,2,3,5,2]
})
print (dframe)
   Price Product
0      1      a1
1      3      a2
2      5      a3
3      6      c1
4      3      c1
5      2       b
6      3      b2
7      5      c3
8      2      d2

wordlist = ['b','c']
mask = dframe.Product.str.contains('|'.join(wordlist), case=False)
average_price = dframe.loc[mask, 'Price'].mean()
print (average_price)
3.8
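
If the words can contain regex metacharacters such as . or +, joining them directly would have those characters interpreted as regex syntax. A possible sketch, assuming the words should be matched literally, escapes each word with re.escape first. The data below is made up, and na=False is added so that missing product names count as non-matches:

```python
import re

import pandas as pd

# Hypothetical data, including a missing product name.
dframe = pd.DataFrame({
    'Product': ['c++ book', 'c book', 'a.b cable', 'axb cable', None],
    'Price': [10, 20, 30, 40, 50],
})
wordlist = ['c++', 'a.b']

# Escape each word so that '.' and '+' are matched literally.
pattern = '|'.join(re.escape(word) for word in wordlist)

# na=False treats missing names as "no match" instead of NaN.
mask = dframe['Product'].str.contains(pattern, case=False, na=False)
average_price = dframe.loc[mask, 'Price'].mean()
print(average_price)
```

Here only 'c++ book' and 'a.b cable' match. Without escaping, a.b would also match 'axb cable'.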

Answer 1 (score: 1)

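You can use the values attribute to work with the underlying array instead of a Series, and sum it to get a scalar.

total_count += dframe.Product.str.contains(word, case=False).values.sum()

total_price += dframe[dframe['Product'].str.contains(word, case=False)]['Price'].values.sum()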
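
As a small illustration with made-up data (the dframe below is an assumption): extracting the underlying NumPy array still leaves several values, so a sum is needed to end up with one number. The sketch uses to_numpy(), which returns the same data as the values attribute:

```python
import numpy as np
import pandas as pd

# Hypothetical data, only for illustration.
dframe = pd.DataFrame({'Product': ['b1', 'b2', 'c1'], 'Price': [2, 3, 6]})

mask = dframe['Product'].str.contains('b', case=False)
prices = dframe.loc[mask, 'Price']

# A NumPy array instead of a Series; same data as prices.values.
as_array = prices.to_numpy()

# Still an array of several prices until it is summed.
total = as_array.sum()

print(type(as_array).__name__)
print(total)
```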