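I am trying to apply a function to a DataFrame column to evaluate and classify the row values. I defined the function with a branch for each case and applied it to the column, but I get two errors.
I tried defining the function outside the loop with three parameters instead of one, and defining it inside the loop with a single parameter, but both versions fail with the same errors.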
for i in list(df['segment'].unique()):
    temp = df.query('segment== "%s"' %i)
    for t in list(temp['area_tipe'].unique()):
        temp2 = temp.query('area_tipe== "%s"' %t)
        a = temp2.quantile(q=0.33)
        b = temp2.quantile(q=0.66)
        def classifierprice(x):
            if float(x) < a:
                rep = 'low'
            elif float(x) > a:
                if float(x) < b:
                    rep = 'medium'
                elif float(x) > b:
                    rep = 'high'
            return rep
        temp2['price_class'] = temp2['price'].map(lambda x: classifierprice(x), axis=1)
TypeError: map() got an unexpected keyword argument 'axis'
I get the same error when I use apply instead of map. If I remove the axis argument, both apply and map give me the following error:
for i in list(df['segment'].unique()):
    temp = df.query('segment== "%s"' %i)
    for t in list(temp['area_tipe'].unique()):
        temp2 = temp.query('area_tipe== "%s"' %t)
        a = temp2.quantile(q=0.33)
        b = temp2.quantile(q=0.66)
        def classifierprice(x):
            if float(x) < a:
                rep = 'low'
            elif float(x) > a:
                if float(x) < b:
                    rep = 'medium'
                elif float(x) > b:
                    rep = 'high'
            return rep
        temp2['price_class'] = temp2['price'].map(lambda x: classifierprice(x))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
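Does anyone know how to fix this?
I use the same mapping approach for another classification that does not involve splitting the DataFrame, and it works fine, as shown below: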
def grow(x):
    if x > 0:
        a = 'growing'
    elif x < 0:
        a = 'declining'
    else:
        a = 'constant'
    return a

insights["text"] = (insights["score"].map(grow))
Answer 0 (score: 1)
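You need to extract the actual value from what the .quantile() method returns. Called on a DataFrame, it gives back a Series object (here holding one value), not a plain number. pandas does not unwrap it for you, so inside the if statement you end up comparing a single value against a Series, which raises the error. Using .values[0] pulls out the scalar.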
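To see where the ValueError comes from, here is a minimal sketch on a made-up frame (the `toy` frame and its `rooms` column are illustrative assumptions, not part of the question's data). It shows that `.quantile()` on a whole DataFrame returns a Series, while the same call on a single column returns a plain number, and that an `if` on a Series comparison fails.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
toy = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0], "rooms": [1, 2, 2, 3]})

q_frame = toy.quantile(q=0.33)         # Series: one quantile per numeric column
q_col = toy["price"].quantile(q=0.33)  # scalar: quantile of a single column

print(type(q_frame))
print(type(q_col))

try:
    # The comparison yields a boolean Series; `if` cannot reduce it to one bool
    if 2.0 < q_frame:
        pass
except ValueError as err:
    message = str(err)
    print(message)
```

Selecting the column first (`toy["price"].quantile(...)`) avoids the problem without any unwrapping step.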
import pandas as pd
import numpy as np

### making some sample data
df = pd.DataFrame({"area_tipe": np.random.choice(["m", "n", "o"], 100),
                   "price": np.random.randint(1, 10, 100),
                   "segment": np.random.choice(["p", "q", "r"], 100)})

### keeping the function out of the for loop
def classifierprice(x, a, b):
    x = float(x)
    if x <= a:
        rep = 'low'
    elif a < x < b:
        rep = 'medium'
    elif x >= b:
        rep = 'high'
    return rep

for i in list(df['segment'].unique()):
    temp = df.query('segment== "%s"' %i)
    for t in list(temp['area_tipe'].unique()):
        # .copy() so that adding a column does not raise SettingWithCopyWarning
        temp2 = temp.query('area_tipe== "%s"' %t).copy()
        # quantile of the price column only; .values[0] turns the
        # one-element Series into a scalar
        a = temp2[['price']].quantile(q=0.33).values[0]
        b = temp2[['price']].quantile(q=0.66).values[0]
        temp2['price_class'] = temp2['price'].apply(lambda x: classifierprice(x,a,b))
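On the first error from the question: mapping over a single column works value by value, so there is no axis to pick. `axis=1` belongs to `DataFrame.apply`, where the function receives a whole row. The sketch below contrasts the two on a made-up frame (`toy` and `label` are hypothetical names used only for this illustration).

```python
import pandas as pd

# Hypothetical toy data, only for illustration
toy = pd.DataFrame({"price": [1, 5, 9], "score": [-1, 0, 2]})

def label(x):
    return "high" if x > 4 else "low"

# Series.map: the function is called once per value, no axis argument
by_map = toy["price"].map(label)

# DataFrame.apply with axis=1: the function is called once per row,
# and each row arrives as a Series indexed by column name
by_rows = toy.apply(lambda row: label(row["price"]), axis=1)

print(by_map.tolist())
print(by_rows.tolist())
```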
You can also do this without loops and get the whole output in one go. Try this as a starting point:
def grouped_classifierprice(df_filt):
    # thresholds computed on the price column of this group only
    a = df_filt[['price']].quantile(q=0.33).values[0]
    b = df_filt[['price']].quantile(q=0.66).values[0]
    return df_filt.price.apply(lambda x: classifierprice(x,a,b))

outdf = df.groupby(["area_tipe","segment"]).apply(grouped_classifierprice)
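If you want the labels as a column on the original frame, a fully vectorized variant is also possible. This is a sketch of a different technique from the one above (`groupby(...).transform` plus `numpy.select` instead of a per-value function), assuming the same sample columns. The seed and the names `low_cut` and `high_cut` are my own choices.

```python
import numpy as np
import pandas as pd

# Same shape of sample data as above, with a fixed seed for repeatability
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "area_tipe": rng.choice(["m", "n", "o"], 100),
    "price": rng.integers(1, 10, 100),
    "segment": rng.choice(["p", "q", "r"], 100),
})

# transform broadcasts each group's quantile back to that group's rows
grouped_price = df.groupby(["area_tipe", "segment"])["price"]
low_cut = grouped_price.transform(lambda s: s.quantile(0.33))
high_cut = grouped_price.transform(lambda s: s.quantile(0.66))

# Conditions are checked in order, so "low" wins when both thresholds coincide
df["price_class"] = np.select(
    [df["price"] <= low_cut, df["price"] >= high_cut],
    ["low", "high"],
    default="medium",
)

print(df.head())
```

The boundaries follow the same rules as classifierprice: values at or below the 0.33 quantile are 'low', values at or above the 0.66 quantile are 'high', and everything in between is 'medium'.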