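I am trying to loop over my data and pull out the rows belonging to each unique value of a column. First I listed the unique values of the target column, then I tried to define a function that produces the data for a specific unique value. However, it does not seem to work. Could you help me solve this?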

    import pandas as pd

    # df2 is an existing DataFrame with a 'Sum of Qtd' column
    uniq_list = df2['Sum of Qtd'].unique().tolist()

    def unique_data(uniq):
        unique_list=[]
        for uniq in uniq_list:
            if df2[df2['Sum of Qtd'] == uniq]:  # this line raises the ValueError
                res_df2 = []
                res_df2 = pd.DataFrame(columns = df2.columns)
                res_df2.append(uniq)
            unique_list.append(res_df2)

    for uniq in uniq_list:
        print(unique_data(uniq))

But I get the following error:
> ValueError                                Traceback (most recent call last)
> <ipython-input-31-0455c2449f78> in <module>()
>       1 for uniq in uniq_list:
> ----> 2     print(unique_data(uniq))
>
> <ipython-input-29-75d579a4768f> in unique_data(uniq)
>       2     unique_list=[]
>       3     for uniq in uniq_list:
> ----> 4         if df2[df2['Sum of Qtd'] == uniq]:
>       5             res_df2 = []
>       6             res_df2 = pd.DataFrame(columns = df2.columns)
>
> ~\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
>    1571         raise ValueError("The truth value of a {0} is ambiguous. "
>    1572                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
> -> 1573                          .format(self.__class__.__name__))
>    1574
>    1575     __bool__ = __nonzero__
>
> ValueError: The truth value of a DataFrame is ambiguous. Use a.empty,
> a.bool(), a.item(), a.any() or a.all().
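
For reference, the error comes from handing a whole DataFrame to `if`, which needs a single True/False. The sketch below reproduces it on a small made-up frame (the values are hypothetical, since the real `df2` is not shown) and then checks `.empty`, one of the alternatives the message itself suggests. It is only an illustration of the error, not a fix for the whole function.

```python
import pandas as pd

# Hypothetical stand-in for df2; the real data is not shown in the question.
df2 = pd.DataFrame({
    "Sum of Qtd": [1, 2, 2, 3],
    "Sales": [10.0, 20.0, 25.0, 30.0],
})

# Boolean indexing returns a DataFrame (possibly empty), not a bool.
subset = df2[df2["Sum of Qtd"] == 2]

try:
    if subset:  # same pattern as the failing line in unique_data
        pass
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)

# An explicit emptiness check gives `if` the single bool it needs.
has_rows = not subset.empty
n_rows = len(subset)
```

Here `has_rows` answers the question "does any row have this value?", which appears to be what the original `if` was meant to test.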
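After making some improvements and getting rid of the error, the revised code looks like this, but I still cannot create subsets of the data based on the unique values of the `Sum of Qtd` column. Could you give me some hints on how to do this?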
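
    import numpy as np
    import pandas as pd

    def unique_data(uniq):
        unique_list=[]
        for i in uniq_list:
            res_df2 = []
            # empty frame with the same columns as df2
            res_df2 = pd.DataFrame(columns = df2.columns)
            # True only when no row has a 'Sum of Qtd' different from uniq
            if df2.loc[df2['Sum of Qtd'] != uniq].empty:
                res_df2.append(df2)
                # flag Sales outliers with the 1.5 * IQR rule
                Q1 = df2.Sales.quantile(0.25)
                Q3 = df2.Sales.quantile(0.75)
                IQR = Q3 - Q1
                mask = (df2.Sales < (Q1 - 1.5 * IQR)) | (df2.Sales > (Q3 + 1.5 * IQR))
                # overwrite the flagged rows of df2 with NaN
                df2[mask] = np.nan
            unique_list.append(res_df2)
        print(unique_list)
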
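
One possible direction, sketched under several assumptions: the goal is one sub-DataFrame per unique `Sum of Qtd` value, and the IQR rule is meant to be applied within each subset rather than to all of `df2`. The sample frame, the helper name `mask_outliers`, and the `subsets` dictionary are made up for illustration. Unlike the code above, this sketch sets only the `Sales` cell of an outlier row to NaN and leaves `df2` itself unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2; the real data is not shown in the question.
df2 = pd.DataFrame({
    "Sum of Qtd": [1, 1, 1, 1, 1, 2, 2, 2],
    "Sales": [10.0, 11.0, 12.0, 13.0, 100.0, 5.0, 6.0, 7.0],
})

def mask_outliers(group, column="Sales"):
    """Return a copy of `group` with 1.5 * IQR outliers in `column` set to NaN."""
    q1 = group[column].quantile(0.25)
    q3 = group[column].quantile(0.75)
    iqr = q3 - q1
    outlier = (group[column] < q1 - 1.5 * iqr) | (group[column] > q3 + 1.5 * iqr)
    result = group.copy()  # work on a copy so df2 is not modified
    result.loc[outlier, column] = np.nan
    return result

# groupby yields (value, sub-DataFrame) pairs, one per unique 'Sum of Qtd'.
subsets = {
    value: mask_outliers(group)
    for value, group in df2.groupby("Sum of Qtd")
}

for value, subset in subsets.items():
    print(value)
    print(subset)
```

Using `groupby` avoids building the list of unique values and filtering by hand; a plain `df2[df2['Sum of Qtd'] == value]` inside a loop over `uniq_list` would give the same subsets.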