Filtering and iterating over a Pandas DataFrame based on dictionary values

Time: 2021-05-07 10:33:36

Tags: python-3.x pandas dataframe data-cleaning

I am trying to loop over groups of categories and build a filter condition on a Pandas DataFrame for each group. The code below works fine:

    categories_lst = [
        ["BEER/ALE/ALCOHOLIC CIDER"],
        ["CIGARETTES", "CIGARS", "ELECTRONIC SMOKING DEVICES"],
        ["COLD CEREAL"],
        ["YOGURT"],
    ]
    threshold_lst = [1, 0.25, 0.25, 0.25]
    i = 0
    for lst in categories_lst:
        # filtering category
        df_report = df_us_brand_report[df_us_brand_report["category"].isin(lst)]
        df_report = df_report[abs(df_report["change"]) >= threshold_lst[i]]
        print(lst)
        print(threshold_lst[i])
        i += 1
        # some other operations

However, I would like to tidy it up. I tried the code below, but it fails because lists are not hashable.

    category_dict = {
        ["BEER/ALE/ALCOHOLIC CIDER"]: 1,
        ["CIGARETTES", "CIGARS", "ELECTRONIC SMOKING DEVICES"]: 0.25,
        ["COLD CEREAL"]: 0.25,
        ["YOGURT"]: 0.25,
    }
    for condition, value in category_dict:
        filter_condition = (df_us_brand_report["category"].isin(condition)) & (
            abs(df_us_brand_report["change"]) >= value
        )

        # some other operations

Any help would be appreciated.

Sample data: (image not included)

2 answers:

Answer 0: (score: 1)

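Use tuples as the dictionary keys, and iterate with .items() so that each step yields both the key and the value: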

    category_dict = {
        ("BEER/ALE/ALCOHOLIC CIDER",): 1,
        ("CIGARETTES", "CIGARS", "ELECTRONIC SMOKING DEVICES"): 0.25,
        ("COLD CEREAL",): 0.25,
        ("YOGURT",): 0.25,
    }

    for condition, value in category_dict.items():
        filter_condition = (df_us_brand_report["category"].isin(list(condition))) & (
            abs(df_us_brand_report["change"]) >= value
        )
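
To try the tuple-keyed approach end to end, here is a minimal sketch. The real df_us_brand_report is not shown in the question, so the rows below are made-up sample data, and the `reports` dictionary is a hypothetical name used only to collect the filtered frames.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_us_brand_report = pd.DataFrame(
    {
        "category": [
            "BEER/ALE/ALCOHOLIC CIDER",
            "CIGARS",
            "CIGARETTES",
            "COLD CEREAL",
            "YOGURT",
            "YOGURT",
        ],
        "change": [1.5, -0.3, 0.1, 0.25, -0.1, -0.4],
    }
)

# Tuples are hashable, so each group of categories can be a dictionary key.
category_dict = {
    ("BEER/ALE/ALCOHOLIC CIDER",): 1,
    ("CIGARETTES", "CIGARS", "ELECTRONIC SMOKING DEVICES"): 0.25,
    ("COLD CEREAL",): 0.25,
    ("YOGURT",): 0.25,
}

# `reports` is an illustrative container, not part of the original code.
reports = {}
for categories, threshold in category_dict.items():
    # Keep rows in this category group whose absolute change meets the threshold.
    mask = df_us_brand_report["category"].isin(list(categories)) & (
        df_us_brand_report["change"].abs() >= threshold
    )
    reports[categories] = df_us_brand_report[mask]

for categories, df_report in reports.items():
    print(categories, len(df_report))
```

Combining both conditions into one boolean mask avoids the intermediate DataFrame that the original two-step filter creates, and the dictionary keeps each threshold next to its categories, so the manual index counter is no longer needed.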

Answer 1: (score: 0)

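Dictionary keys must be hashable, which in practice means immutable objects such as tuples, strings, or ints. A list is mutable and therefore cannot be used as a key.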
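
The hashability rule can be checked directly. The short sketch below uses only built-in Python; the variable names are illustrative.

```python
# A list is mutable and defines no hash, so using it as a key raises TypeError.
try:
    bad = {["YOGURT"]: 0.25}
except TypeError as exc:
    error_message = str(exc)
    print("list key failed:", error_message)

# A tuple of strings is hashable and works as a key.
good = {("YOGURT",): 0.25}
print(good[("YOGURT",)])

# A tuple is only hashable if everything inside it is hashable too.
try:
    hash((["YOGURT"],))
    nested_ok = True
except TypeError:
    nested_ok = False
print("tuple containing a list is hashable:", nested_ok)
```

Note that a one-element tuple needs the trailing comma: ("YOGURT",) is a tuple, whereas ("YOGURT") is just the string "YOGURT".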