Question

My DataFrame df has the row labels Colour, Clothes and Week, and two columns, diff and diff_two.

MultiIndex(levels=[['Green', 'Yellow', 'Red', 'Blue', 'Black'], ['tshirt', 'jeans', 'pants', 'dress'], ['2017_46']],
           names=['Colour', 'Clothes', 'Week'],
           sortorder=0)

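What is the simplest way to iterate over the rows and compare diff and diff_two, both of which contain strings?

My idea was to loop over the DataFrame: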

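for colour in colour_list:
    for clothes in clothes_list:
        # Test the (colour, clothes) pair itself, not each level separately
        if (colour, clothes) in df.index:
            rows = df.loc[(colour, clothes)]  # all weeks for this pair
            # == compares; a single = is assignment and a syntax error here
            if (rows['diff'] == rows['diff_two']).all():
                pass  # do something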

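This is wrong because the if condition is always true: it checks each level separately instead of treating the index as a tuple, i.e. (colour, clothes).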
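A minimal sketch of why the per-level check always passes, using a small made-up index (the two rows below are illustrative, not the real data). Each value exists somewhere in its own level, so the per-level test succeeds even when the combination never occurs. Looking up the tuple in the index is what tests the pair.

```python
import pandas as pd

# Hypothetical two-row index with the same level names as in the question
index = pd.MultiIndex.from_tuples(
    [("Green", "jeans", "2017_46"), ("Yellow", "tshirt", "2017_46")],
    names=["Colour", "Clothes", "Week"],
)

# Per-level check: "Green" and "tshirt" each appear in their own level,
# so this is True although no row is (Green, tshirt)
per_level = "Green" in index.levels[0] and "tshirt" in index.levels[1]

# Pair check: drop Week and look up the (Colour, Clothes) tuple itself
pairs = index.droplevel("Week")
pair_present = ("Green", "tshirt") in pairs
```

Here per_level is truthy while pair_present is False, which matches the behaviour described above.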

What is the best way to compare two columns of a DataFrame with a MultiIndex?

Thanks!

Question updated with an example:

Colour    Clothes    Week    diff     diff1
Green     Jeans      50      Mango    Zara
Yellow    Shirt      50      Zara     Zara   
Blue      Shirt      50      Prada    nan
Green     Jeans      50      Zara     Zara
Green     Jeans      50      nan      nan

Updated with the desired output:

Colour    Clothes    Week    diff     diff1    output
Green     Jeans      50      Mango    Zara     Mango --> Zara
Yellow    Shirt      50      Zara     Zara     No difference
Blue      Shirt      50      Prada    nan      Prada --> nan    
Green     Jeans      50      Zara     Zara     No difference
Green     Jeans      50      nan      nan      nan --> nan
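One possible way to get this output without a loop, as a sketch only: it assumes the columns are named diff and diff1 as in the example table, that missing values are real NaN rather than the string "nan", and that Colour, Clothes and Week form the index. The comparison is row-wise, so the MultiIndex does not need to be iterated at all.

```python
import numpy as np
import pandas as pd

# Example data rebuilt from the table above (assumed column names)
df = pd.DataFrame(
    {
        "Colour": ["Green", "Yellow", "Blue", "Green", "Green"],
        "Clothes": ["Jeans", "Shirt", "Shirt", "Jeans", "Jeans"],
        "Week": [50, 50, 50, 50, 50],
        "diff": ["Mango", "Zara", "Prada", "Zara", np.nan],
        "diff1": ["Zara", "Zara", np.nan, "Zara", np.nan],
    }
).set_index(["Colour", "Clothes", "Week"])

# Row-wise equality; NaN never equals NaN, so the all-NaN row is "different"
same = df["diff"] == df["diff1"]

# Build the "a --> b" text, writing missing values as the literal "nan"
arrow = df["diff"].fillna("nan") + " --> " + df["diff1"].fillna("nan")

# Pick the label per row
df["output"] = np.where(same, "No difference", arrow)
print(df)
```

Because NaN == NaN evaluates to False, the last row comes out as "nan --> nan" rather than "No difference", which is what the desired output shows.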

Comparing columns based on a MultiIndex in Pandas

0 answers: