How to find common values in a specific column of a table and display the intersected output

Time: 2015-10-20 07:07:37

Tags: python python-2.7 pandas

Roll  Class  Country  Rights  CountryAcc
1     x      IND      23      US
1     x1     IND      32      Ind
2     s      US       12      US
3     q      IRL      33      CA
4     a      PAK      12      PAK
4     e      PAK      12      IND
5     f      US       21      CA
5     g      US       31      PAK
6     h      US       21      BAN

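I want to display only those rows whose Roll has no row with a CountryAcc of US or CA. For example, if Roll 1 has one row with CountryAcc US, then I don't want its other row with CountryAcc Ind either; the same goes for Roll 5, since it has one row with CountryAcc CA. So my final output would be: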

Roll  Class  Country  Rights  CountryAcc
4     a      PAK      12      PAK
4     e      PAK      12      IND
6     h      US       21      BAN

I tried to get the output in the following way:

# First I saved the two countries in a variable
Home_Country = ['US', 'CA']

# Then I made two variables: one with the rows whose CountryAcc is not US or CA, plus their Rolls
Account_Other_Count = df.loc[~df.CountryAcc.isin(Home_Country)]
Account_Other_Count_Var = df.loc[~df.CountryAcc.isin(Home_Country)][['Roll']].values.ravel()

# ... and one with the rows whose CountryAcc is US or CA, plus their Rolls
Account_Home_Count = df.loc[df.CountryAcc.isin(Home_Country)]
Account_Home_Count_Var = df.loc[df.CountryAcc.isin(Home_Country)][['Roll']].values.ravel()

# Here I got the Rolls common to both, and dropped them from the non-US/CA rows
Common_ROLL = list(set(Account_Home_Count_Var).intersection(list(Account_Other_Count_Var)))
Final_Output = Account_Other_Count.loc[~Account_Other_Count.Roll.isin(Common_ROLL)]

Is there a better, more pandas-like or Pythonic way to do this?

2 Answers:

Answer 0 (score: 0)

One solution could be:

In [37]: df.ix[~df['Roll'].isin(df.ix[df['CountryAcc'].isin(['US', 'CA']), 'Roll'])]
Out[37]:
   Roll Class Country  Rights CountryAcc
4     4     a     PAK      12        PAK
5     4     e     PAK      12        IND
8     6     h      US      21        BAN
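
On recent pandas releases the `.ix` indexer is no longer available (it was deprecated in 0.20 and removed in 1.0), so the same idea can be sketched with `.loc`. The snippet below is a minimal sketch that rebuilds the question's table by hand; the names `home`, `rolls_with_home`, and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample table from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Roll": [1, 1, 2, 3, 4, 4, 5, 5, 6],
    "Class": ["x", "x1", "s", "q", "a", "e", "f", "g", "h"],
    "Country": ["IND", "IND", "US", "IRL", "PAK", "PAK", "US", "US", "US"],
    "Rights": [23, 32, 12, 33, 12, 12, 21, 31, 21],
    "CountryAcc": ["US", "Ind", "US", "CA", "PAK", "IND", "CA", "PAK", "BAN"],
})

home = ["US", "CA"]  # assumed "home" countries, as in the question

# Rolls that have at least one row with CountryAcc in the home list.
rolls_with_home = df.loc[df["CountryAcc"].isin(home), "Roll"]

# Keep only rows whose Roll never appears among those Rolls.
result = df.loc[~df["Roll"].isin(rolls_with_home)]
print(result)
```

Here the inner `.loc` call selects the Rolls to exclude and the outer one drops every row belonging to them, so no explicit set intersection is needed.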

Answer 1 (score: 0)

Here is one way:

sortdata = df[~df['CountryAcc'].isin(['US', 'CA'])].sort(axis=0)
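
`DataFrame.sort` has since been removed from pandas, with `sort_index` and `sort_values` taking its place. A row-by-row filter like the one above also keeps the non-US/CA rows of Rolls 1 and 5, while the question's expected output drops those Rolls entirely. The following is a hedged sketch of both the row-level filter and a group-level variant, assuming the question's sample table and using `groupby(...).transform("any")`; the variable names are illustrative only.

```python
import pandas as pd

# Sample table from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Roll": [1, 1, 2, 3, 4, 4, 5, 5, 6],
    "Class": ["x", "x1", "s", "q", "a", "e", "f", "g", "h"],
    "Country": ["IND", "IND", "US", "IRL", "PAK", "PAK", "US", "US", "US"],
    "Rights": [23, 32, 12, 33, 12, 12, 21, 31, 21],
    "CountryAcc": ["US", "Ind", "US", "CA", "PAK", "IND", "CA", "PAK", "BAN"],
})

# True for rows whose CountryAcc is US or CA.
is_home = df["CountryAcc"].isin(["US", "CA"])

# Row-level filter: drops only the US/CA rows themselves.
row_level = df[~is_home].sort_index()

# Group-level filter: flag every row of a Roll that has any US/CA row,
# then drop all flagged rows.
roll_has_home = is_home.groupby(df["Roll"]).transform("any")
group_level = df[~roll_has_home].sort_index()

print(row_level)
print(group_level)
```

With the sample data, `row_level` still contains rows for Rolls 1 and 5, while `group_level` contains only Rolls 4 and 6.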