How do I match the first word of one string column against another column and print "Match"?

Time: 2017-07-24 11:51:58

Tags: python pandas

My DataFrame is

import pandas as pd

data = {
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'year': [2000, 2001, 2003, 2004],
    'desc': [' auckland has some good reviews', 'Octagone', 'we shall update you', 'we have varities of shimla mirch'],
}

df = pd.DataFrame(data)

I tried this code

df['CompanyMatch'] = df ['company_name'] == df ['desc']

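I want to print "Match" if the first word of the company_name column appears in the desc column. I am not sure where to put the index [0] so that the output looks like this: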

> company_name         desc                                 CompanyMatch
> auckland suppliers   auckland has some good reviews       Match
> Octagone             Octagone                             Match
> SodaBottel           we shall update you                  NA
> Shimla Mirch         we have varities of shimla mirch     Match

1 Answer:

Answer 0 (score: 5)

You can use numpy.where together with apply, using in to check each value against the other column; axis=1 processes the DataFrame row by row:

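import numpy as np

# True where the full company name occurs (case-insensitively) in desc
m = df.apply(lambda x: x['company_name'].lower() in x['desc'].lower(), axis=1)
# Label the matching rows; the other rows print as nan
df['CompanyMatch'] = np.where(m, 'Match', np.nan)
print(df)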
         company_name                              desc  year CompanyMatch
0  auckland suppliers    auckland has some good reviews  2000          nan
1            Octagone                          Octagone  2001        Match
2          SodaBottel               we shall update you  2003          nan
3        Shimla Mirch  we have varities of shimla mirch  2004        Match
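
A side note on the nan values printed above: they are shown in lowercase, which suggests np.where built a string array and stored the text 'nan' rather than a real missing value, so isna() would probably not detect them. If real missing values are wanted, one option is Series.where. This is a minimal sketch that rebuilds the sample data so it runs on its own; it is an alternative I am suggesting, not part of the original approach:

```python
import pandas as pd

df = pd.DataFrame({
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'desc': [' auckland has some good reviews', 'Octagone',
             'we shall update you', 'we have varities of shimla mirch'],
})

# Same row-wise substring check as above
m = df.apply(lambda x: x['company_name'].lower() in x['desc'].lower(), axis=1)

# Start from a column filled with 'Match' and keep it only where the mask is True;
# Series.where fills the remaining rows with a real missing value
df['CompanyMatch'] = pd.Series('Match', index=df.index).where(m)

print(df['CompanyMatch'].isna().tolist())
```

With this version, df['CompanyMatch'].isna() and dropna() treat the unmatched rows as missing.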

EDIT:

To compare only the first word:

# Take only the first whitespace-separated word of company_name
m = df.apply(lambda x: x['company_name'].split()[0].lower() in x['desc'].lower(), axis=1)
df['CompanyMatch'] = np.where(m, 'Match', np.nan)
print(df)
         company_name                              desc  year CompanyMatch
0  auckland suppliers    auckland has some good reviews  2000        Match
1            Octagone                          Octagone  2001        Match
2          SodaBottel               we shall update you  2003          nan
3        Shimla Mirch  we have varities of shimla mirch  2004        Match
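
Note that in does a substring test, so a first word such as "soda" would also match a description containing "sodabottel". If the first word should match a whole word in desc, a sketch along these lines may help. The helper first_word_in_desc is a hypothetical name I made up for illustration, and words are split on whitespace only, so punctuation attached to a word is not handled:

```python
import pandas as pd

df = pd.DataFrame({
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'desc': [' auckland has some good reviews', 'Octagone',
             'we shall update you', 'we have varities of shimla mirch'],
})

def first_word_in_desc(name, desc):
    """Return True if the first word of name equals some whole word of desc."""
    words = name.split()
    if not words:  # an empty company name has no first word to compare
        return False
    return words[0].lower() in desc.lower().split()

# zip over the two columns instead of apply(axis=1)
mask = pd.Series(
    [first_word_in_desc(n, d) for n, d in zip(df['company_name'], df['desc'])],
    index=df.index,
)

# Keep 'Match' where the mask is True, a missing value elsewhere
df['CompanyMatch'] = pd.Series('Match', index=df.index).where(mask)
print(df)
```

Iterating with zip avoids building a row object for every row, which is usually faster than apply with axis=1 on larger frames.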