Question

This is my DataFrame 'df':

match           name                   group  
adamant         Adamant Home Network   86   
adamant         ADAMANT, Ltd.          86   
adamant bild    TOV Adamant-Bild       86   
360works        360WORKS               94   
360works        360works.com           94

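For each group number, I want to compare the names one by one and check whether they match the word in the 'match' column of the same row.

So the desired output would be counts:

If they match, we count it as 'TP', and if not, we count it as 'FN'.

I want to know the number of matching words for each group number, but this does not give me what I want: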

df.groupby('group').count()

Does anyone know how to do this?

Answer 1

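For each group, this function compares 'name' and 'match' row by row using exact equality (here 'temp' is the DataFrame from the question):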

def apply_func(df):
    # True where the name is exactly equal to the match string
    x = df['name'] == df['match']
    # Label each row: 'TP' for a match, 'FN' otherwise
    return x.map({False:'FN', True:'TP'})

In [683]: temp.join(temp.groupby('group').apply(apply_func).reset_index(), rsuffix='_1', how='left')
Out[683]: 
           match                  name  group  group_1  level_1   0
0        adamant  Adamant Home Network     86       86        0  FN
1        adamant         ADAMANT, Ltd.     86       86        1  FN
2  adamant bild       TOV Adamant-Bild     86       86        2  FN
3       360works              360WORKS     94       94        3  FN
4       360works          360works.com     94       94        4  FN
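
The output above labels each row but does not yet produce the per-group counts the question asks for, and exact equality marks every sample row as 'FN'. The following is a minimal sketch of one way to get the counts, not the answerer's method. The helpers label_rows and count_labels are hypothetical names, and the relaxed mode assumes that a "match" means the match string occurs inside the name, ignoring case, which the question does not state.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'match': ['adamant', 'adamant', 'adamant bild', '360works', '360works'],
    'name': ['Adamant Home Network', 'ADAMANT, Ltd.', 'TOV Adamant-Bild',
             '360WORKS', '360works.com'],
    'group': [86, 86, 86, 94, 94],
})

def label_rows(frame, relaxed=False):
    """Return one 'TP'/'FN' label per row (hypothetical helper)."""
    if relaxed:
        # Assumption: the match string only has to occur inside the name,
        # compared case-insensitively and as plain text (no regex).
        hit = pd.Series(
            [m.lower() in n.lower()
             for m, n in zip(frame['match'], frame['name'])],
            index=frame.index,
        )
    else:
        # Exact equality, as in the answer above.
        hit = frame['name'] == frame['match']
    return hit.map({True: 'TP', False: 'FN'})

def count_labels(frame, relaxed=False):
    """Count 'TP' and 'FN' labels per group (hypothetical helper)."""
    labels = label_rows(frame, relaxed=relaxed)
    # crosstab tabulates how often each label occurs within each group.
    counts = pd.crosstab(frame['group'], labels)
    # Make sure both columns exist even if a label never occurs.
    return counts.reindex(columns=['TP', 'FN'], fill_value=0)

exact_counts = count_labels(df)
relaxed_counts = count_labels(df, relaxed=True)
print(exact_counts)
print(relaxed_counts)
```

On the sample data, exact comparison gives 3 'FN' for group 86 and 2 'FN' for group 94. The relaxed comparison gives 2 'TP' and 1 'FN' for group 86 ('adamant bild' does not occur in 'TOV Adamant-Bild' because of the hyphen) and 2 'TP' for group 94. Whether punctuation such as hyphens should be normalized before comparing depends on what counts as a match in your data.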
