Grouping and replacing similar trigrams in Python using fuzzywuzzy or another package

Time: 2018-02-28 15:56:08

Tags: python pandas similarity fuzzywuzzy

I have a set of trigrams, for example:

 ID |        Trigram         | Frequency 
  1 | great customer service |        10 
  2 | customer service great |         8 
  3 | good customer service  |         6 
  4 | have some parking      |         5 
  5 | some more parking      |         2 

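I want to fuzzy-match across all the trigrams and replace each group of similar trigrams with the one that has the highest frequency. For example, the table above should become: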

 ID |        Trigram         | Frequency 
  1 | great customer service |        10 
  2 | great customer service |         8 
  3 | great customer service |         6 
  4 | have some parking      |         5 
  5 | have some parking      |         2 

I am using the fuzzywuzzy package to compute the similarity scores, but I can't figure out how to do the replacement. Thanks in advance.
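One possible way to do the replacement, as a minimal sketch rather than a definitive solution: process the trigrams from most to least frequent, keep a list of "representative" trigrams, and map each new trigram to the most similar representative if its score clears a threshold. The helper names `token_sort_similarity` and `build_replacements` and the threshold of 80 are assumptions made for illustration. To keep the sketch dependency-free, the score is computed with `difflib.SequenceMatcher` on word-sorted strings, as a stand-in for fuzzywuzzy's `fuzz.token_sort_ratio`.

```python
from difflib import SequenceMatcher


def token_sort_similarity(a, b):
    """Word-order-insensitive similarity on a 0-100 scale.

    Standard-library stand-in (an assumption of this sketch) for
    fuzzywuzzy's fuzz.token_sort_ratio.
    """
    a_sorted = " ".join(sorted(a.lower().split()))
    b_sorted = " ".join(sorted(b.lower().split()))
    return round(100 * SequenceMatcher(None, a_sorted, b_sorted).ratio())


def build_replacements(freq, threshold=80):
    """Map every trigram to the most frequent trigram it resembles.

    freq: dict of trigram -> frequency.
    threshold: hypothetical cut-off; tune it for your own data.
    """
    representatives = []  # trigrams that stand for a whole group
    mapping = {}
    # Visit the most frequent trigrams first so they become representatives.
    for trigram in sorted(freq, key=freq.get, reverse=True):
        best = max(
            representatives,
            key=lambda rep: token_sort_similarity(trigram, rep),
            default=None,
        )
        if best is not None and token_sort_similarity(trigram, best) >= threshold:
            mapping[trigram] = best      # join an existing group
        else:
            representatives.append(trigram)
            mapping[trigram] = trigram   # start a new group
    return mapping


# Sample data from the question: (ID, Trigram, Frequency)
rows = [
    (1, "great customer service", 10),
    (2, "customer service great", 8),
    (3, "good customer service", 6),
    (4, "have some parking", 5),
    (5, "some more parking", 2),
]

freq = {trigram: count for _, trigram, count in rows}
mapping = build_replacements(freq)
replaced = [(id_, mapping[trigram], count) for id_, trigram, count in rows]

for row in replaced:
    print(row)
```

With a threshold of 80 the sample data above collapses into the two groups shown in the desired table. If the trigrams live in a pandas DataFrame `df`, the same `mapping` could be applied with `df["Trigram"] = df["Trigram"].map(mapping)`, and `fuzz.token_sort_ratio(a, b)` could be swapped in for the stand-in scorer. Because the grouping is greedy, results depend on the threshold, so it is worth checking borderline pairs by hand.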

0 answers:

No answers yet