How can I capture a large(r) number of synonyms and antonyms with Python's NLTK WordNet?

Asked: 2014-09-12 16:40:44

Tags: python nltk wordnet

I have a word (for example 'fight') and I want to find a list of all its antonyms and synonyms.

I am using this thesaurus entry as the reference for comparison:

http://www.thesaurus.com/browse/fight

Currently I have the following code:

from itertools import chain

from nltk.corpus import wordnet as wn  # needs the corpus: nltk.download('wordnet')

def flatten(l):
    # Flatten one level of nesting.
    return list(chain(*l))

def get_variants(word):
    # Collect the lemmas of every synset of `word` and of its directly related synsets.
    synonyms = []
    for syn in wn.synsets(word):
        # In NLTK 3, lemmas() is a method rather than an attribute.
        synonyms.append([syn.lemmas()])
        synonyms.append([s.lemmas() for s in syn.similar_tos()])
        synonyms.append([s.lemmas() for s in syn.hypernyms()])
        synonyms.append([s.lemmas() for s in syn.hyponyms()])

    return set(flatten(flatten(synonyms)))

def get_synonyms(word):
    # name() is likewise a method in NLTK 3.
    return sorted([v.name() for v in get_variants(word)])

def get_antonyms(word):
    # Antonymy is defined between lemmas, not between synsets.
    antonyms = flatten([v.antonyms() for v in get_variants(word)])
    return sorted([a.name() for a in antonyms])

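# Reference synonyms from thesaurus.com. Note that .split() breaks multi-word
# entries such as "battle royal" and "sparring match" into separate words.
fight_synonyms = """action altercation argument battle bout brawl clash combat conflict confrontation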
contest controversy disagreement dispute duel exchange feud match melee quarrel riot rivalry round
scuffle skirmish struggle war wrangling affray broil brush contention difficulty dissension dogfight
engagement fisticuffs fracas fray free-for-all fuss hostility joust row ruckus rumble scrap scrimmage
set-to strife tiff to-do tussle battle royal sparring match""".split()

# Reference synonyms that get_synonyms() does not return.
print(sorted(set(fight_synonyms) - set(get_synonyms('fight'))))

fight_antonyms = """accord agreement calm concord harmony peace quiet truce
friendliness friendship kindness""".split()

# Reference antonyms that get_antonyms() does not return.
print(sorted(set(fight_antonyms) - set(get_antonyms('fight'))))

Missed synonyms:
['altercation', 'bout', 'broil', 'confrontation', 'contest', 'difficulty',
'disagreement', 'dispute', 'dissension', 'exchange', 'fracas', 'fuss',
'hostility', 'match', 'melee', 'quarrel', 'riot', 'rivalry', 'round',
'row', 'royal', 'ruckus', 'scrimmage', 'sparring', 'strife', 'tiff',
'to-do', 'wrangling']

Missed antonyms:
['accord', 'agreement', 'calm', 'concord', 'friendliness', 'friendship', 'harmony', 'kindness', 'peace', 'quiet', 'truce']

As you can see, I am missing many important synonyms and antonyms of 'fight'.
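A small part of the synonym gap may come from the comparison itself rather than from WordNet. The reference list is split on whitespace, so "battle royal" and "sparring match" turn into four separate words (hence 'royal' and 'sparring' in the missed list), while WordNet lemma names join multi-word expressions with underscores. Below is a minimal sketch of a comparison that keeps multi-word entries whole. The helper names `normalize`, `parse_reference` and `missed` are made up for illustration, and the sample data is hypothetical rather than real WordNet output.

```python
def normalize(name):
    """Lower-case a term and turn WordNet-style underscores into spaces."""
    return name.replace("_", " ").strip().lower()


def parse_reference(text):
    """Parse one thesaurus entry per line, keeping multi-word entries whole."""
    return {normalize(line) for line in text.splitlines() if line.strip()}


def missed(reference, found):
    """Return the reference terms that do not appear among the found terms."""
    found_normalized = {normalize(name) for name in found}
    return sorted(reference - found_normalized)


# Hypothetical sample data: a few reference entries, one per line, and a few
# lemma names in the underscore form used for multi-word expressions.
reference = parse_reference("""
battle royal
sparring match
brawl
tiff
""")
found = ["battle_royal", "brawl", "fight"]

print(missed(reference, found))  # ['sparring match', 'tiff']
```

Writing the reference list with one entry per line avoids needing a special separator, and normalizing both sides in one place keeps the set difference honest.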

Any suggestions on how to sensibly expand the results? Are there other ways to find variants through WordNet?
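One direction I can think of is to follow more relation types and to follow them for more than one hop, instead of looking only at direct neighbours. The sketch below is an untested idea, not a verified solution: I have not checked how much of the thesaurus list it recovers. The function names and the `max_depth` parameter are my own for illustration. The relation names (`similar_tos`, `also_sees`, `hypernyms`, `hyponyms`) and `Lemma.antonyms()` are methods of NLTK 3's WordNet interface.

```python
def expand_synsets(start_synsets, relation_names, max_depth=2):
    """Breadth-first walk over the named synset relations, up to max_depth hops."""
    seen = set(start_synsets)
    frontier = list(start_synsets)
    for _ in range(max_depth):
        next_frontier = []
        for synset in frontier:
            for relation in relation_names:
                # Each relation is a zero-argument method returning synsets.
                for neighbour in getattr(synset, relation)():
                    if neighbour not in seen:
                        seen.add(neighbour)
                        next_frontier.append(neighbour)
        frontier = next_frontier
    return seen


def collect_names(synsets):
    """Return (synonym names, antonym names) gathered from the synsets' lemmas."""
    synonyms, antonyms = set(), set()
    for synset in synsets:
        for lemma in synset.lemmas():
            synonyms.add(lemma.name())
            antonyms.update(a.name() for a in lemma.antonyms())
    return sorted(synonyms), sorted(antonyms)


def wordnet_variants(word, max_depth=2):
    """Run the walk against NLTK's WordNet (needs nltk and the 'wordnet' corpus)."""
    from nltk.corpus import wordnet as wn

    relations = ["similar_tos", "also_sees", "hypernyms", "hyponyms"]
    return collect_names(expand_synsets(wn.synsets(word), relations, max_depth))
```

Calling `wordnet_variants('fight')` would return the two sorted name lists. The trade-off is precision: going up to a hypernym and back down reaches sibling concepts, so a larger `max_depth` will probably add many loosely related words alongside the wanted ones.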

0 answers:

No answers yet.