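Is there a quick way to determine the most relevant synonyms using WordNet synsets? Here is my attempt: ... it works for some words but not for others. If the first synset contains only the original word, you cannot get more than one name out of it, as in the friend example below.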
>>> from nltk.corpus import wordnet as wn
>>> [synset.lemma_names() for synset in wn.synsets('car')][0][:2]
['car', 'auto']
>>> [synset.lemma_names() for synset in wn.synsets('friend')][0]
['friend']
The actual synsets for friend:
>>> [synset.lemma_names() for synset in wn.synsets('friend')]
[['friend'], ['ally', 'friend'], ['acquaintance', 'friend'], ['supporter', 'protagonist', 'champion', 'admirer', 'booster', 'friend'], ['Friend', 'Quaker']]
Any ideas?
Answer 0 (score: 0)
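WordNet maps and links senses to other senses, not words to words, so it may not be possible to look up the most relevant synonyms in it. You also run into the problem of multiple senses: without context, it is unclear which sense of a word is meant: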
>>> wn.synsets('car')
[Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'), Synset('car.n.04'), Synset('cable_car.n.01')]
>>> wn.synsets('auto')
[Synset('car.n.01')]
>>> wn.synsets('friend')
[Synset('friend.n.01'), Synset('ally.n.02'), Synset('acquaintance.n.03'), Synset('supporter.n.01'), Synset('friend.n.05')]
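One way around a first synset that holds only the query word is to walk the synsets in the order returned and collect lemma names other than the word itself. The sketch below is only an illustration and not a relevance ranking. `first_synonyms` is a hypothetical helper, and its input is the lemma-name lists from the question, hard-coded so the snippet runs without NLTK installed. With NLTK 3 the same lists would presumably come from `[s.lemma_names() for s in wn.synsets('friend')]`.

```python
def first_synonyms(word, lemma_lists, limit=2):
    """Collect up to `limit` distinct lemma names other than `word`,
    scanning the synsets in the order given (hypothetical helper)."""
    seen = {word.lower()}  # case-insensitive, so 'Friend' counts as 'friend'
    result = []
    for lemmas in lemma_lists:
        for lemma in lemmas:
            if len(result) >= limit:
                return result
            key = lemma.lower()
            if key in seen:
                continue
            seen.add(key)
            result.append(lemma)
    return result


# Lemma names per synset for 'friend', copied from the question's output.
friend_lemmas = [
    ['friend'],
    ['ally', 'friend'],
    ['acquaintance', 'friend'],
    ['supporter', 'protagonist', 'champion', 'admirer', 'booster', 'friend'],
    ['Friend', 'Quaker'],
]

print(first_synonyms('friend', friend_lemmas))  # ['ally', 'acquaintance']
```

Note that the two names returned come from different synsets, so they belong to different senses of friend. That is the sense-ambiguity problem described above, not a solution to it.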
Once you look up the different senses of each lemma name, you get a recursive chain of synonyms of synonyms of synonyms...:
>>> from itertools import chain
>>> set(chain(*[j.lemma_names() for j in wn.synsets('friend')]))
set([u'acquaintance', u'champion', u'Quaker', u'protagonist', u'friend', u'admirer', u'Friend', u'supporter', u'ally', u'booster'])
>>> [wn.synsets(i) for i in set(chain(*[j.lemma_names() for j in wn.synsets('friend')]))]
[[Synset('acquaintance.n.01'), Synset('acquaintance.n.02'), Synset('acquaintance.n.03')], [Synset('champion.n.01'), Synset('champion.n.02'), Synset('supporter.n.01'), Synset('ace.n.03'), Synset('champion.v.01'), Synset('champion.s.01')], [Synset('friend.n.05'), Synset('quaker.n.02')], [Synset('supporter.n.01'), Synset('protagonist.n.02')], [Synset('friend.n.01'), Synset('ally.n.02'), Synset('acquaintance.n.03'), Synset('supporter.n.01'), Synset('friend.n.05')], [Synset('supporter.n.01'), Synset('admirer.n.02'), Synset('admirer.n.03')], [Synset('friend.n.01'), Synset('ally.n.02'), Synset('acquaintance.n.03'), Synset('supporter.n.01'), Synset('friend.n.05')], [Synset('supporter.n.01'), Synset('patron.n.03'), Synset('assistant.n.01'), Synset('garter.n.01'), Synset('athletic_supporter.n.01')], [Synset('ally.n.01'), Synset('ally.n.02'), Synset('ally.v.01')], [Synset('supporter.n.01'), Synset('promoter.n.01'), Synset('booster.n.03'), Synset('booster.n.04'), Synset('booster.n.05'), Synset('booster.n.06')]]
>>> [k.lemma_names() for k in chain(*[wn.synsets(i) for i in set(chain(*[j.lemma_names() for j in wn.synsets('friend')]))])]
[[u'acquaintance', u'familiarity', u'conversance', u'conversancy'], [u'acquaintance', u'acquaintanceship'], [u'acquaintance', u'friend'], [u'champion', u'champ', u'title-holder'], [u'champion', u'fighter', u'hero', u'paladin'], [u'supporter', u'protagonist', u'champion', u'admirer', u'booster', u'friend'], [u'ace', u'adept', u'champion', u'sensation', u'maven', u'mavin', u'virtuoso', u'genius', u'hotshot', u'star', u'superstar', u'whiz', u'whizz', u'wizard', u'wiz'], [u'champion', u'defend'], [u'champion', u'prizewinning'], [u'Friend', u'Quaker'], [u'quaker', u'trembler'], [u'supporter', u'protagonist', u'champion', u'admirer', u'booster', u'friend'], [u'protagonist', u'agonist'], [u'friend'], [u'ally', u'friend'], [u'acquaintance', u'friend'], [u'supporter', u'protagonist', u'champion', u'admirer', u'booster', u'friend'], [u'Friend', u'Quaker'], [u'supporter', u'protagonist', u'champion', u'admirer', u'booster', u'friend'], [u'admirer'], [u'admirer', u'adorer'], [u'friend'], [u'ally', u'friend'], [u'acquaintance', u'friend'], [u'supporter', u'protagonist', u'champion', u'admirer', u'booster', u'friend'], [u'Friend', u'Quaker'], [u'supporter', u'protagonist', u'champion', u'admirer', u'booster', u'friend'], [u'patron', u'sponsor', u'supporter'], [u'assistant', u'helper', u'help', u'supporter'], [u'garter', u'supporter'], [u'athletic_supporter', u'supporter', u'suspensor', u'jockstrap', u'jock'], [u'ally'], [u'ally', u'friend'], [u'ally'], [u'supporter', u'protagonist', u'champion', u'admirer', u'booster', u'friend'], [u'promoter', u'booster', u'plugger'], [u'booster', u'shoplifter', u'lifter'], [u'booster', u'booster_amplifier', u'booster_station', u'relay_link', u'relay_station', u'relay_transmitter'], [u'booster', u'booster_rocket', u'booster_unit', u'takeoff_booster', u'takeoff_rocket'], [u'booster', u'booster_dose', u'booster_shot', u'recall_dose']]
In short, finding the most relevant synonyms with WordNet alone may not be possible, and you will most likely have to look at other resources.