How do I check whether a word is in another word's synonyms?

Time: 2017-07-22 11:02:36

Tags: python list nltk words synonym

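I am trying to compare two lists of words to check whether:

  1. the `word1` list contains words that also appear in the synsets of the `word2` list

  2. the `word2` list contains words that also appear in the synsets of the `word1` list

If a word is found within the synsets, the check should return True.

Here is my code: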

    from nltk.corpus import wordnet as wn
    
    word1 =  ['study', 'car']
    word2 =  ['learn', 'motor']
    
    def getSynonyms(word1):
        synonymList1 = []
        for data1 in word1:
            wordnetSynset1 = wn.synsets(data1)
            tempList1=[]
            for synset1 in wordnetSynset1:
                synLemmas = synset1.lemma_names()
                for i in xrange(len(synLemmas)):
                    word = synLemmas[i].replace('_',' ')
                    if word not in tempList1:
                        tempList1.append(word)
            synonymList1.append(tempList1)
        return synonymList1
    
    
    def checkSynonyms(word1, word2):
        for i in xrange(len(word1)):
            for j in xrange(len(word2)):
                d1 = getSynonyms(word1)
                d2 = getSynonyms(word2)
                if word1[i] in d2:
                    return True
                elif word2[j] in d1:
                    return True
                else:
                    return False
    
    print word1
    print
    print word2
    print
    print getSynonyms(word1)
    print
    print getSynonyms(word2)
    print 
    print checkSynonyms(word1, word2)
    print
    

But this is the output:

    ['study', 'car']
    
    ['learn', 'motor']
    
    [[u'survey', u'study', u'work', u'report', u'written report', u'discipline', 
    u'subject', u'subject area', u'subject field', u'field', u'field of study', 
    u'bailiwick', u'sketch', u'cogitation', u'analyze', u'analyse', u'examine', 
    u'canvass', u'canvas', u'consider', u'learn', u'read', u'take', u'hit the 
    books', u'meditate', u'contemplate'], [u'car', u'auto', u'automobile', 
    u'machine', u'motorcar', u'railcar', u'railway car', u'railroad car', 
    u'gondola', u'elevator car', u'cable car']]
    
    [[u'learn', u'larn', u'acquire', u'hear', u'get word', u'get wind', u'pick 
    up', u'find out', u'get a line', u'discover', u'see', u'memorize', 
    u'memorise', u'con', u'study', u'read', u'take', u'teach', u'instruct', 
    u'determine', u'check', u'ascertain', u'watch'], [u'motor', u'drive', 
    u'centrifugal', u'motive']]
    
    False
    

As we can see, the word 'study' from `word1` also appears in the synonyms of `word2`, as u'study'.

Why does it return False?

1 Answer:

Answer 0 (score: 1)

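Because you want to compare a string from `word1` against the contents of `d2`, do not use `if word1[i] in d2:`. `d2` is a list of lists, so that test compares the string from `word1` with each inner list of `d2` as a whole. For example, it performs a comparison like: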

'study' == [u'learn', u'larn', u'acquire', u'hear', u'get word', u'get wind',
u'pick up', u'find out', u'get a line', u'discover', u'see', u'memorize',
u'memorise', u'con', u'study', u'read', u'take', u'teach', u'instruct',
u'determine', u'check', u'ascertain', u'watch']

This is always False, because a string never equals a list, even when the list contains that string.
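The difference can be seen in a small sketch. It is written for Python 3, and the `d2` value is an abbreviated, hand-copied version of the list printed in the question rather than a live NLTK result; the names `outer_match` and `inner_match` are only illustrative.

```python
# Abbreviated copy of the d2 value printed in the question (a list of lists).
d2 = [
    ['learn', 'larn', 'acquire', 'study', 'read'],
    ['motor', 'drive', 'centrifugal', 'motive'],
]

# `in` on the outer list compares 'study' with each inner list as a whole.
outer_match = 'study' in d2

# Testing each inner list separately compares 'study' with the strings inside it.
inner_match = any('study' in synonyms for synonyms in d2)

print(outer_match)  # False
print(inner_match)  # True
```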

So instead of `if word1[i] in d2:`, use `if word1[i] in d2[k]:`, where `k` is an index that iterates over the inner lists of `d2`.
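A minimal sketch of a check along these lines is shown here, assuming Python 3. The function name `check_synonyms` and its parameters are hypothetical, and the synonym lists are abbreviated copies of the question's output, passed in directly so that the example runs without the WordNet corpus. Note that the question's code also returns False from the `else` branch on the very first iteration, so this sketch returns False only after every combination has been tried.

```python
def check_synonyms(words1, words2, synonyms1, synonyms2):
    """Return True if a word from one list occurs in a synonym list of the other."""
    # Look for each word of words1 inside each inner synonym list of words2.
    for word in words1:
        for synonyms in synonyms2:
            if word in synonyms:
                return True
    # Same check in the other direction.
    for word in words2:
        for synonyms in synonyms1:
            if word in synonyms:
                return True
    # Reached only when no combination matched.
    return False


word1 = ['study', 'car']
word2 = ['learn', 'motor']

# Abbreviated, hand-copied synonym lists from the question's output.
d1 = [
    ['survey', 'study', 'learn', 'read'],
    ['car', 'auto', 'automobile', 'machine', 'motorcar'],
]
d2 = [
    ['learn', 'larn', 'acquire', 'study', 'read'],
    ['motor', 'drive', 'centrifugal', 'motive'],
]

print(check_synonyms(word1, word2, d1, d2))  # True
```

In the original program, `d1` and `d2` would come from `getSynonyms(word1)` and `getSynonyms(word2)`, computed once before the loops rather than on every iteration.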

Hope this helps.