Unable to debug a maxmatch algorithm in Python

Time: 2017-03-10 02:50:03

Tags: debugging

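I have been working on using the maxmatch algorithm to tokenize hashtags, checking candidates against the word list in nltk, but I am having trouble debugging it.

The gist of the algorithm is as follows: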

function MAXMATCH (sentence, dictionary D) returns word sequence W
  if sentence is empty
    return empty list
  for i ← length(sentence) downto 1
    firstword = first i chars of sentence
    remainder = rest of sentence
    if InDictionary(firstword, D)
      return list(firstword, MaxMatch(remainder,dictionary) )
# no word was found, so make a one-character word
  firstword = first char of sentence
  remainder = rest of sentence
  return list(firstword, MaxMatch(remainder,dictionary) )

Below is my Python implementation. I inserted some print statements here and there to try to debug it.

from nltk.corpus import words # words is a Python list
wordlist = set(words.words())

lst = []
def max_match(hashtag, wordlist):
    if not hashtag:
        return None
    for i in range(len(hashtag)-1, -1, -1):
        first_word = (hashtag[0:i+1])
        print "Firstword: " + first_word
        remainder = hashtag[i+1:len(hashtag)]
        print "Remainder: " + remainder
        if first_word in wordlist:
            print "Found: " + first_word
            lst.append(first_word)
            print lst
            max_match(remainder, wordlist)

# if no word is found, make one-character word
    first_word = hashtag[0]
    remainder = hashtag[1:len(hashtag)]
    lst.append(first_word)
    max_match(remainder, wordlist)
    return lst

print max_match('labourvictory', wordlist)

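The last line, print max_match('labourvictory', wordlist), should return the list ['labour', 'victory']. I expected the function to exit through the if not hashtag: return None part, but for reasons I don't understand it keeps going and all hell breaks loose.

What am I doing wrong here?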

1 Answer:

Answer 0 (score: 0)

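In recursive functions, the most common bug is failing to return a value at the right point. The problem in your code is that when you find a word in the dictionary, you make the recursive call but do not return its result. Once that call comes back, the for loop simply carries on with shorter prefixes, and afterwards the one-character fallback runs as well, so extra entries keep being appended to the global lst. I modified your code to follow the pseudocode you gave: the base case returns an empty list, and each branch returns the word it found plus the result of the recursive call.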

def max_match(hashtag, wordlist):
    if not hashtag:
        return []
    for i in range(len(hashtag)-1, -1, -1):
        first_word = (hashtag[0:i+1])
        remainder = hashtag[i+1:len(hashtag)]
        if first_word in wordlist:
            return [first_word] + max_match(remainder, wordlist)

    # if no word is found, make one-character word
    first_word = hashtag[0]
    remainder = hashtag[1:len(hashtag)]

    return [first_word] + max_match(remainder, wordlist)
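
As a quick check of the version above, here is a minimal sketch in Python 3 that runs without NLTK. The toy_words set is a hypothetical stand-in for the NLTK word list, and the slicing is written a little more compactly, but the logic is intended to be the same greedy longest-prefix match.

```python
def max_match(text, dictionary):
    """Greedy longest-prefix segmentation, mirroring the recursive answer above."""
    if not text:
        return []
    # Try the longest prefix first, then progressively shorter ones.
    for end in range(len(text), 0, -1):
        if text[:end] in dictionary:
            # Returning here is what stops the loop from trying shorter prefixes.
            return [text[:end]] + max_match(text[end:], dictionary)
    # No prefix is a known word: emit one character and continue with the rest.
    return [text[0]] + max_match(text[1:], dictionary)


# Hypothetical toy dictionary (an assumption for illustration, not the NLTK list).
toy_words = {"lab", "our", "labour", "victory"}

print(max_match("labourvictory", toy_words))  # ['labour', 'victory']
print(max_match("xlab", toy_words))           # ['x', 'lab']
```

With the full NLTK word list the segmentation depends on which entries that list contains, so it may not match this toy output.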