There is a helper function called clean_up in the code; my code is below. I'd like to know what I need to fix, add, or remove to make it work correctly.
def clean_up(s):
    """ (str) -> str

    Return a new string based on s in which all letters have been
    converted to lowercase and punctuation characters have been stripped
    from both ends. Inner punctuation is left untouched.

    >>> clean_up('Happy Birthday!!!')
    'happy birthday'
    >>> clean_up("-> It's on your left-hand side.")
    " it's on your left-hand side"
    """
    punctuation = """!"',;:.-?)([]<>*#\n\t\r"""
    result = s.lower().strip(punctuation)
    return result
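A side note on why clean_up behaves this way: str.strip(chars) treats its argument as a set of characters to remove from both ends only, never from the middle of the string, which is why inner punctuation such as the apostrophe in "it's" survives. A minimal sketch using the same punctuation set as above:

```python
# The same punctuation set clean_up uses above.
punctuation = """!"',;:.-?)([]<>*#\n\t\r"""

# strip() removes any of these characters from both ends, but leaves
# characters in the middle of the string untouched.
print('Happy Birthday!!!'.lower().strip(punctuation))
# happy birthday
print("-> It's on your left-hand side.".lower().strip(punctuation))
# " it's on your left-hand side" (leading space kept: space is not in the set)
```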
########## Complete the following functions. ############
def type_token_ratio(text):
    """ (list of str) -> float

    Precondition: text is non-empty. Each str in text ends with \n and
    text contains at least one word.

    Return the Type Token Ratio (TTR) for this text. TTR is the number of
    different words divided by the total number of words.

    >>> text = ['James Fennimore Cooper\n', 'Peter, Paul, and Mary\n',
    ...         'James Gosling\n']
    >>> type_token_ratio(text)
    0.8888888888888888
    """
    # To do: Fill in this function's body to meet its specification.
    distinctwords = dict()
    words = 0
    for line in text.splitlines():
        line = line.strip().split()
        for word in line:
            words += 1
            if word in distinctwords:
                distinctwords[word] += 1
            else:
                distinctwords[word] = 1
    TTR = len(distinctwords) / words
    return TTR
Answer 0 (score: 0)

Your code won't even run: for line in text.splitlines() tries to call splitlines on a list, which has no such method. You need to iterate over the list of words passed in as text. Using collections.defaultdict will also be more efficient:
def type_token_ratio(text):
    from collections import defaultdict
    distinctwords = defaultdict(int)
    for words in text:                    # get each string
        words = clean_up(words)           # clean the string
        for word in words.split():        # split into individual words
            distinctwords[word] += 1      # increase the count for each word
    TTR = len(distinctwords) / sum(distinctwords.values())  # sum(distinctwords.values()) gives the total number of words
    return TTR
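To sanity-check the corrected version against the docstring example, here is a self-contained sketch (clean_up inlined for illustration) that reproduces the expected ratio of 8 distinct words out of 9 total:

```python
from collections import defaultdict

def clean_up(s):
    # Lowercase, then strip punctuation from both ends only.
    return s.lower().strip("""!"',;:.-?)([]<>*#\n\t\r""")

def type_token_ratio(text):
    distinctwords = defaultdict(int)
    for line in text:                    # each element is one line of text
        for word in clean_up(line).split():
            distinctwords[word] += 1
    return len(distinctwords) / sum(distinctwords.values())

text = ['James Fennimore Cooper\n', 'Peter, Paul, and Mary\n',
        'James Gosling\n']
print(type_token_ratio(text))  # 0.8888888888888888
```

Note that because clean_up strips punctuation only at the ends of each whole line, "peter," keeps its comma and counts as a distinct word, which is exactly what the docstring's expected value 0.8888888888888888 (8/9) assumes.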