Question

My code worked until I added the entropy part. Now it gives an invalid syntax error on the print line. Why?

import nltk, math, re, numpy
from nltk import word_tokenize
from nltk.tokenize import RegexpTokenizer

def entropy(labels):
    freqdist = nltk.FreqDist(labels)
    probs = [freqdist.freq(1) for l in freqdist]
    return -sum(p * math.log(p,2) for p in probs)

def sents():
    fileObj = open('1865-Lincoln.txt', 'r')
    text = fileObj.read()
    tokens = nltk.sent_tokenize(text)
    for name in tokens:
        words = ' '.join(name.split()[:4])
        count = len(name.split())
        entro = entropy(len(name.split())
        print('{:<35} {:^15} {:>15}'.format(words, count, entro))

Answer 1

A closing parenthesis is missing on the line above the print call, which is why the syntax error is reported on the print line:

entro = entropy(len(name.split()))
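With the parenthesis restored the syntax error goes away, but the call still passes an integer (the word count) to entropy, and a frequency distribution expects an iterable of samples, so it will likely fail at runtime. The following is only a sketch of one way the calculation could look, assuming the intent is the entropy of the word distribution within each sentence. It uses collections.Counter as a stand-in for nltk.FreqDist so it runs without NLTK, and describe is a hypothetical helper name, not part of the original code. It also assumes that freq(1) in the question was meant to be freq(l).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of `labels`."""
    counts = Counter(labels)  # stand-in for nltk.FreqDist (assumption)
    total = sum(counts.values())
    # Use each label's own relative frequency, not the frequency of the
    # literal 1 as in the question's freq(1).
    probs = [n / total for n in counts.values()]
    return -sum(p * math.log(p, 2) for p in probs)


def describe(sentence):
    """Hypothetical helper: (first four words, word count, word entropy)."""
    tokens = sentence.split()
    # Pass the token list itself to entropy, not len(tokens).
    return ' '.join(tokens[:4]), len(tokens), entropy(tokens)


words, count, entro = describe('a b a b')
print('{:<35} {:^15} {:>15.3f}'.format(words, count, entro))
```

In the original sents function, the loop body would then call something like describe(name) for each sentence returned by nltk.sent_tokenize and print the three values with the same format string.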
