Code: this is a HackerRank problem, and both of the provided test cases are failing. Could someone please help?
from nltk.corpus import brown
from nltk.corpus import stopwords
def calculateCFD(cfdconditions, cfdevents):
    # Write your code here
    from nltk.corpus import brown
    from nltk import ConditionalFreqDist
    from nltk.corpus import stopwords
    stopword = set(stopwords.words('english'))
    cdev_cfd = [ (genre, word.lower()) for genre in cfdconditions for word in brown.words(categories=genre) if word.lower() not in stopword]
    #cdev_cfd = [list(x) for x in cdev_cfd]
    cdev_cfd = nltk.ConditionalFreqDist(cdev_cfd)
    a = cdev_cfd.tabulate(condition = cfdconditions, samples = cfdevents)
    inged_cfd = [ (genre, word.lower()) for genre in cfdconditions for word in brown.words(categories=genre) if (word.lower().endswith('ing') or word.lower().endswith('ed')) ]
    inged_cfd = [list(x) for x in inged_cfd]
    for wd in inged_cfd:
        if wd[1].endswith('ing') and wd[1] not in stopword:
            wd[1] = 'ing'
        elif wd[1].endswith('ed') and wd[1] not in stopword:
            wd[1] = 'ed'
    inged_cfd = nltk.ConditionalFreqDist(inged_cfd)
    b = inged_cfd.tabulate(cfdconditions, samples = ['ed','ing'])
    return(a,b)
The output for the failing test cases is:
                 many years
      adventure    24    32
        fiction    29    44
science_fiction    11    16
                  ed  ing
        fiction 2943 1767
      adventure 3281 1844
science_fiction  574  293
and
                  good    bad better
      adventure     39      9     30
        fiction     60     17     27
        mystery     45     13     29
science_fiction     14      1      4
                  ed  ing
      adventure 3281 1844
        fiction 2943 1767
science_fiction  574  293
        mystery 2382 1374
Please help me pass these test cases; I can't figure out where I went wrong.
Answer 0: (score: 0)
Remove the following two lines
cdev_cfd = [ (genre, word.lower()) for genre in cfdconditions for word in brown.words(categories=genre) if word.lower() not in stopword]
cdev_cfd = nltk.ConditionalFreqDist(cdev_cfd)
and replace them with
cdev_cfd = nltk.ConditionalFreqDist([ (genre, word.lower()) for genre in brown.categories() for word in brown.words(categories=genre) if word.lower() not in stopword and genre in cfdconditions])
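As far as I can tell, the replacement counts exactly the same words as the original two lines; the visible difference is the order in which genres are first seen, because the loop now walks brown.categories() instead of cfdconditions. The sketch below illustrates this with a plain defaultdict(Counter) as a stand-in for ConditionalFreqDist (which is itself dictionary-based). The toy corpus and the helper name count are made up for illustration, and it assumes brown.categories() returns the genres in alphabetical order.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for brown.words(categories=genre).
corpus = {
    "adventure": ["many", "years"],
    "fiction": ["many"],
    "mystery": ["years"],
}
all_genres = sorted(corpus)            # plays the role of brown.categories()
requested = ["fiction", "adventure"]   # plays the role of cfdconditions


def count(pairs):
    """Count words per genre, remembering genres in first-seen order."""
    table = defaultdict(Counter)
    for genre, word in pairs:
        table[genre][word] += 1
    return table


# Loop from the question: iterate over the requested genres directly.
by_request = count((g, w) for g in requested for w in corpus[g])

# Loop from this answer: iterate over every genre, keep only the requested ones.
by_corpus = count((g, w) for g in all_genres for w in corpus[g] if g in requested)

print(list(by_request))  # genres in the order they were requested
print(list(by_corpus))   # genres in the order the corpus lists them
```

If the expected output lists the genres in a particular order, this difference in first-seen order is the thing to compare against; the counts themselves are identical in both versions.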
Answer 1: (score: 0)
Your code looks fine. To make it work, just change the keyword argument condition to conditions, i.e. replace
a = cdev_cfd.tabulate(condition = cfdconditions, samples = cfdevents)
with
a = cdev_cfd.tabulate(conditions = cfdconditions, samples = cfdevents)
It should then pass with flying colors :)
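To see what the conditions keyword does without downloading the Brown corpus, here is a minimal sketch on made-up (genre, word) pairs. It assumes nltk is installed; the toy data and the helper name table are hypothetical. tabulate() prints its table rather than returning it, so the sketch captures standard output to inspect the rows.

```python
import io
from contextlib import redirect_stdout

from nltk import ConditionalFreqDist

# Toy (genre, word) pairs standing in for the Brown corpus (made-up data).
pairs = [
    ("adventure", "many"), ("adventure", "years"), ("adventure", "many"),
    ("fiction", "years"), ("fiction", "many"),
    ("mystery", "many"),
]
cfd = ConditionalFreqDist(pairs)


def table(conditions, samples):
    """Return the text that tabulate() prints for the given rows and columns."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        cfd.tabulate(conditions=conditions, samples=samples)
    return buffer.getvalue()


requested = ["fiction", "adventure"]
text = table(requested, ["many", "years"])
print(text)

# Row labels in printed order (the first line is the column header).
row_labels = [line.split()[0] for line in text.splitlines()[1:] if line.strip()]
print(row_labels)
```

With conditions spelled correctly, only the requested genres are printed, in the order given. Note also that since tabulate() only prints, the variables a and b in the question's function end up as None; that is probably fine when the grader compares printed output, but it is worth knowing.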