BLEU score implementation in Python

Time: 2018-02-27 05:13:55

Tags: python

class BLEU(object):
    def compute(candidate, references, weights):
        candidate = [c.lower() for c in candidate]
        references = [[r.lower() for r in reference] for reference in references]

        p_ns = (BLEU.modified_precision(candidate, references, i) for i, _ in enumerate(weights, start=1))
        s = math.fsum(w * math.log(p_n) for w, p_n in zip(weights, p_ns) if p_n)

        bp = BLEU.brevity_penalty(candidate, references)
        return bp * math.exp(s)

    def modified_precision(candidate, references, n):

        counts = Counter(ngrams(candidate, n))

        if not counts:
            return 0

        max_counts = {}
        for reference in references:
            reference_counts = Counter(ngrams(reference, n))
            for ngram in counts:
                max_counts[ngram] = max(max_counts.get(ngram, 0), reference_counts[ngram])

        clipped_counts = dict((ngram, min(count, max_counts[ngram])) for ngram, count in counts.items())

        return sum(clipped_counts.values()) / sum(counts.values())

    def brevity_penalty(candidate, references):
        c = len(candidate)
        r = min(abs(len(r) - c) for r in references)

        if c > r:
            return 1
        else:
            return math.exp(1 - r / c)

I want to pull the code out of the nltk.bleu_score library so that I can easily import it into my Android app development.

If I run the following,

bleu = BLEU()    
bleu.compute(candidate, reference_wik)

it returns the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-243daadf668f> in <module>()
----> 1 bleu.compute(candidate, reference_wik)

<ipython-input-11-1d4045ea108d> in compute(candidate, references, weights)
      1 class BLEU(object):
      2     def compute(candidate, references, weights):
----> 3         candidate = [c.lower() for c in candidate]
      4         references = [[r.lower() for r in reference] for reference in references]
      5 

TypeError: 'BLEU' object is not iterable

where

candidate = ['consists', 'of', 'to', 'make', 'a', 'whole']

reference_wik = [['To', 'be', 'made', 'up', 'of;', 'to', 'consist', 'of', '(especially', 'a', 'comprehensive', 'list', 'of', 'parts).', '[from', 'earlier', '15thc]'], ['To', 'contain', 'or', 'embrace.', '[from', 'earlier', '15thc]'], ['proscribed,', 'usually', 'in', 'the', 'passive)', 'To', 'compose,', 'to', 'constitute.', 'See', 'usage', 'note', 'below'], ['law)', 'To', 'include,', 'contain,', 'or', 'be', 'made', 'up', 'of,', 'defining', 'the', 'minimum', 'elements,', 'whether', 'essential', 'or', 'inessential,', 'to', 'define', 'an', 'invention.', '("Open-ended",', "doesn't", 'limit', 'to', 'the', 'items', 'listed;', 'cf.', 'compose,', 'which', 'is', '"closed"', 'and', 'limits', 'to', 'the', 'items', 'listed)']]

I cannot work out from the error message what is causing the problem. Are there any hints for debugging this?

1 Answer:

Answer 0: (score: 0)

The first parameter of any instance method is the object instance itself, and it should be named self; it can be thought of as this in Java. From the official docs:

  

Often, the first argument of a method is called self. This is nothing more than a convention: the name self has absolutely no special meaning to Python.
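To illustrate the quoted point, here is a minimal sketch in which the first parameter is deliberately named this instead of self. The Greeter class and its methods are hypothetical names invented for this example. Python binds the instance to whatever the first parameter happens to be called.

```python
class Greeter:
    # The first parameter is named `this` on purpose; Python only cares
    # about its position, not its name.
    def __init__(this, name):
        this.name = name

    def greet(this):
        return "Hello, " + this.name


greeting = Greeter("BLEU").greet()
print(greeting)
```

This works, but naming the parameter self remains the convention that other Python readers expect.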

From this SO answer:

  

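The first argument of a method is the instance the method is called on. This makes methods entirely the same as functions, and leaves the actual name to use up to you (although self is the convention, and people will generally frown at you when you use something else).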
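The following sketch shows how this rule produces the error in the question. Demo and first_param are hypothetical names used only for illustration. Because the method has no self parameter, the instance is bound to the parameter called candidate, and iterating over it fails.

```python
class Demo:
    def first_param(candidate, references=None):
        # There is no `self`, so the instance itself arrives as `candidate`.
        return candidate


demo = Demo()
received = demo.first_param()

# Calling through the instance is equivalent to passing it explicitly.
same_object = received is demo
explicit_call_matches = Demo.first_param(demo) is demo

try:
    [c for c in received]  # same pattern as `[c.lower() for c in candidate]`
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(same_object, explicit_call_matches, error_message)
```

In the question, bleu.compute(candidate, reference_wik) is therefore treated as BLEU.compute(bleu, candidate, reference_wik), so the BLEU object ends up in the candidate slot.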

So change the code so that every method takes self as its first parameter.
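One way that change might look is sketched below for Python 3. It is not the original answer's code. Several parts are assumptions made for this example:

- The local ngrams helper is a stand-in for nltk.util.ngrams, so the block runs without NLTK.
- Helper calls go through self instead of BLEU.
- The brevity penalty uses the reference length closest to the candidate length, where the question's code takes the minimum length difference.
- An empty candidate scores 0.

Note also that compute requires a weights argument, which the call in the question omits.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    # Assumed stand-in for nltk.util.ngrams: consecutive n-token tuples.
    return zip(*(tokens[i:] for i in range(n)))


class BLEU(object):
    def compute(self, candidate, references, weights):
        candidate = [c.lower() for c in candidate]
        references = [[r.lower() for r in reference] for reference in references]

        # Helpers are called through self now that they take self.
        p_ns = (self.modified_precision(candidate, references, i)
                for i, _ in enumerate(weights, start=1))
        s = math.fsum(w * math.log(p_n) for w, p_n in zip(weights, p_ns) if p_n)

        bp = self.brevity_penalty(candidate, references)
        return bp * math.exp(s)

    def modified_precision(self, candidate, references, n):
        counts = Counter(ngrams(candidate, n))
        if not counts:
            return 0

        # For each candidate n-gram, find its highest count in any reference.
        max_counts = {}
        for reference in references:
            reference_counts = Counter(ngrams(reference, n))
            for ngram in counts:
                max_counts[ngram] = max(max_counts.get(ngram, 0), reference_counts[ngram])

        clipped = sum(min(count, max_counts[ngram]) for ngram, count in counts.items())
        return clipped / sum(counts.values())

    def brevity_penalty(self, candidate, references):
        c = len(candidate)
        if c == 0:
            return 0  # guard added in this sketch: an empty candidate scores 0
        # Assumption: r is the reference length closest to c (ties go to the shorter).
        r = min((len(ref) for ref in references),
                key=lambda length: (abs(length - c), length))
        if c > r:
            return 1
        return math.exp(1 - r / c)


bleu = BLEU()
score = bleu.compute(["the", "the", "the"], [["the", "cat", "sat"]], [1.0])
print(score)
```

An alternative that keeps the BLEU.modified_precision(...) style calls from the question is to leave the signatures as they are and decorate each method with @staticmethod.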