Computing precision, recall, and F-score, but is something wrong with the recall value?

Asked: 2020-06-27 02:29:15

Tags: python python-3.x twitter

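I am having trouble getting the length of the setGt variable on line 20. I have a nested list such as train = ["['eclipse', 'eclipse']", "['k', 'k']", "['pmqs']", "['stupid accident']", "['addictive tv shows']"], and I am trying to find the length of each inner list. My code returns a length of 2 for the first inner list, although the last inner list has the same elements and a length of 3, and the second one is correct. How can I fix the result for the first list, given that all of its elements are the same? The code is below.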

def calPrecision(golds, segments):
    pr = 0
    recl = 0
    fsc = 0
    for gt, seg in zip(golds, segments):
        gt = {gt}
        seg = {seg}
        p = []
        r = []
        fscore = []
        for sg in seg:
            print('sg: ', sg)
            for g in gt:
                print('g: ', g)
                if sg in g:
                    correct = sg.split()
                    lseg = len(sg.split())
                    setGt = set(g.split())
                    print('setGt: ', setGt)
                    lg = len(setGt)
                    print('lenseGt: ', lg)
                    print(correct, 'lenCorrect: ', len(correct), lseg, 'leng:', lg)
                    
                    pre = (len(correct) * 1.0) / lseg
                    rec = (len(correct) * 1.0) / lg 
                    print('pre: ', pre, '|', 'recall:', rec)
                    p.append(pre)
                    r.append(rec)
                    sum_p_r = pre + rec
                    f1 = 0 if sum_p_r == 0 else (2*pre*rec)/(sum_p_r) 
                    print('f1: ', f1)
                    fscore.append(f1)
                    
                    print('---------------------------------------------------------')
                    pr += max(p)
                    recl += max(r)
                    fsc += max(fscore)

    return (pr*1.0) / len(segments), (recl*1.0) / len(segments), (fsc*1.0) / len(segments) 


train = ["['eclipse', 'eclipse']", "['k', 'k']", "['pmqs']", "['stupid accident']", "['addictive tv shows']"]

enE = ['eclipse', 'k', 'pmqs', 'stupid accident', 'addictive tv shows']

calPrecision(train, enE)
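
A possible cause, offered as a sketch rather than a confirmed answer: each item of train is a string that merely looks like a list, so g.split() cuts the text at whitespace and the pieces keep their brackets, quotes, and commas. The two pieces of the first item therefore differ as strings, and the set has 2 members. Assuming the items are always valid Python list literals, ast.literal_eval from the standard library can turn them back into real lists. The helper names parse_gold and gold_token_set are hypothetical and not part of the original code.

```python
import ast

train = ["['eclipse', 'eclipse']", "['k', 'k']", "['pmqs']",
         "['stupid accident']", "['addictive tv shows']"]

# Each item is a str, so split() cuts the list's *text* at whitespace.
first = train[0]
print(first.split())            # two chunks that still carry brackets and quotes
print(len(set(first.split())))  # the chunks differ as strings, so the set has 2 members


def parse_gold(item):
    """Hypothetical helper: turn the text "['a', 'b']" into the list ['a', 'b'].

    Assumes item is a valid Python list literal.
    """
    return ast.literal_eval(item)


def gold_token_set(item):
    """Hypothetical helper: unique whitespace-separated tokens over all phrases."""
    return {tok for phrase in parse_gold(item) for tok in phrase.split()}


for item in train:
    print(parse_gold(item), len(gold_token_set(item)))
```

With the items parsed this way, the duplicated 'eclipse' collapses to a single token, while 'addictive tv shows' still yields three. Whether the recall denominator should count unique tokens or all tokens depends on the intended metric, which the question does not specify.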

0 answers:

No answers