Question

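I have a function definition:

Now I have to create a function like this:

The problem is that there are four possible combinations of (t, c), where t is the feature and c is the class: (t, c), (t', c), (t, c'), and (t', c'). The function definition therefore changes depending on the values of t and c. Is there any way to do this other than computing a, b, c, d four times and then summing the function values?

The dataset looks like this: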

feature file_frequency_M file_frequency_B
     abc          2                5

My attempt:

dataset = pd.read_csv('.csv')
score = []

###list =[(t,c) ,(t,c0),(t0,c),(t0,c0)]  ##representation of the combination of (t,c)
l=152+1394

for index, row in dataset.iterrows():
    a = row['file_frequency_M']
    b = row['file_frequency_B']
    c = 152 - a        
    d = 1394 - b
    temp_score = 0
    tmp1 = 0
    tmp2 = 0
    tmp3 = 0
    tmp4 = 0
    for i in range(4):
        if i == 0:
            if a == 0:
                tmp1 = 0
            else:
                tmp1 = log10(((a * l) / (a + c) * (a + b)))
        temp_score += tmp1
        if i == 1:
            if b == 0:
                tmp2 = 0
            else:
                tmp2 = log10(((b * l) / (b + d) * (b + a)))
        temp_score += tmp2    
        if i == 2:
            if c == 0:
                tmp3 = 0
            else:
                tmp3 = log10(((c * l) / (c + a) * (c + d)))
        temp_score += tmp3
        if i == 3:
            if d == 0:
                tmp4 = 0
            else:
                tmp4 = log10(((d * l) / (d + b) * (d + c)))
        temp_score += tmp4
    score.append(temp_score)
np.savetxt("m.csv", score, delimiter=",")

Answer 1

Writing I(t, c) as a function removes most of the code duplication:

import numpy as np
import pandas as pd
from math import log10

dataset = pd.read_csv('.csv')
score = []

###list =[(t,c) ,(t,c0),(t0,c),(t0,c0)]  ##representation of the combination of (t,c)
l=152+1394

def I(a,b,c,n):
    """Return I(t,c) = log10(A*N / ((A+C)*(A+B))), or 0 when A == 0."""
    if a == 0:
        return 0
    return log10((a * n) / ((a + c) * (a + b)))

for index, row in dataset.iterrows():
    a = row['file_frequency_M']
    b = row['file_frequency_B']
    c = 152 - a        
    d = 1394 - b

    tmp1 = I(a,b,c,l)
    tmp2 = I(b,a,d,l)
    tmp3 = I(c,d,a,l)
    tmp4 = I(d,c,b,l)
    temp_score = sum([tmp1, tmp2, tmp3, tmp4])  # sum() takes a single iterable
    score.append(temp_score)

np.savetxt("m.csv", score, delimiter=",")
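
If the file is large, the row loop above can also be written with NumPy arrays instead of `iterrows()`. The following is only a sketch under stated assumptions: it reuses the class totals 152 and 1394 and the column names from the question, `pointwise_term` and `score_features` are hypothetical helper names, and the one-row DataFrame stands in for the real CSV file.

```python
import numpy as np
import pandas as pd

# Class totals copied from the question (assumed: 152 files in M, 1394 in B).
N_M, N_B = 152, 1394
N = N_M + N_B


def pointwise_term(a, b, c, n):
    """Vectorized log10(a*n / ((a+c)*(a+b))), set to 0 wherever a == 0."""
    a, b, c = (np.asarray(x, dtype=float) for x in (a, b, c))
    out = np.zeros_like(a)
    mask = a > 0  # skip zero counts to avoid log10(0)
    out[mask] = np.log10(a[mask] * n / ((a[mask] + c[mask]) * (a[mask] + b[mask])))
    return out


def score_features(df):
    """Sum the four terms for every row, in the same order as the loop version."""
    a = df["file_frequency_M"].to_numpy(dtype=float)
    b = df["file_frequency_B"].to_numpy(dtype=float)
    c = N_M - a
    d = N_B - b
    return (pointwise_term(a, b, c, N) + pointwise_term(b, a, d, N)
            + pointwise_term(c, d, a, N) + pointwise_term(d, c, b, N))


# Stand-in for the CSV: the single sample row shown in the question.
sample = pd.DataFrame(
    {"feature": ["abc"], "file_frequency_M": [2], "file_frequency_B": [5]}
)
scores = score_features(sample)
print(scores)
```

The argument order in each `pointwise_term` call mirrors the four `I(...)` calls above, so both versions should give the same score per row.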

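Note: judging from the image of your function definition, your code appears to contain a bug. It should be:

log10((a * n) / ((a + c) * (a + b)))

rather than

log10(((a * l) / (a + c) * (a + b)))

(note the placement of the parentheses).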
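
To see why the parentheses matter, here is a small check using the sample row from the question (a = 2, b = 5, so c = 150 and N = 1546). It is only an illustration; the variable names `as_written` and `as_intended` are made up for this example.

```python
from math import log10

a, b, c, n = 2, 5, 150, 1546

# Without the extra parentheses, Python evaluates left to right:
# ((a * n) / (a + c)) * (a + b), so (a + b) multiplies instead of divides.
as_written = (a * n) / (a + c) * (a + b)

# With both sums grouped, the whole product sits in the denominator.
as_intended = (a * n) / ((a + c) * (a + b))

print(log10(as_written), log10(as_intended))
```

The two values differ by a factor of (a + b) squared, so the logarithms differ by 2 * log10(a + b).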
