Creating a general function

Time: 2018-06-19 09:03:45

Tags: python numpy

I have a function definition:

[image not reproduced: the definition of I(t, c)]

Now I have to create a function like this:

[image not reproduced: the function to be created]

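The problem is that (t, c), where t is the feature and c is the class, can occur in 4 combinations: (t, c), (t', c), (t, c'), (t', c'). The function definition therefore changes depending on the values of t and c. Is there any way to do this other than computing a, b, c, d four times and then summing the function values?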

The dataset looks like this:

feature file_frequency_M file_frequency_B
     abc          2                5  

My attempt:

dataset = pd.read_csv('.csv')
score = []

###list =[(t,c) ,(t,c0),(t0,c),(t0,c0)]  ##representation of the combination of (t,c)
l=152+1394

for index, row in dataset.iterrows():
    a = row['file_frequency_M']
    b = row['file_frequency_B']
    c = 152 - a        
    d = 1394 - b
    temp_score = 0
    tmp1 = 0
    tmp2 = 0
    tmp3 = 0
    tmp4 = 0
    for i in range(4):
        if i == 0:
            if a == 0:
                tmp1 = 0
            else:
                tmp1 = log10(((a * l) / (a + c) * (a + b)))
        temp_score += tmp1
        if i == 1:
            if b == 0:
                tmp2 = 0
            else:
                tmp2 = log10(((b * l) / (b + d) * (b + a)))
        temp_score += tmp2    
        if i == 2:
            if c == 0:
                tmp3 = 0
            else:
                tmp3 = log10(((c * l) / (c + a) * (c + d)))
        temp_score += tmp3
        if i == 3:
            if d == 0:
                tmp4 = 0
            else:
                tmp4 = log10(((d * l) / (d + b) * (d + c)))
        temp_score += tmp4
    score.append(temp_score)
np.savetxt("m.csv", score, delimiter=",")     

1 answer:

Answer 0 (score: 2)

You can avoid a lot of code duplication by writing I(t, c) as a function:

import numpy as np
import pandas as pd
from math import log10

dataset = pd.read_csv('.csv')
score = []

###list =[(t,c) ,(t,c0),(t0,c),(t0,c0)]  ##representation of the combination of (t,c)
l=152+1394

def I(a,b,c,n):
    """Return I(t,c) = log10(A*N / ((A+C)*(A+B))), or 0 when A is 0."""
    if a == 0: 
        return 0
    return log10((a * n) / ((a + c) * (a + b)))

for index, row in dataset.iterrows():
    a = row['file_frequency_M']
    b = row['file_frequency_B']
    c = 152 - a        
    d = 1394 - b

    tmp1 = I(a,b,c,l)
    tmp2 = I(b,a,d,l)
    tmp3 = I(c,d,a,l)
    tmp4 = I(d,c,b,l)
    # sum() takes a single iterable, so the four terms go in a list
    temp_score = sum([tmp1, tmp2, tmp3, tmp4])
    score.append(temp_score)

np.savetxt("m.csv", score, delimiter=",")     
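
For a large dataset, the row-by-row loop above can probably be replaced with array operations. The following is only a sketch under assumptions: it takes the two frequency columns as array-likes, reuses the class totals 152 and 1394 from the question's code, and the names `info_term` and `total_score` are made up for illustration.

```python
import numpy as np

N_M = 152    # class total used in the question's code
N_B = 1394   # class total used in the question's code


def info_term(a, b, c, n):
    """Elementwise log10(a*n / ((a+c)*(a+b))), giving 0 wherever a == 0.

    a, b and c are expected to have the same shape.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    # Where a == 0 the division is skipped and the ratio stays 1, so log10 gives 0.
    ratio = np.divide(a * n, (a + c) * (a + b), out=np.ones_like(a), where=a > 0)
    return np.log10(ratio)


def total_score(freq_m, freq_b, n_m=N_M, n_b=N_B):
    """Sum of the four terms for every row at once."""
    a = np.asarray(freq_m, dtype=float)
    b = np.asarray(freq_b, dtype=float)
    c = n_m - a
    d = n_b - b
    n = n_m + n_b
    return (info_term(a, b, c, n) + info_term(b, a, d, n)
            + info_term(c, d, a, n) + info_term(d, c, b, n))
```

With the DataFrame from the question this could be called as `total_score(dataset['file_frequency_M'], dataset['file_frequency_B'])`, and the resulting array passed to `np.savetxt` as before.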

Note: judging by the image of your function definition, there appears to be a bug in your code. It should be:

log10((a * n) / ((a + c) * (a + b)))

rather than

log10(((a * l) / (a + c) * (a + b)))

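(Note the placement of the parentheses: `/` and `*` have the same precedence and are evaluated left to right, so without the extra pair the quotient is multiplied by (a + b) instead of divided by it.)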
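
A quick way to see the effect of the parentheses is to evaluate both expressions for the sample row from the question (a = 2, b = 5, so c = 152 - 2 = 150 and n = 152 + 1394 = 1546). This is just an illustrative check, and the variable names `as_written` and `intended` are made up here.

```python
from math import isclose, log10

a, b, c, n = 2, 5, 150, 1546  # sample row from the question

# Parsed as ((a * n) / (a + c)) * (a + b): the last factor multiplies.
as_written = (a * n) / (a + c) * (a + b)

# Divides by the whole product (a + c) * (a + b), as the docstring formula says.
intended = (a * n) / ((a + c) * (a + b))

# The two differ by a factor of (a + b) squared.
factor_matches = isclose(as_written, intended * (a + b) ** 2)

print(log10(as_written), log10(intended), factor_matches)
```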