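How to efficiently represent binary vectors in Python

Time: 2014-02-13 11:45:29

Tags: python optimization vector binary

I am doing data analysis in Python (for example, using local binary patterns) and I am trying to optimize my code. My code uses binary vectors, which are currently implemented as numpy ndarray vectors. Here are three functions from my code: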

import numpy as np

# Returns a binary vector representation of the neighbourhood
#
# INPUTS:
# 'ndata' numpy ndarray consisting of the neighbourhood X- and Y-coordinates and values
# 'thres' decimal value indicating the value of the center pixel
#
# OUTPUT:
# 'bvec' binary vector representation of the neighbourhood

def toBinvec(ndata, thres):

    bvec = np.zeros((len(ndata), 1)) 
    for i in range(0, len(ndata)):
        if ndata[i, 2]-thres < 0:
            bvec[i] = 0
        else:
            bvec[i] = 1
    return bvec 



# Will check whether a given binary vector is uniform or not 
# A binary pattern is uniform if when rotated one step, the number of
# bit values changing is <= 2
#
# INPUTS:
# 'binvec' is a binary vector of type numpy ndarray 
#
# OUTPUT:
# 'True/False' boolean indicating uniformness

def isUniform(binvec):

    temp = rotateDown(binvec) # This will rotate the binary vector one step down
    devi = 0
    for i in range(0, len(temp)):
        if temp[i] != binvec[i]:
            devi += 1
    if devi > 2:
        return False
    else:
        return True

# Will return the corresponding decimal number of binary vector
#
# INPUTS:
# 'binvec' is a binary vector of type numpy ndarray 
#
# OUTPUT:
# 'value' The evaluated decimal value of the binary vector 

def evaluate(binvec):

    value = 0
    for i in range(0, len(binvec)):
        value += binvec[i]*(2**i)
    return value

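Should I implement the binary vectors some other way to make the code more efficient? The code will be used for analysis of large datasets, so efficiency matters.

I also need to perform some operations on the binary vectors, such as rotating them and evaluating their decimal value.

Thanks for any help or tips! =)

1 answer:

Answer 0 (score: 1)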

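import numpy as np

def toBinvec(ndata, thres):
    # Vectorized threshold on column 2: 0 where the value is below thres, else 1
    return np.where(ndata[:, 2] < thres, 0, 1).reshape(-1, 1)

def isUniform(binvec):
    temp = rotateDown(binvec)  # rotateDown is the asker's helper (rotates one step down)
    # Uniform if at most 2 positions change after the rotation
    return np.count_nonzero(binvec != temp) <= 2

def evaluate(binvec):
    # Flatten first: an (n, 1) column would otherwise broadcast against the (n,) weights
    bits = np.ravel(binvec)
    return (bits * 2**np.arange(bits.size)).sum()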

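This should improve things somewhat. However, most of this is probably already available in highly optimized form in some scipy (or related) package.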
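The question does not show rotateDown, so here is a minimal sketch of the rotation and the uniformity check, assuming that "rotate one step down" means a circular shift by one position. The names rotate_down and is_uniform are hypothetical stand-ins, not the asker's actual helpers.

```python
import numpy as np

def rotate_down(binvec):
    # Hypothetical stand-in for the asker's rotateDown: circular shift by one
    # position along the first axis (the last element wraps to the front).
    return np.roll(binvec, 1, axis=0)

def is_uniform(binvec):
    # Positions that differ from the rotated copy correspond to 0/1
    # transitions around the circular pattern; uniform means at most 2.
    return np.count_nonzero(binvec != rotate_down(binvec)) <= 2

pattern = np.array([0, 0, 1, 1, 1, 0, 0, 0])
rotated = rotate_down(pattern)
uniform = is_uniform(pattern)
```

np.roll works on both flat vectors and (n, 1) columns when axis=0 is given, so the same sketch should apply to either layout.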

For example, toBinvec is just a thresholding operation, which is available in many packages.
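As a rough illustration of the thresholding point, a plain numpy comparison already produces the binary vector. This sketch assumes, as in the question, that column 2 of ndata holds the pixel values; the name to_binvec and the sample data are made up for the example.

```python
import numpy as np

def to_binvec(ndata, thres):
    # The comparison yields a boolean array; cast it to a compact 0/1 dtype.
    # Values greater than or equal to thres map to 1, matching the question's loop.
    return (ndata[:, 2] >= thres).astype(np.uint8)

# Illustrative neighbourhood: columns are X, Y, value
ndata = np.array([[0, 0, 5.0],
                  [0, 1, 2.0],
                  [1, 0, 7.0],
                  [1, 1, 3.0]])
bvec = to_binvec(ndata, 3.0)
```

Storing the result as uint8 (or bool) instead of the default float64 uses one byte per element rather than eight, which may matter for large datasets.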