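I am doing data analysis in Python (for example, using local binary patterns) and I am trying to optimize my code. In my code I use binary vectors, which are currently implemented as numpy ndarray vectors. Here are three functions from my code: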
import numpy as np

# Will return a binary vector representation of the neighbourhood
#
# INPUTS:
# 'ndata' numpy ndarray consisting of the neighbourhood X- and Y- coordinates and values
# 'thres' decimal value indicating the value of the center pixel
#
# OUTPUT:
# 'bvec' binary vector representation of the neighbourhood
def toBinvec(ndata, thres):
    bvec = np.zeros((len(ndata), 1))
    for i in range(0, len(ndata)):
        if ndata[i, 2]-thres < 0:
            bvec[i] = 0
        else:
            bvec[i] = 1
    return bvec
# Will check whether a given binary vector is uniform or not
# A binary pattern is uniform if when rotated one step, the number of
# bit values changing is <= 2
#
# INPUTS:
# 'binvec' is a binary vector of type numpy ndarray
#
# OUTPUT:
# 'True/False' boolean indicating uniformity
def isUniform(binvec):
    temp = rotateDown(binvec) # This will rotate the binary vector one step down
    devi = 0
    for i in range(0, len(temp)):
        if temp[i] != binvec[i]:
            devi += 1
    if devi > 2:
        return False
    else:
        return True
# Will return the corresponding decimal number of binary vector
#
# INPUTS:
# 'binvec' is a binary vector of type numpy ndarray
#
# OUTPUT:
# 'value' The evaluated decimal value of the binary vector
def evaluate(binvec):
    value = 0
    for i in range(0, len(binvec)):
        value += binvec[i]*(2**i)
    return value
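Should I implement the binary vectors in some other way to make the code more efficient? The code will be used for big data analysis, so efficiency is an important concern.
I also need to perform some operations on the binary vectors, such as rotating them, evaluating their decimal value, and so on.
Thanks for any help/tips! =)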
Answer 0 (score: 1)
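def toBinvec(ndata, thres):
    # Vectorized threshold: 0 where the value is below thres, 1 otherwise
    return np.where(ndata[:,2] < thres, 0, 1).reshape(-1,1)
def isUniform(binvec):
    temp = rotateDown(binvec) # This will rotate the binary vector one step down
    # Uniform if at most two positions differ after the rotation
    return np.count_nonzero(binvec != temp) <= 2
def evaluate(binvec):
    # Flatten first: a (n,1) column times a length-n weight vector would broadcast to (n,n)
    return np.sum(binvec.ravel() * 2**np.arange(len(binvec)))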
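This should improve things somewhat, because the Python-level loops are replaced by vectorized numpy operations. However, most of this functionality seems to be available in highly optimized form in some scipy (or related) packages. For example, toBinvec is just a thresholding operation, which is available in many packages.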
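
As a rough illustration of the "implement the vectors differently" part of the question, here is a minimal sketch that stores each pattern as a flat 1-D boolean array instead of an (n, 1) float column. The names to_binvec, is_uniform and evaluate are hypothetical stand-ins for the functions above, and the sketch assumes that rotateDown (not shown in the question) is a one-step circular shift, which np.roll can reproduce.

```python
import numpy as np


def to_binvec(values, thres):
    """Threshold neighbour values against the centre pixel.

    Returns a 1-D boolean array: True where value >= thres.
    """
    return np.asarray(values) >= thres


def is_uniform(binvec):
    """Check uniformity: at most 2 bits change under a one-step rotation.

    Assumption: np.roll(binvec, 1) stands in for the question's rotateDown.
    """
    return bool(np.count_nonzero(binvec != np.roll(binvec, 1)) <= 2)


def evaluate(binvec):
    """Decimal value of the pattern, with element i weighted by 2**i."""
    weights = 2 ** np.arange(binvec.size, dtype=np.int64)
    return int(np.dot(binvec.astype(np.int64), weights))


# Example with 8 neighbour values and a centre value of 5
neighbours = np.array([5, 9, 1, 7, 3, 8, 2, 6])
pattern = to_binvec(neighbours, 5)
print(pattern.astype(int))   # bit pattern as 0/1
print(is_uniform(pattern))   # many 0/1 transitions, so not uniform
print(evaluate(pattern))     # bits 0, 1, 3, 5, 7 set: 1 + 2 + 8 + 32 + 128
```

A boolean array uses one byte per element rather than the eight of the default float array, and keeping it one-dimensional avoids accidental broadcasting in evaluate. Whether this is faster in practice depends on the data, so it is worth timing on real input.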