Speeding up arithmetic with the Python Decimal library

Time: 2014-04-16 22:54:57

Tags: python performance python-2.7 pagerank

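I'm trying to run something similar to Google's PageRank algorithm (for non-commercial purposes, of course). Here is the Python code; note that a[0] is the only thing that matters: it contains an n x n matrix such as [[0,1,1],[1,0,1],[1,1,0]]. You can also find my code on Wikipedia: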

import copy
import math

def GetNodeRanks(a):        # graph, names, size
    numIterations = 10
    adjacencyMatrix = copy.deepcopy(a[0])
    b = [1]*len(adjacencyMatrix)
    tmp = [0]*len(adjacencyMatrix)
    for i in range(numIterations):
        for j in range(len(adjacencyMatrix)):
            tmp[j] = 0
            for k in range(len(adjacencyMatrix)):
                tmp[j] = tmp[j] + adjacencyMatrix[j][k] * b[k]
        norm_sq = 0
        for j in range(len(adjacencyMatrix)):
            norm_sq = norm_sq + tmp[j]*tmp[j]
        norm = math.sqrt(norm_sq)
        for j in range(len(b)):
            b[j] = tmp[j] / norm
    print b
    return b 

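When I run this implementation (n.b. on matrices much larger than 3 x 3), it doesn't produce enough precision to compute the ranks in a way that lets me compare them usefully. So I tried this instead: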

import copy
from decimal import *

getcontext().prec = 5

def GetNodeRanks(a):        # graph, names, size
    numIterations = 10
    adjacencyMatrix = copy.deepcopy(a[0])
    b = [Decimal(1)]*len(adjacencyMatrix)
    tmp = [Decimal(0)]*len(adjacencyMatrix)
    for i in range(numIterations):
        for j in range(len(adjacencyMatrix)):
            tmp[j] = Decimal(0)
            for k in range(len(adjacencyMatrix)):
                tmp[j] = Decimal(tmp[j] + adjacencyMatrix[j][k] * b[k])
        norm_sq = Decimal(0)
        for j in range(len(adjacencyMatrix)):
            norm_sq = Decimal(norm_sq + tmp[j]*tmp[j])
        norm = Decimal(norm_sq).sqrt
        for j in range(len(b)):
            b[j] = Decimal(tmp[j] / norm)
    print b
    return b 

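Even at this uselessly low precision, the code is extremely slow and never finished in the time I was willing to wait. Previously, the code was fast but not precise enough.

Is there a reasonable/simple way to make the code run both quickly and accurately?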

1 Answer:

Answer 0: (score: 1)

Some tips for speeding things up:

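  • Optimize the code inside loops
  • Move everything you can out of the inner loop
  • Don't recompute what is already known; store it in a variable
  • Don't do things that aren't necessary; skip them
  • Consider using list comprehensions; they are usually a bit faster
  • Stop optimizing once you reach an acceptable speed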
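
To make the first few tips concrete, here is a minimal sketch of a matrix-vector product with the loop-invariant work hoisted out of the inner loop. The helper name `mat_vec` is made up for illustration, and it assumes the matrix holds integers (a float times a Decimal would raise a TypeError). How much time this saves depends on your data, so measure before and after.

```python
from decimal import Decimal

def mat_vec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector of Decimals."""
    n = len(matrix)        # computed once, not on every loop pass
    zero = Decimal(0)      # built once instead of once per row
    result = []
    for j in range(n):
        row = matrix[j]    # row lookup moved out of the inner loop
        acc = zero         # accumulate in a local, not in result[j]
        for k in range(n):
            acc += row[k] * vec[k]
        result.append(acc)
    return result

ranks = mat_vec([[0, 1, 1], [1, 0, 1], [1, 1, 0]], [Decimal(1)] * 3)
```

With the sample matrix from the question, every row sums two ones, so each entry of `ranks` is `Decimal(2)`.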

Walking through your code:

import copy
from decimal import *

getcontext().prec = 5

def GetNodeRanks(a):        # graph, names, size
    # opt: pass in directly a[0], you do not use the rest
    numIterations = 10
    adjacencyMatrix = copy.deepcopy(a[0])
    # opt: why copy.deepcopy? You do not modify adjacencyMatrix
    b = [Decimal(1)]*len(adjacencyMatrix)
    # opt: You often call Decimal(1) and Decimal(0), it takes some time
    # do it only once like
    # dec_zero = Decimal(0)
    # dec_one = Decimal(1)
    # prepare also other, repeatedly used data structures
    # len_adjacencyMatrix = len(adjacencyMatrix)
    # adjacencyMatrix_range = range(len_adjacencyMatrix)
    # Replace code with pre-calculated variables yourself

    tmp = [Decimal(0)]*len(adjacencyMatrix)
    for i in range(numIterations):
        for j in range(len(adjacencyMatrix)):
            tmp[j] = Decimal(0)
            for k in range(len(adjacencyMatrix)):
                tmp[j] = Decimal(tmp[j] + adjacencyMatrix[j][k] * b[k])
        norm_sq = Decimal(0)
        for j in range(len(adjacencyMatrix)):
            norm_sq = Decimal(norm_sq + tmp[j]*tmp[j])
        norm = Decimal(norm_sq).sqrt()  # note: the original had .sqrt without parentheses; sqrt is a method and must be called
        for j in range(len(b)):
            b[j] = Decimal(tmp[j] / norm)
    print b
    return b 
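
On the `.sqrt` point raised in the comments above, here is a small check you can run yourself. It is only an illustration with made-up values: without parentheses you get a bound method object, and dividing a Decimal by it raises a TypeError rather than producing a number.

```python
from decimal import Decimal

norm_sq = Decimal(16)

not_called = norm_sq.sqrt   # bound method object, not a number
norm = norm_sq.sqrt()       # the actual square root, as a Decimal

# Dividing by the uncalled method fails instead of computing anything
try:
    Decimal(8) / not_called
    division_failed = False
except TypeError:
    division_failed = True
```

So the normalization step needs `norm_sq.sqrt()`; wrapping `norm_sq` in `Decimal(...)` again is not required, since it already is a Decimal.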

Now a few examples of how to optimize list processing in Python.

Using sum, change:

        norm_sq = Decimal(0)
        for j in range(len(adjacencyMatrix)):
            norm_sq = Decimal(norm_sq + tmp[j]*tmp[j])

to:

        norm_sq = sum(val*val for val in tmp)
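
As a quick sanity check that the two forms agree, here is a small sketch with made-up values. `sum` starts from the integer 0 by default, which adds cleanly to a Decimal; passing `Decimal(0)` as the start value is an optional extra (not in the original suggestion) that keeps the result a Decimal even for an empty list.

```python
from decimal import Decimal

tmp = [Decimal(3), Decimal(4), Decimal(12)]

# Loop version, as in the question
norm_sq_loop = Decimal(0)
for j in range(len(tmp)):
    norm_sq_loop = norm_sq_loop + tmp[j] * tmp[j]

# sum() version with a generator expression
norm_sq_sum = sum(val * val for val in tmp)

# Optional: explicit Decimal start value
norm_sq_start = sum((val * val for val in tmp), Decimal(0))

norm = norm_sq_sum.sqrt()
```

All three sums come out as `Decimal(169)` here, and `norm` is 13.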

A bit of list comprehension:

Change:

        for j in range(len(b)):
            b[j] = Decimal(tmp[j] / norm)

to:

        b = [Decimal(tmp_itm / norm) for tmp_itm in tmp]
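
A small illustration of that normalization step with made-up values. Dividing one Decimal by another already yields a Decimal, so the outer `Decimal(...)` call can be dropped without changing the result.

```python
from decimal import Decimal

tmp = [Decimal(3), Decimal(4)]
norm = Decimal(5)

# With the extra Decimal() wrapper, as written above
b_wrapped = [Decimal(tmp_itm / norm) for tmp_itm in tmp]

# Without it: the division result is already a Decimal
b = [tmp_itm / norm for tmp_itm in tmp]
```

Both lists hold `Decimal('0.6')` and `Decimal('0.8')`.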

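Once you get the hang of this coding style, you can optimize the initial loops as well, and you will probably find that some of the precomputed variables are no longer needed.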
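
Putting it together, one way the whole function might look is sketched below. Treat it as a starting point rather than a finished implementation: the name `get_node_ranks` and the `num_iterations` parameter are my own choices, it takes the matrix directly instead of `a`, it skips the deepcopy because the matrix is only read, and it assumes integer entries and a nonzero norm (an all-zero matrix would divide by zero). Whether it is fast enough for your matrices is something you would have to time.

```python
from decimal import Decimal

def get_node_ranks(adjacency_matrix, num_iterations=10):
    """Power iteration over a square matrix of integers, using Decimals."""
    zero = Decimal(0)
    b = [Decimal(1)] * len(adjacency_matrix)
    for _ in range(num_iterations):
        # Matrix-vector product: one dot product per row
        tmp = [sum((a_jk * b_k for a_jk, b_k in zip(row, b)), zero)
               for row in adjacency_matrix]
        # Euclidean norm of the product
        norm = sum((val * val for val in tmp), zero).sqrt()
        # Normalize so the vector keeps unit length
        b = [val / norm for val in tmp]
    return b

ranks = get_node_ranks([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
```

For the symmetric 3 x 3 example from the question, all three ranks come out equal, as you would expect.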