Question

I'm trying to implement Smith-Waterman alignment in parallel using Julia (see Figure 1 of http://www.cs.virginia.edu/~rl6sf/paper_dump/2011:12:33:22.pdf), but the parallel version runs much slower than the serial version. I'm using shared arrays and suspect I am doing something silly that slows the code down. Could someone take a look and tell me whether my code is as optimized as possible? The parallel version should run faster than the serial one.

The basic idea is to compute the elements of each anti-diagonal of a matrix in parallel, moving from the upper-left to the lower-right corner. I'm trying to use 32 cores on a shared-memory machine. I have a SharedArray matrix and compute the elements of each anti-diagonal in parallel, as shown below. The while loops in the spSW function submit tasks to the workers, synchronizing after each anti-diagonal, using the helper function shared_get_score!(). The main goal of this function is to fill in each element of the shared arrays "matrix" and "path".
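To make the traversal order concrete, here is a minimal serial sketch of how the cells group into anti-diagonals. The names antidiagonals and walk are hypothetical and not part of my code; the sketch only lists the (irow, jcol) pairs in the order the two loops in spSW visit them, assuming row 1 and column 1 hold the initial values.

```julia
# Hypothetical helper (not part of the original code): list the (irow, jcol)
# cells of every anti-diagonal of a row x col matrix, skipping row 1 and
# column 1, which are assumed to hold the initial values.
function antidiagonals(row::Int, col::Int)
    diags = Vector{Vector{Tuple{Int,Int}}}()
    # Anti-diagonals that start in row 2 and run down and to the left.
    for j in 2:col
        push!(diags, walk(2, j, row))
    end
    # Remaining anti-diagonals, which start in the last column.
    for i in 3:row
        push!(diags, walk(i, col, row))
    end
    return diags
end

# Collect cells starting at (irow, jcol), moving down-left until the walk
# leaves the matrix.
function walk(irow::Int, jcol::Int, row::Int)
    cells = Tuple{Int,Int}[]
    while jcol > 1 && irow <= row
        push!(cells, (irow, jcol))
        irow += 1
        jcol -= 1
    end
    return cells
end
```

All cells within one inner vector share the same value of irow + jcol, so they can be computed independently of each other once the previous anti-diagonals are done. That is why each anti-diagonal is wrapped in its own @sync block.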


The spSW function is:

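# Fill the score matrix and the traceback path one anti-diagonal at a time,
# spreading the cells of each anti-diagonal over the first p workers.
function spSW(seq1,seq2,p)
    indel = -1   # gap (insertion/deletion) penalty
    match = 2    # score for a matching pair of characters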

    seq1 = "^$seq1"
    seq2 = "^$seq2"

    col = length(seq1)
    row = length(seq2)

    wl = workers()

    matrix,path = shared_initialize_path(seq1,seq2)

    for j = 2:col
        jcol = j
        irow = 2
        @sync begin
            count = 0
            while jcol > 1 && irow < row + 1
                # true when the characters at this cell match
                equal = seq1[jcol] == seq2[irow]
                w = wl[(count % p) + 1]
                @async remotecall_wait(w,shared_get_score!,matrix,path,equal,indel,match,irow,jcol)
                jcol -= 1
                irow += 1
                count += 1
            end
        end
    end

    for i = 3:row
        jcol = col
        irow = i
        @sync begin
            count = 0
            while irow < row+1 && jcol > 1
                # true when the characters at this cell match
                equal = seq1[jcol] == seq2[irow]
                w = wl[(count % p) + 1]
                @async remotecall_wait(w,shared_get_score!,matrix,path,equal,indel,match,irow,jcol)
                jcol -= 1
                irow += 1
                count += 1
            end
        end
    end
    return matrix,path
end

Does anyone see an obvious way to make this run faster? Right now it's about 10 times slower than the serial version.

Why is my Julia shared array code running so slow?

0 answers: