Cache-Oblivious Matrix Transposition Implementation in C++

Date: 2017-11-29 13:30:12

Tags: c++ matrix transpose

I implemented an in-place cache-oblivious matrix transposition algorithm in C++, as shown below:

void CacheObliviousTransposition(int x, int delx, int y, int dely, int N, int* matrix) {
    if ((delx == 1) && (dely == 1)) {
        int tmp = matrix[(N*y) + x];
        matrix[(N*y) + x] = matrix[(N*x) + y];
        matrix[(N*x) + y] = tmp;
        return;
    }

    if (delx >= dely) {
        int xmid = delx / 2;
        CacheObliviousTransposition(x, xmid, y, dely, N, matrix);
        CacheObliviousTransposition(x + xmid, delx - xmid, y, dely, N, matrix);
        return;
    }

    int ymid = dely / 2;
    CacheObliviousTransposition(x, delx, y, ymid, N, matrix);
    CacheObliviousTransposition(x, delx, y + ymid, dely - ymid, N, matrix);
}

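However, when I call the function below after the transposition to verify that it worked, the if branch is entered, so I assume something in the implementation must be wrong.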

void CheckTransposition(int N, int* matrix)
{
    for (int i = 0; i < N; i++)
    {
        for (int j = 0; j < N; j++)
        {
            if (matrix[(i*N) + j] != (j*N) + i + 42)
            {
                cout << "Transposition failed at i=" << i << ", j=" << j << "\n";
            } 
        }
    }
}

Can anyone help me figure out what is wrong?

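Note: the variable matrix is a dynamically allocated integer array, created as shown below. The matrix is stored row by row in N * N contiguous memory locations.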

int* MatrixInit(int N)
{

    int* matrix = new int[N*N];

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            matrix[(i*N) + j] = (i*N) + j + 42;
        }
    }

    return matrix;
}

1 answer:

Answer 0 (score: 2)

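The code above transposes your elements twice. For example, once CacheObliviousTransposition reaches the single element [0,1], it swaps it with [1,0]. However, a separate branch of the recursion later reaches [1,0] and swaps it with [0,1] again. In the end, all elements are back in their original positions.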
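To illustrate the double swap, here is a minimal sketch. The helpers `SwapEveryCell` and `SwapBelowDiagonal` are hypothetical names introduced only for this illustration. Plain loops stand in for the recursion, so only the behavior of the leaf swap is modeled, not the cache-oblivious traversal order.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: applies the unguarded leaf swap at every cell (x, y)
// of a row-major N x N matrix. Each off-diagonal pair is swapped twice,
// so the matrix ends up unchanged.
std::vector<int> SwapEveryCell(std::vector<int> m, int N) {
    assert(N >= 0 && m.size() == static_cast<std::size_t>(N) * N);
    for (int y = 0; y < N; ++y) {
        for (int x = 0; x < N; ++x) {
            std::swap(m[N * y + x], m[N * x + y]);
        }
    }
    return m;
}

// Hypothetical helper: applies the swap only where x < y, so each
// off-diagonal pair is swapped exactly once and the result is the transpose.
std::vector<int> SwapBelowDiagonal(std::vector<int> m, int N) {
    assert(N >= 0 && m.size() == static_cast<std::size_t>(N) * N);
    for (int y = 0; y < N; ++y) {
        for (int x = 0; x < y; ++x) {
            std::swap(m[N * y + x], m[N * x + y]);
        }
    }
    return m;
}
```

For the 2 x 2 matrix {1, 2, 3, 4}, `SwapEveryCell` returns the input unchanged, while `SwapBelowDiagonal` returns {1, 3, 2, 4}.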

To ensure that each element is swapped only once, you can check that x is less than y before swapping:

void CacheObliviousTransposition(int x, int delx, int y, int dely, int N, int* matrix) {
    if ((delx == 1) && (dely == 1)) {
        // Swap each off-diagonal pair only once; skip the mirrored cell and the diagonal.
        if (x < y)
        {
            int tmp = matrix[(N*y) + x];
            matrix[(N*y) + x] = matrix[(N*x) + y];
            matrix[(N*x) + y] = tmp;
        }
        return;
    }

    if (delx >= dely) {
        int xmid = delx / 2;
        CacheObliviousTransposition(x, xmid, y, dely, N, matrix);
        CacheObliviousTransposition(x + xmid, delx - xmid, y, dely, N, matrix);
        return;
    }

    int ymid = dely / 2;
    CacheObliviousTransposition(x, delx, y, ymid, N, matrix);
    CacheObliviousTransposition(x, delx, y + ymid, dely - ymid, N, matrix);
}
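A small harness along these lines can confirm the fix. This is a sketch under some assumptions: `TransposeAndVerify` is a hypothetical helper, a `std::vector` replaces the `new int[N*N]` allocation from the question, and the initial call `CacheObliviousTransposition(0, N, 0, N, N, ...)` is assumed because the question does not show how the function is invoked. It requires N >= 1, since the recursion as written never terminates for an empty range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The guarded version from the answer, repeated so the sketch is self-contained.
void CacheObliviousTransposition(int x, int delx, int y, int dely, int N, int* matrix) {
    if ((delx == 1) && (dely == 1)) {
        if (x < y) {  // swap each off-diagonal pair exactly once
            int tmp = matrix[(N * y) + x];
            matrix[(N * y) + x] = matrix[(N * x) + y];
            matrix[(N * x) + y] = tmp;
        }
        return;
    }
    if (delx >= dely) {  // split the wider dimension in half
        int xmid = delx / 2;
        CacheObliviousTransposition(x, xmid, y, dely, N, matrix);
        CacheObliviousTransposition(x + xmid, delx - xmid, y, dely, N, matrix);
        return;
    }
    int ymid = dely / 2;
    CacheObliviousTransposition(x, delx, y, ymid, N, matrix);
    CacheObliviousTransposition(x, delx, y + ymid, dely - ymid, N, matrix);
}

// Hypothetical helper: fills an N x N matrix as MatrixInit does, transposes it,
// and reports whether every cell holds the value of its mirrored position.
bool TransposeAndVerify(int N) {
    assert(N >= 1);  // the recursion above does not terminate for N == 0
    std::vector<int> matrix(static_cast<std::size_t>(N) * N);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            matrix[(i * N) + j] = (i * N) + j + 42;
        }
    }
    // Assumed initial call: cover the full range in both dimensions.
    CacheObliviousTransposition(0, N, 0, N, N, matrix.data());
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            if (matrix[(i * N) + j] != (j * N) + i + 42) {
                return false;
            }
        }
    }
    return true;
}
```

Calling `TransposeAndVerify` with sizes that are not powers of two, such as 7, also exercises the uneven splits produced by `delx / 2` and `dely / 2`.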