How do I read or write interleaved arrays?

Time: 2017-09-08 15:58:08

Tags: c vectorization

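My data structure contains 8 bit vectors, each 64 bits long. However, the bytes of these vectors are interleaved in the data structure rather than stored one after another: each successive byte of a given bit vector sits 8 bytes after the previous one. Is there an efficient way (such as parallel loads and stores) to move data between these interleaved arrays and 64-bit words on existing x86-64 CPUs? C code with embedded asm is fine, but a solution using gcc intrinsics would be better.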

1 Answer:

Answer 0: (score: 0)

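I am not aware of any x64 CPU instruction that handles interleaved data directly. However, since these CPUs are very fast at shifts and indexed memory access, I would use the following approach: 8 inline shift/copy operations per value, leaving the rest to the optimizer: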

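/* Scatter the 8 bytes of value into the interleaved layout.
   offset is taken to be the index (0..7) of the bit vector;
   byte k of that vector lives at bytes [offset + 8 * k]. */
void Write (unsigned char*     bytes,
            unsigned long long value,
            int                offset)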
    {
    bytes [offset     ] = (unsigned char) (value      );
    bytes [offset +  8] = (unsigned char) (value >>  8);
    bytes [offset + 16] = (unsigned char) (value >> 16);
    bytes [offset + 24] = (unsigned char) (value >> 24);
    bytes [offset + 32] = (unsigned char) (value >> 32);
    bytes [offset + 40] = (unsigned char) (value >> 40);
    bytes [offset + 48] = (unsigned char) (value >> 48);
    bytes [offset + 56] = (unsigned char) (value >> 56);
    return;
    }

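/* Gather the 8 interleaved bytes of bit vector offset (0..7)
   back into a single 64-bit word, stored through value. */
void Read (unsigned char*      bytes,
           unsigned long long* value,
           int                 offset)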
    {
    *value = ((unsigned long long) bytes [offset     ]      ) |
             ((unsigned long long) bytes [offset +  8] <<  8) |
             ((unsigned long long) bytes [offset + 16] << 16) |
             ((unsigned long long) bytes [offset + 24] << 24) |
             ((unsigned long long) bytes [offset + 32] << 32) |
             ((unsigned long long) bytes [offset + 40] << 40) |
             ((unsigned long long) bytes [offset + 48] << 48) |
             ((unsigned long long) bytes [offset + 56] << 56);
    return;
    }

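This code stores the 64-bit value in little-endian order: the least significant byte goes to the lowest address. For big-endian, simply read/write the bytes in the reverse order.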
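If all 8 vectors are needed at once, the same byte mapping can be written as a pair of loops over the whole 64-byte block, which amounts to an 8x8 byte transpose. The following is only a minimal sketch under that assumption: the names ReadAll, WriteAll and InterleaveSelfTest are hypothetical and not part of the code above, and whether gcc turns these loops into vector shuffles depends on the compiler version and flags, which I have not measured.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: gather all 8 interleaved vectors from one
   64-byte block. Byte k of vector j is stored at block [j + 8 * k],
   in the same little-endian order as Read/Write above. */
void ReadAll (const unsigned char block [64], uint64_t out [8])
{
    for (int j = 0; j < 8; j++) {
        uint64_t v = 0;
        for (int k = 0; k < 8; k++)
            v |= (uint64_t) block [j + 8 * k] << (8 * k);
        out [j] = v;
    }
}

/* Hypothetical helper: scatter 8 words back into the interleaved block. */
void WriteAll (unsigned char block [64], const uint64_t in [8])
{
    for (int j = 0; j < 8; j++)
        for (int k = 0; k < 8; k++)
            block [j + 8 * k] = (unsigned char) (in [j] >> (8 * k));
}

/* Round-trip check: write 8 known words, inspect the layout,
   read them back. Returns 1 if every check passed. */
int InterleaveSelfTest (void)
{
    unsigned char block [64];
    uint64_t in [8], out [8];

    for (int j = 0; j < 8; j++)
        in [j] = 0x0102030405060708ULL * (uint64_t) (j + 1);

    WriteAll (block, in);

    /* Vector 0 is 0x0102030405060708: its low byte lands at index 0,
       the next byte 8 positions later, the high byte at index 56. */
    assert (block [0]  == 0x08);
    assert (block [8]  == 0x07);
    assert (block [56] == 0x01);

    ReadAll (block, out);
    for (int j = 0; j < 8; j++)
        assert (out [j] == in [j]);
    return 1;
}
```

Calling InterleaveSelfTest () from a test harness confirms that the layout matches the per-vector Read and Write above. Handling the whole block in one call gives the optimizer a fixed 64-byte working set, which is the shape a hand-written SIMD transpose would also operate on.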