Question

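I am trying to perform an arbitrary rotate-left (ROTL) operation on a 64-bit word that is currently represented as a byte array of 8 bytes on a JavaCard smart card.

The ugly way is to hard-code all 64 possible ROTL amounts for a 64-bit word represented as an 8-byte array, but that just bloats the whole code base.

How can I make this leaner, so that I can perform any amount of ROTL on a 64-bit word (in a byte array) on demand, on the fly, using only the byte and short types (since JavaCard does not recognize wider types such as int or long), without hard-coding all the ROTL64 permutations?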

Answer 1

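The following method performs a right rotation, in place, on a byte range of any size within a buffer, without requiring an additional output or temporary array: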

/**
 * Rotates the indicated bytes in the given buffer to the right by a certain
 * number of bits.
 * 
 * @param buf
 *            the buffer in which the bits need to be rotated
 * @param off
 *            the offset in the buffer where the rotation needs to start
 * @param len
 *            the amount of bytes
 * @param rot
 *            the amount to rotate in bits (a short, so at most a 15 bit non-negative value)
 */
public static void rotr(byte[] buf, short off, short len, short rot) {
    if (len == 0) {
        // nothing to rotate (avoid division by 0)
        return;
    }

    final short lenBits = (short) (len * BYTE_SIZE);
    // reduce rot into [0, lenBits); Java's % keeps the sign of the dividend,
    // so a negative remainder is moved back into range
    rot = (short) (rot % lenBits);
    if (rot < 0) {
        rot += lenBits;
    }

    // reused variables for byte and bit shift
    short shift, i;
    byte t1, t2;

    // --- byte shift

    shift = (short) (rot / BYTE_SIZE);

    // only shift when actually required
    if (shift != 0) {

        // values will never be used, src == start at the beginning
        short start = -1, src = -1, dest;

        // compiler is too stupid to see t1 will be assigned anyway
        t1 = 0;

        // go over all the bytes, but in stepwise fashion
        for (i = 0; i < len; i++) {
            // if we get back to the start
            // ... then we need to continue one position to the right
            if (src == start) {
                start++;
                t1 = buf[(short) (off + (++src))];
            }

            // calculate next location by stepping by the shift amount
            // ... modulus the length of course
            dest = (short) ((src + shift) % len);

            // save value, this will be the next one to be stored
            t2 = buf[(short) (off + dest)];
            // store value, doing the actual shift
            buf[(short) (off + dest)] = t1;

            // do the step
            src = dest;
            // we're going to store t1, not t2
            t1 = t2;
        }
    }

    // --- bit shift

    shift = (short) (rot % BYTE_SIZE);

    // only shift when actually required
    if (shift != 0) {

        // t1 holds previous byte, at other side
        t1 = buf[(short) (off + len - 1)];
        for (i = 0; i < len; i++) {
            t2 = buf[(short) (off + i)];
            // take bits from previous byte and this byte together
            buf[(short) (off + i)] = (byte) ((t1 << (BYTE_SIZE - shift)) | ((t2 & BYTE_MASK) >> shift));
            // current byte is now previous byte
            t1 = t2;
        }
    }
}

private static final short BYTE_MASK = 0xFF;
private static final short BYTE_SIZE = 8;

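The disadvantage is that it requires two passes over all the data: one byte shift and one bit shift. Each pass is skipped when it is not required (you can easily remove these checks if you know the skip never happens).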

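Here is the left rotation. Implementing a left rotation directly is harder and needs more calculations, so the cost of the extra method call is probably offset by that. If you rotate by a literal amount, you can of course just call rotr and calculate the rotation amount yourself.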

public static void rotl(byte[] buf, short off, short len, short bits) {
    if (len == 0) {
        // nothing to rotate (avoid division by 0)
        return;
    }
    final short lenBits = (short) (len * BYTE_SIZE);
    bits = (short) ((bits + lenBits) % lenBits);
    // we don't care if we pass 0 or lenBits, rotr will adjust
    rotr(buf, off, len, (short) (lenBits - bits));
}
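
The identity that rotl relies on is that rotating left by n bits equals rotating right by (width - n) bits. As a minimal sketch of "calculating the amount yourself", the hypothetical helper below (the class and method names are not part of the answer) converts a left-rotation amount into the equivalent right-rotation amount, assuming lenBits is positive:

```java
final class RotlAmount {
    /**
     * Returns the right-rotation amount that is equivalent to rotating
     * left by 'bits' within a word of 'lenBits' bits.
     * Assumption: lenBits > 0.
     */
    static short toRotrAmount(short lenBits, short bits) {
        // bring the left amount into [0, lenBits)
        short left = (short) (bits % lenBits);
        if (left < 0) {
            left += lenBits;
        }
        // a left rotation by 0 maps to a right rotation by 0, not lenBits
        return (short) ((lenBits - left) % lenBits);
    }
}
```

For a literal rotation the result can simply be precomputed: a left rotation of a 64-bit word by 13 bits is a right rotation by 51 bits, so the call would pass 51 to rotr directly.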

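Note: a previous version required a rotation over exactly 64 bits, ran in closer to constant time, and did not include the offset. It also required a 64-bit specific array of loop constants (now replaced by the more generic if statement inside the for loop of the byte shift). See the edit history for the other versions.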

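Rotation becomes much easier when an output buffer is available: this implementation could consist of just the initialization part and the last 4 lines of code (the combined bit and byte rotation loop). Ciphers commonly rotate by a single fixed, odd amount, so in that case only those last 4 lines are needed, without the optimization for the case where no bit shift is required.

I deliberately used a slightly different interface that assumes a 64-bit rotation, just to show a slightly different implementation.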

public static void rotr64(byte[] inBuf, short inOff, byte[] outBuf, short outOff, short rot) {
    short byteRot = (short) ((rot & 0b00111000) >> 3); 
    short bitRot = (short) (rot & 0b00000111); 

    if (bitRot == 0) {
        // --- only byte rotation
        // note: byteRot == 0 degenerates to a plain copy, so outBuf is
        // always written, even when no rotation is requested
        for (short i = 0; i < LONG_BYTES; i++) {
            outBuf[(short) (outOff + (i + byteRot) % LONG_BYTES)] = inBuf[(short) (inOff + i)];
        }
    } else {
        // --- bit- and possibly byte rotation
        // note: also works for all other situations, but slower

        // put the last byte in t_lo
        short t = (short) (inBuf[(short) (inOff + LONG_BYTES - 1)] & BYTE_MASK);
        for (short i = 0; i < LONG_BYTES; i++) {
            // shift t_lo into t_hi and add the next byte into t_lo
            t = (short) (t << BYTE_SIZE | (inBuf[(short) (inOff + i)] & BYTE_MASK));
            // find the byte to receive the shifted value within the short 
            outBuf[(short) (outOff + (i + byteRot) % LONG_BYTES)] = (byte) (t >> bitRot); 
        }
    }
}

private static final short LONG_BYTES = 8;
private static final short BYTE_MASK = 0xFF;
private static final short BYTE_SIZE = 8;

If the offsets are always zero, this can be simplified further.
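
As a rough sketch of that simplification (not the answer's own code), the variant below drops both offsets and keeps only the general loop, which also covers the pure byte rotation at a small extra cost. The class name Rotr64Simplified is made up for illustration, and it assumes in and out are distinct 8-byte arrays with the most significant byte at index 0:

```java
final class Rotr64Simplified {
    private static final short LONG_BYTES = 8;
    private static final short BYTE_MASK = 0xFF;
    private static final short BYTE_SIZE = 8;

    /**
     * Rotates the 64-bit big-endian word in 'in' right by 'rot' bits
     * (only the low 6 bits of rot are used) and writes it to 'out'.
     * Assumption: in != out and both hold at least 8 bytes.
     */
    static void rotr64(byte[] in, byte[] out, short rot) {
        short byteRot = (short) ((rot & 0x38) >> 3); // whole bytes to move
        short bitRot = (short) (rot & 0x07);         // remaining bits, 0..7

        // low byte of t starts as the last input byte (the "previous" byte)
        short t = (short) (in[(short) (LONG_BYTES - 1)] & BYTE_MASK);
        for (short i = 0; i < LONG_BYTES; i++) {
            // previous byte moves to the high half, current byte to the low half
            t = (short) ((t << BYTE_SIZE) | (in[i] & BYTE_MASK));
            // the byte cast keeps exactly the 8 bits that belong at this position
            out[(short) ((i + byteRot) % LONG_BYTES)] = (byte) (t >> bitRot);
        }
    }
}
```

With a fixed rotation amount, byteRot and bitRot become constants and the two mask operations disappear as well.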

If you need the generic functionality, here is the left rotation:

public static void rotl64(byte[] inBuf, short inOff, byte[] outBuf, short outOff, short rot) {
    rotr64(inBuf, inOff, outBuf, outOff, (short) ((64 - rot) & 0b00111111));
}

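Everything was tested against random input (a million runs or so, which takes less than a second on Java SE), although I give no guarantees about the tests. Please test it yourself.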
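
For anyone who wants to repeat such a test on Java SE, one possible reference model is sketched below. It is not the harness used for the answer; the class Rot64Reference and its methods are hypothetical, it uses long and Long.rotateLeft / Long.rotateRight (so it is for desktop testing only, not for the card), and it assumes the most significant byte is at index 0, as in the routines above:

```java
final class Rot64Reference {
    /** Reads 8 bytes starting at off as a big-endian 64-bit value. */
    static long toLong(byte[] buf, int off) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (buf[off + i] & 0xFFL);
        }
        return v;
    }

    /** Writes v as 8 big-endian bytes starting at off. */
    static void fromLong(long v, byte[] buf, int off) {
        for (int i = 7; i >= 0; i--) {
            buf[off + i] = (byte) v;
            v >>>= 8;
        }
    }

    /** Expected result of rotating the 8-byte word in 'in' left by rot bits. */
    static byte[] rotl64(byte[] in, int rot) {
        byte[] out = new byte[8];
        fromLong(Long.rotateLeft(toLong(in, 0), rot), out, 0);
        return out;
    }

    /** Expected result of rotating the 8-byte word in 'in' right by rot bits. */
    static byte[] rotr64(byte[] in, int rot) {
        byte[] out = new byte[8];
        fromLong(Long.rotateRight(toLong(in, 0), rot), out, 0);
        return out;
    }
}
```

A test loop would fill an 8-byte array with random data, pick a random rotation amount, run the byte-array routine, and compare its output with the array returned by the reference model.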

Answer 2

A very simple implementation that receives the four shorts as separate parameters:

public static void rotateRight64(short x3, short x2, short x1, short x0,
                                 short rotAmount, short[] out)
{
    assert out.length == 4;
    rotAmount &= (1 << 6) - 1;  // limit the range to 0..63
    if (rotAmount >= 16)
        // rotating by 16 bits just moves every limb down one position
        rotateRight64(x0, x3, x2, x1, (short)(rotAmount - 16), out);
    else
    {
        // mask to 16 bits first: a short is sign-extended to int before >>>
        out[0] = (short)(((x0 & 0xFFFF) >>> rotAmount) | (x1 << (16-rotAmount)));
        out[1] = (short)(((x1 & 0xFFFF) >>> rotAmount) | (x2 << (16-rotAmount)));
        out[2] = (short)(((x2 & 0xFFFF) >>> rotAmount) | (x3 << (16-rotAmount)));
        out[3] = (short)(((x3 & 0xFFFF) >>> rotAmount) | (x0 << (16-rotAmount)));
    }
}

This is a right rotation, but a left rotation is easily obtained by rotating right by 64 - rotAmount.

Alternatively, it can be done like this, without the separate coarse (whole-limb) shifting step:

public static void rotateRight(short[] out, short[] in, short rotAmount) // in ror rotAmount
{
    // out and in must be different arrays; in[0] is the least significant limb
    assert out.length == 4 && in.length == 4 && rotAmount >= 0 && rotAmount < 64;

    final short shift     = (short)(rotAmount % 16);
    final short limbshift = (short)(rotAmount / 16);

    for (short i = 0; i < 4; ++i)
    {
        short lo = (short)((i + limbshift) % 4);  // limb supplying the low bits
        short hi = (short)((lo + 1) % 4);         // next limb, wrapping around
        // mask to 16 bits first: a short is sign-extended to int before >>>
        out[i] = (short)(((in[lo] & 0xFFFF) >>> shift) | (in[hi] << (16 - shift)));
    }
}

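This way, it can easily be changed into an arbitrary-precision shift/rotate.

If the input array is a byte array, you can change short[4] to byte[8] and change all the constants from 16 → 8 and 4 → 8. In fact, they can be generalized without any problem; I just hard-coded them to keep things simple and easy to understand.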
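
A minimal sketch of that byte-array generalization is given below, under a few assumptions that are not spelled out in the answer: the class name LimbRotate is made up, in[0] holds the least significant byte (matching the limb order used above), out and in are different arrays of equal length, and rotAmount lies in the range 0 to in.length * 8 - 1:

```java
final class LimbRotate {
    /**
     * Rotates the little-endian value in 'in' right by rotAmount bits
     * and stores the result in 'out'.
     * Assumptions: out != in, out.length == in.length,
     * 0 <= rotAmount < in.length * 8.
     */
    static void rotateRight(byte[] out, byte[] in, short rotAmount) {
        final short n = (short) in.length;               // number of limbs
        final short shift = (short) (rotAmount % 8);     // bits within a limb
        final short limbShift = (short) (rotAmount / 8); // whole limbs

        for (short i = 0; i < n; ++i) {
            short lo = (short) ((i + limbShift) % n);    // limb supplying the low bits
            short hi = (short) ((lo + 1) % n);           // next limb, wrapping around
            // mask first: a byte is sign-extended to int before >>>
            out[i] = (byte) (((in[lo] & 0xFF) >>> shift) | (in[hi] << (8 - shift)));
        }
    }
}
```

Because the limb count is taken from the array length, the same loop handles 8 bytes for a 64-bit word or any other width.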

Rotate left a 64-bit word byte array in JavaCard