Optimizing btwoc()

Date: 2011-08-28 04:55:33

Tags: c openid

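According to OpenID Authentication 2.0, section 4.2:

Arbitrary precision integers MUST be encoded as big-endian signed two's complement binary strings. Henceforth, "btwoc" is a function that takes an arbitrary precision integer and returns its shortest big-endian two's complement representation. All integers that are used with Diffie-Hellman Key Exchange are positive. This means that the left-most bit of the two's complement representation MUST be zero. If it is not, implementations MUST add a zero byte at the front of the string.

Non-normative example: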

Base 10 number | btwoc string representation
---------------+----------------------------
0              | "\x00"
127            | "\x7F"
128            | "\x00\x80"
255            | "\x00\xFF"
32768          | "\x00\x80\x00"
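
As a quick check on the table, the byte count of the btwoc string can be worked out on its own. The following is a minimal sketch, not taken from the specification or from the code below; the name btwoc_len is hypothetical, and it assumes the value fits in a uintmax_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes in the btwoc string for `v`.
   Each pass drops the lowest byte; the loop stops once what remains
   fits in 7 bits, i.e. once the leading byte has a clear top bit. */
static size_t btwoc_len(uintmax_t v)
{
    size_t n = 1;            /* zero still occupies one byte */
    while (v > 0x7F) {
        n++;
        v >>= 8;
    }
    return n;
}
```

For the table rows this gives 1 byte for 0 and 127, 2 bytes for 128 and 255, and 3 bytes for 32768. When the top byte is 0x80 or above, the loop runs once more, which accounts for the leading zero byte.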

I tried writing my own btwoc implementation in C, and this is what I have:

typedef struct {
    uint8_t *data;
    uintmax_t length;
} oid_data;

oid_data *oid_btwoc_r(uintmax_t value, oid_data *ret) {
    unsigned    fnz = sizeof(uintmax_t) + 1,
                i = sizeof(uintmax_t) * 8;
    while (--fnz && (!(value >> (i -= 8) & 0xFF)));
    /*
        If `value' is non-zero, `fnz' now contains the index of the first
        non-zero byte in `value', where 1 refers to the least-significant byte.
        `fnz' will therefore be in the range [1 .. sizeof(uintmax_t)]. If
        `value' is zero, then `fnz' is zero.
    */
    if (!value) {
        /* The value is zero */
        ret->length = 1;
        ret->data[0] = 0;
    } else if (value >> ((fnz - 1) * 8 + 7)) {
        /* The most significant bit of the first non-zero byte is 1 */
        ret->length = fnz + 1;
        ret->data[0] = 0;
        for (i = 1; i <= fnz; i++)
            ret->data[1 + fnz - i] =
                value >> ((i - 1) * 8);
    } else {
        /* The most significant bit of the first non-zero byte is 0 */
        ret->length = fnz;
        for (i = 1; i <= fnz; i++)
            ret->data[fnz - i] =
                value >> ((i - 1) * 8);
    }
    return ret;
}

ret->data must point to at least sizeof(uintmax_t) + 1 bytes of valid memory.
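
The extra byte is needed when the most significant bit of a full-width value is set. To illustrate that contract, here is a rough sketch of a variant that takes the buffer capacity and refuses to write past it. The name btwoc_encode_checked and its return convention are assumptions made for illustration, not part of the code above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity-checked encoder. Returns the number of bytes
   written, or 0 if `buf` is NULL or too small for the encoding. */
static size_t btwoc_encode_checked(uintmax_t value, uint8_t *buf, size_t capacity)
{
    /* Count bytes, including the leading zero when the top bit is set. */
    size_t n = 1;
    for (uintmax_t v = value; v > 0x7F; v >>= 8)
        n++;

    if (buf == NULL || capacity < n)
        return 0;

    /* Fill from the least significant byte backwards (big-endian). */
    for (size_t i = n; i != 0; i--) {
        buf[i - 1] = (uint8_t)(value & 0xFF);
        value >>= 8;
    }
    return n;
}
```

With a buffer of sizeof(uintmax_t) + 1 bytes, UINTMAX_MAX encodes as a zero byte followed by sizeof(uintmax_t) bytes of 0xFF; with one byte less, the sketch returns 0 instead of overrunning.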

It works fine, and I haven't found any bugs in the implementation yet, but can it be optimized?

1 answer:

Answer 0 (score: 1)

If you keep zero as a special case, you should handle it before the while loop. (Personal bête noire: I dislike (!value) for (value == 0).)

I haven't tried timing it (yet), but this code produces the same answers as yours (at least on the values I tested; see below) and looks simpler to me, not least because it doesn't duplicate the code that handles the case where the high bit is set in the most significant byte:

void oid_btwoc_r2(uintmax_t value, oid_data *ret)
{
    if (value == 0)
    {
        ret->data[0] = 0;
        ret->length  = 1;
        return;
    }

    uintmax_t v0 = value;
    uintmax_t v1 = v0;
    unsigned  n  = 0;

    while (v0 != 0)
    {
        n++;
        v1 = v0;
        v0 >>= 8;
    }
    //printf("Value: 0x%" PRIXMAX ", v1 = 0x%" PRIXMAX "\n", value, v1);

    assert(v1 < 0x100 && v1 != 0);
    if (v1 > 0x7F)
        n++;
    //printf("N = %u\n", n);

    for (unsigned i = n; i != 0; i--)
    {
        ret->data[i-1] = (value & 0xFF);
        value >>= 8;
    }
    ret->length = n;
}

The code I used to test this against your version is:

#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

#define DIM(x) (sizeof(x) / sizeof(*(x)))

static void dump_oid_data(FILE *fp, const char *tag, const oid_data *data)
{
    fprintf(fp, "%s: length %" PRIuMAX ":", tag, data->length);
    for (uintmax_t i = 0; i < data->length; i++)
        fprintf(fp, " 0x%02X", data->data[i]);
    fputc('\n', fp);
}

int main(void)
{
    uintmax_t list[] = { 0, 0x7F, 0x80, 0xFF, 0x100, 0x7FFF, 0x8000, 0xFFFF,
                         0x10000, 0x7FFFFF, 0x800000, 0xFFFFFF };

    for (size_t i = 0; i < DIM(list); i++)
    {
        uint8_t b1[sizeof(uintmax_t) + 1];
        uint8_t b2[sizeof(uintmax_t) + 1];
        oid_data v1 = { b1, sizeof(b1) };
        oid_data v2 = { b2, sizeof(b2) };
        oid_btwoc_r(list[i], &v1);
        oid_btwoc_r2(list[i], &v2);
        printf("Value: 0x%" PRIXMAX ": ", list[i]);
        if (v1.length != v2.length)
            printf("Lengths differ (%" PRIuMAX " vs %" PRIuMAX ")\n", v1.length, v2.length);
        else if (memcmp(v1.data, v2.data, v1.length) != 0)
        {
            printf("Data differs!\n");
            dump_oid_data(stdout, "oid_btwoc_r1()", &v1);
            dump_oid_data(stdout, "oid_btwoc_r2()", &v2);
        }
        else
        {
            printf("Values are the same\n");
            dump_oid_data(stdout, "oid_btwoc_rN()", &v2);
        }
    }
    return(0);
}

Yes; early versions of the code needed the dump function to see what was going wrong!

Timing

I changed the original oid_btwoc_r() into a function returning void (with no final return) and renamed it oid_btwoc_r1(). I then ran a timing test (on a Mac Mini running MacOS X Lion 10.7.1) and got these timing results:

oid_btwoc_r1(): 4.925386
oid_btwoc_r2(): 4.022604
oid_btwoc_r1(): 4.930649
oid_btwoc_r2(): 4.004344
oid_btwoc_r1(): 4.927602
oid_btwoc_r2(): 4.005756
oid_btwoc_r1(): 4.923356
oid_btwoc_r2(): 4.007910
oid_btwoc_r1(): 4.984037
oid_btwoc_r2(): 4.202986
oid_btwoc_r1(): 5.015747
oid_btwoc_r2(): 4.067265
oid_btwoc_r1(): 4.982333
oid_btwoc_r2(): 4.019807
oid_btwoc_r1(): 4.957866
oid_btwoc_r2(): 4.074712
oid_btwoc_r1(): 4.993991
oid_btwoc_r2(): 4.042422
oid_btwoc_r1(): 4.970930
oid_btwoc_r2(): 4.077203

So it appears that the oid_btwoc_r2() function is about 20% faster than the original, at least on the data tested. Bigger numbers shift the balance in favour of oid_btwoc_r1(), which is then about 20% faster:

oid_btwoc_r1(): 3.671201
oid_btwoc_r2(): 4.605171
oid_btwoc_r1(): 3.669026
oid_btwoc_r2(): 4.575745
oid_btwoc_r1(): 3.673729
oid_btwoc_r2(): 4.659433
oid_btwoc_r1(): 3.684662
oid_btwoc_r2(): 4.671654
oid_btwoc_r1(): 3.730757
oid_btwoc_r2(): 4.645485
oid_btwoc_r1(): 3.764600
oid_btwoc_r2(): 4.673244
oid_btwoc_r1(): 3.669582
oid_btwoc_r2(): 4.610177
oid_btwoc_r1(): 3.664248
oid_btwoc_r2(): 4.813711
oid_btwoc_r1(): 3.675927
oid_btwoc_r2(): 4.630148
oid_btwoc_r1(): 3.681798
oid_btwoc_r2(): 4.614129

Since big numbers are probably more likely than small numbers in this context, oid_btwoc_r1() (or the original oid_btwoc_r()) is arguably the better choice.

The test code follows. The uncommented for loop is the "large number" version, which shows oid_btwoc_r1() working faster than oid_btwoc_r2(); the commented-out for loop is the "small number" version, which shows oid_btwoc_r2() working faster than oid_btwoc_r1().
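
The timing program itself does not appear in the text above. As a rough sketch of how such a comparison might be run, the code below times one encoder over a range of values. It assumes CPU time from the standard clock() function is an acceptable measure; the answer does not say how its figures were taken. The names btwoc_fn, btwoc_standin and time_encoder are hypothetical, and btwoc_standin exists only so the sketch compiles on its own; it would be replaced by oid_btwoc_r1() or oid_btwoc_r2().

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef struct {
    uint8_t *data;
    uintmax_t length;
} oid_data;

typedef void (*btwoc_fn)(uintmax_t, oid_data *);

/* Stand-in encoder (assumption): swap in the function under test. */
static void btwoc_standin(uintmax_t value, oid_data *ret)
{
    unsigned n = 1;
    for (uintmax_t v = value; v > 0x7F; v >>= 8)
        n++;
    for (unsigned i = n; i != 0; i--) {
        ret->data[i - 1] = (uint8_t)(value & 0xFF);
        value >>= 8;
    }
    ret->length = n;
}

/* Encode `count` values starting at `first`, `step` apart, and return
   the elapsed CPU time in seconds. The summed lengths are stored in
   *checksum so the calls have an observable result. */
static double time_encoder(btwoc_fn fn, uintmax_t first, uintmax_t step,
                           unsigned long count, uintmax_t *checksum)
{
    uint8_t buf[sizeof(uintmax_t) + 1];
    oid_data out = { buf, 0 };
    uintmax_t sum = 0;
    uintmax_t v = first;

    clock_t start = clock();
    for (unsigned long i = 0; i < count; i++, v += step) {
        fn(v, &out);
        sum += out.length;
    }
    clock_t end = clock();

    *checksum = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_encoder() once per encoder with the same arguments gives one pair of timings. A small first value exercises the "small number" case, and a value near UINTMAX_MAX exercises the "large number" case.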