Question

I am trying to use a macro to compute the hash of a constant C string at compile time. Here is my sample code:

#include <stddef.h>
#include <stdint.h>

typedef uint32_t hash_t;

#define hash_cstr(s) ({           \
      typeof(sizeof(s)) i = 0;    \
      hash_t h = 5381;            \
      for (; i < sizeof(s) - 1; ) \
        h = h * 33 + s[i++];      \
      h;                          \
    })

/* tests */
#include <stdio.h>

int main() {
#define test(s) printf("The djb2 hash of " #s " is a %u\n", hash_cstr(#s))

  test(POST);
  test(/path/to/file);
  test(Content-Length);
}

Now I run GCC to produce the assembly listing:

arm-none-eabi-gcc-4.8 -S -O2 -funroll-loops -o hash_test.S hash_test.c

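The result is as expected: all the strings are eliminated and replaced by their hashes. But I normally compile code for embedded applications with -Os. When I try that, only strings of fewer than four characters are hashed at compile time. I also tried setting the max-unroll-times parameter and using GCC 4.9: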

arm-none-eabi-gcc-4.9 -S -Os -funroll-loops \
  --param max-unroll-times=128 -o hash_test.S hash_test.c

I can't understand the reason for this behavior, or how I can raise this four-character limit.

Answer 1

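I suggest putting the relevant code in a separate file and compiling that file with -O2 (rather than -Os). Alternatively, add a function-specific pragma such as

 #pragma GCC optimize ("-O2")

before the function, or use a function attribute such as __attribute__((optimize("O2"))) (the pure attribute may also be relevant).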
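As a rough sketch of the attribute variant: because the macro's loop is expanded inside its caller, the attribute has to sit on the function that uses the macro. The function name method_post_hash is hypothetical, the macro is a lightly adapted copy of the one in the question, and whether the loop is actually folded still depends on the GCC version and the string length.

```c
#include <assert.h>  /* only for the sanity check mentioned below */
#include <stddef.h>
#include <stdint.h>

typedef uint32_t hash_t;

/* djb2 over a string literal or char array, written as a GCC statement
   expression. sizeof(s) - 1 drops the terminating NUL. */
#define hash_cstr(s) ({                                 \
      hash_t h_ = 5381;                                 \
      for (size_t i_ = 0; i_ < sizeof(s) - 1; i_++)     \
        h_ = h_ * 33 + (unsigned char)(s)[i_];          \
      h_;                                               \
    })

/* Hypothetical caller: the attribute asks GCC to optimize this one
   function at -O2, whatever -O level the rest of the file uses. */
__attribute__((optimize("O2")))
hash_t method_post_hash(void) {
  return hash_cstr("POST");
}
```

A quick sanity check is assert(method_post_hash() == 2089437419u). The question also passed -funroll-loops; the attribute accepts further strings, so optimize("O2", "unroll-loops") would be the assumed equivalent. Note that GCC's documentation cautions that the optimize attribute is intended mainly for debugging, so the separate-file approach may be the safer choice.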

You may be interested in __builtin_constant_p.

I would make your hashing code a static inline function (perhaps with the always_inline function attribute), e.g.

 static inline hash_t hashfun(const char*s) {
    hash_t h = 5381;
    for (const char* p = s; *p; p++) 
      h = h * 33 + *p;
    return h;
 }
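A minimal sketch that combines the two hints above, assuming GCC or a compatible compiler: always_inline forces the call to be inlined, and __builtin_constant_p can then report whether the compiler managed to reduce the hash to a constant. The macro name hash_is_folded is hypothetical, and its result is expected to vary with the optimization level.

```c
#include <assert.h>  /* only for the sanity check mentioned below */
#include <stdint.h>

typedef uint32_t hash_t;

/* always_inline asks GCC to inline this even where it otherwise would
   not. The cast keeps the result independent of whether plain char is
   signed on the target. */
static inline __attribute__((always_inline))
hash_t hashfun(const char *s) {
  hash_t h = 5381;
  for (const char *p = s; *p; p++)
    h = h * 33 + (unsigned char)*p;
  return h;
}

/* Hypothetical helper: evaluates to 1 when the compiler can prove that
   the hash of s is a compile-time constant, and to 0 otherwise. */
#define hash_is_folded(s) __builtin_constant_p(hashfun(s))
```

For example, assert(hashfun("POST") == 2089437419u) should hold at any optimization level, while hash_is_folded("POST") is a way to see from inside the program whether a given set of flags folded the loop.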

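An easier (and less fragile) alternative is to change your build process so that it generates a C file (e.g. with a simple awk or python script, or even a dedicated C program) containing things like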

  const char str1[]="POST";
  hash_t hash1=2089437419; // the hash code of str1

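Don't forget that .c or .h files can be generated by something else (you just need to add some rules to your Makefile to generate them); if your boss is uneasy about that, show him the metaprogramming wikipage.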
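The "dedicated C program" option could look roughly like the sketch below, which runs on the build host. The function names djb2 and format_entry are hypothetical, and the sketch simply refuses strings that would need escaping inside a C string literal.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t hash_t;

/* Plain runtime djb2; this runs in the generator, not on the target. */
static hash_t djb2(const char *s) {
  hash_t h = 5381;
  for (; *s; s++)
    h = h * 33 + (unsigned char)*s;
  return h;
}

/* Formats the two generated lines for entry number idx into buf.
   Returns the snprintf result, or -1 if s contains a quote, backslash
   or newline (this sketch does no escaping). */
static int format_entry(char *buf, size_t len, int idx, const char *s) {
  assert(buf != NULL && s != NULL);
  if (strpbrk(s, "\"\\\n") != NULL)
    return -1;
  return snprintf(buf, len,
                  "const char str%d[]=\"%s\";\n"
                  "hash_t hash%d=%" PRIu32 "; // the hash code of str%d\n",
                  idx, s, idx, djb2(s), idx);
}
```

A small main that calls format_entry for each command-line argument and writes the buffer with fputs would complete the generator; a Makefile rule can then redirect its output into the generated .c file.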

Answer 2

If C++ is allowed, you could try a template function, something like:

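template<int I>
inline hash_t hash_rec(const char* str, hash_t h) {
  // hash the first character, then recurse on the remaining I-1
  return hash_rec<I-1>(str + 1, h * 33 + str[0]);
}

// An explicit specialization ends the recursion; a plain `if (I > 0)`
// inside the template would still instantiate hash_rec<-1>, <-2>, ...
template<>
inline hash_t hash_rec<0>(const char*, hash_t h) {
  return h;
}

// sizeof(str) - 1 excludes the terminating NUL, so str must be a
// string literal or a char array, not a pointer
#define hash(str) hash_rec<sizeof(str) - 1>(str, 5381)

h = hash(str);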
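For plain C, a loosely analogous trick (not part of this answer) is to unroll the recursion with the preprocessor, so that the hash becomes a single expression with no loop for the optimizer to unroll. The sketch below assumes strings of at most 16 characters; all macro names are hypothetical, and djb2_runtime exists only to cross-check the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t hash_t;

/* One djb2 step, applied only while index i is inside the string.
   h appears exactly once, so nesting keeps the expansion linear. */
#define HASH_STEP(h, s, i)                                     \
  ((hash_t)((h) * ((i) < sizeof(s) - 1 ? 33u : 1u) +           \
            ((i) < sizeof(s) - 1 ? (unsigned char)(s)[i] : 0u)))

/* Doubling helpers: HASH_n applies n steps starting at index i. */
#define HASH_1(h, s, i)  HASH_STEP(h, s, i)
#define HASH_2(h, s, i)  HASH_1(HASH_1(h, s, i), s, (i) + 1)
#define HASH_4(h, s, i)  HASH_2(HASH_2(h, s, i), s, (i) + 2)
#define HASH_8(h, s, i)  HASH_4(HASH_4(h, s, i), s, (i) + 4)
#define HASH_16(h, s, i) HASH_8(HASH_8(h, s, i), s, (i) + 8)

/* Characters beyond the first 16 are silently ignored. */
#define HASH_CSTR16(s) HASH_16((hash_t)5381u, s, 0)

/* Runtime reference, used only to cross-check the macro. */
static hash_t djb2_runtime(const char *s) {
  assert(s != NULL);
  hash_t h = 5381;
  for (; *s; s++)
    h = h * 33 + (unsigned char)*s;
  return h;
}
```

HASH_CSTR16("POST") should give the same value as djb2_runtime("POST"). Since every operand is a constant, compilers will typically fold the expression even at -Os, although the C standard does not guarantee that indexing a string literal is a constant expression.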

How to force the GCC compiler to compute constants at compile time with -Os
