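I'm working on a strchr implementation and trying to optimize my code to take advantage of my machine's 64-bit word size. So I cast my string pointer to a pointer to long, which lets me compare 8 characters at a time.
Currently, I have: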
    int has_char(unsigned long word, unsigned long c)
    {
        if((word & 0xFF00000000000000) == (c << 56)) return (1);
        if((word & 0x00FF000000000000) == (c << 48)) return (1);
        if((word & 0x0000FF0000000000) == (c << 40)) return (1);
        if((word & 0x000000FF00000000) == (c << 32)) return (1);
        if((word & 0x00000000FF000000) == (c << 24)) return (1);
        if((word & 0x0000000000FF0000) == (c << 16)) return (1);
        if((word & 0x000000000000FF00) == (c << 8)) return (1);
        if((word & 0x00000000000000FF) == c) return (1);
        return (0); /* Not found, returning 0 */
    }
    char *strchr(const char *s, int c)
    {
        const char *curr;
        const long *str;
        unsigned long wd;

        str = (long *)s;
        while (1) {
            wd = *str;
            if (has_char(wd, (unsigned long)c)) {
                /* The word contains c somewhere: locate it byte by byte */
                curr = (char *)str;
                while (*curr) {
                    if (*curr == (char)c)
                        return ((char *)curr);
                    curr++;
                }
            }
            if ((wd - 0x0101010101010101)
                & ~wd & 0x8080808080808080) /* End of string and character not found, exit */
                return (NULL);
            str++;
        }
    }
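It works well, but my has_char is very inefficient: it tests the character value 8 times. Is there a way to do a single test (a mask?) that returns 1 if the character is present in the word and 0 if it is not?
Thanks for your help!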
Answer 0 (score: 3)
Very well, here is the exact code as requested:
    #include <stddef.h>
    #include <stdint.h>

    // Return a non-zero mask if any of the bytes are zero/null, as per your original code
    static inline uint64_t any_zeroes(uint64_t value) {
        return (value - 0x0101010101010101) & ~value & 0x8080808080808080;
    }

    char *strchr(const char *s, int ch) {
        // Pre-generate a 64-bit comparison mask with the character at every byte position
        uint64_t mask = (unsigned char) ch * 0x0101010101010101;
        // Access the string 64 bits at a time.
        // Beware of alignment requirements on most platforms.
        const uint64_t *word_ptr = (const uint64_t *) s;
        // Search through the string in 8-byte chunks looking for either any character matches
        // or any null bytes
        uint64_t value;
        do {
            value = *word_ptr++;
            // The exclusive-or value ^ mask will give us 0 in any byte field matching the char
            value = any_zeroes(value) | any_zeroes(value ^ mask);
        } while(!value);
        // Wind-down and locate the final character. This may be done faster by looking at the
        // previously generated zero masks but doing so requires us to know the byte-order
        s = (const char *) --word_ptr;
        do {
            if(*s == (char) ch)
                return (char *) s;
        } while(*s++);
        return NULL;
    }
Beware: this was written off the top of my head.
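The alignment warning in the code comments can be handled by scanning byte by byte until the pointer reaches an 8-byte boundary and only then switching to whole words. The following is a minimal sketch under stated assumptions, not a drop-in replacement: strchr_words is a hypothetical name, memcpy is used for the word load to sidestep aliasing concerns, and the word load may still touch bytes after the terminator (inside the same aligned 8-byte block), which the C standard does not strictly sanction even though it is what word-at-a-time string routines rely on in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Non-zero if any byte of v is zero (same test as any_zeroes above). */
static uint64_t any_zeroes(uint64_t v)
{
    return (v - ONES) & ~v & HIGHS;
}

/* Hypothetical name; behaves like strchr for the cases exercised here. */
static char *strchr_words(const char *s, int ch)
{
    unsigned char c = (unsigned char)ch;

    assert(s != NULL);

    /* Step 1: byte-wise scan until s sits on an 8-byte boundary. */
    while ((uintptr_t)s % sizeof(uint64_t) != 0) {
        if ((unsigned char)*s == c)
            return (char *)s;
        if (*s == '\0')
            return NULL;
        s++;
    }

    /* Step 2: skip whole words that contain neither c nor a terminator. */
    uint64_t mask = c * ONES; /* c replicated into every byte */
    for (;;) {
        uint64_t value;
        memcpy(&value, s, sizeof value); /* aligned 8-byte load */
        if (any_zeroes(value) | any_zeroes(value ^ mask))
            break;
        s += sizeof value;
    }

    /* Step 3: the current word holds c or the terminator; find which. */
    for (;; s++) {
        if ((unsigned char)*s == c)
            return (char *)s;
        if (*s == '\0')
            return NULL;
    }
}
```

Because an aligned 8-byte load never straddles a page boundary, the over-read in step 2 cannot fault on typical hardware, but memory checkers may still report it.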
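Answer 1 (score: 1)
First, build a new variable c8 that holds c in every byte position (with c already widened to unsigned long, as in has_char, so that the shifts are well defined).
    unsigned long c8= (c << 56L) | ( c << 48L ) | ... | ( c << 8 ) | c ;
Do this outside the loop so that you don't recompute it on every iteration.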
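One way to build such a value without writing out every shift is to double the pattern three times. This is only a sketch; broadcast_byte is a hypothetical helper name that does not appear in the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replicate the low byte of c into all 8 bytes. */
static uint64_t broadcast_byte(int c)
{
    uint64_t c8 = (unsigned char)c; /* widen first so the shifts are well defined */
    c8 |= c8 << 8;   /* 2 copies */
    c8 |= c8 << 16;  /* 4 copies */
    c8 |= c8 << 32;  /* 8 copies */
    return c8;
}
```

The cast to unsigned char matters when c is a negative int (for example a char above 0x7F on a platform where char is signed), since sign extension would otherwise fill the upper bytes with ones.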
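Then XOR c8 with word and test each byte of the result for zero. To do this in parallel, there are a few options:
If you want to get ugly, we can start doing some parallel collapsing. First, collapse all the one bits of each byte down into a single bit position: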
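    /* word here holds the XOR result, i.e. word ^ c8 */
    unsigned long ltmp ;
    /* fold each byte's high nibble onto its low nibble */
    ltmp= word | (0x0f0f0f0f0f0f0f0f & ( word >> 4 )) ;
    ltmp &= 0x0f0f0f0f0f0f0f0f ;
    /* fold the 4 remaining bits of each byte down into bit 0 */
    ltmp |= ( ltmp >> 2 ) ;
    ltmp |= ( ltmp >> 1 ) ;
    ltmp &= 0x0101010101010101 ;
    /* a byte whose bit 0 is still clear was all zero, i.e. it matched c */
    return ( ltmp != 0x0101010101010101 ) ;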
Alternatively, as noted in the comments, the following test:
    ((wd - 0x0101010101010101) & ~wd & 0x8080808080808080)
is equivalent to all of the preceding operations.
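To make the equivalence concrete, here is a small sketch that implements both zero-byte tests side by side and builds a single-test has_char on top of the shorter one. The names zero_byte_sub, zero_byte_fold and has_char_fast are hypothetical, and the fold version assumes the high nibble is ORed onto the low nibble before masking.

```c
#include <assert.h>
#include <stdint.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Subtract-and-mask test: 1 if any byte of x is zero, else 0. */
static int zero_byte_sub(uint64_t x)
{
    return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Fold-down test: OR every bit of each byte into that byte's bit 0. */
static int zero_byte_fold(uint64_t x)
{
    uint64_t t = x | (x >> 4);          /* high nibble onto low nibble */
    t &= UINT64_C(0x0f0f0f0f0f0f0f0f);  /* drop bits from neighbouring bytes */
    t |= t >> 2;
    t |= t >> 1;
    t &= ONES;                          /* 1 per non-zero byte, 0 per zero byte */
    return t != ONES;
}

/* Hypothetical replacement for has_char: one XOR plus one test. */
static int has_char_fast(uint64_t word, unsigned char c)
{
    /* Bytes equal to c become zero after the XOR. */
    return zero_byte_sub(word ^ (c * ONES));
}
```

The subtract-and-mask form needs far fewer operations, which is presumably why the question's end-of-string check already uses it.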
BTW, the form if (a) return 1 ; return 0 ; can simply be written as return a ; or return a != 0 ;
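One caveat when a is a 64-bit mask and the function returns int: a plain return a ; converts the mask to int, and on typical platforms only the low 32 bits survive, so a match in one of the upper bytes could come back as 0. The comparison form avoids that. A minimal sketch, with mask_to_flag as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a 64-bit match mask into a 0/1 flag. */
static int mask_to_flag(uint64_t mask)
{
    /* Comparing against zero before the conversion to int keeps
       matches that only set bits above bit 31. */
    return mask != 0;
}
```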