Question

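The strings average 4 characters in length. I think a binary search starting at position 4 might be fastest. I also think an inlined, templated function might perform well. This is done in a very tight loop, so performance is critical.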

The data looks like this:

"1234    "
"ABC     "
"A1235   "
"A1235kgo"

Answer 1

char* found = std::find(arr, arr+9, ' ');

Note that the end iterator signals "no match":

bool match = (arr+9) != found;
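
As a minimal sketch of how the two lines above fit together, they could be wrapped in a small helper. The name first_space_index is hypothetical (it does not appear in the question), and the sketch assumes the buffer is always a char[9] as described.

```cpp
#include <algorithm>

// Hypothetical helper: index of the first ' ' in a 9-byte buffer, or -1 if none.
inline int first_space_index(const char (&arr)[9]) {
    const char* end = arr + 9;
    const char* found = std::find(arr, end, ' ');
    // The end iterator means "no match".
    return found != end ? static_cast<int>(found - arr) : -1;
}
```

With the sample data, "1234    " would give 4 and "A1235kgo" would give -1.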

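Note:

Binary search does not apply unless your characters are in some known ordering.
std::find is inlined and templated, and will perform to the max if optimization is enabled (e.g. -O3 -march=native for g++).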

Edit: since you have shown more code, I now realize you actually want to detect the (sub)string length. You could use

std::string::find_first_of
std::string::find_last_of
std::string::find
std::string::rfind, etc.

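Of course, that assumes you are willing to convert the char[] to a std::string. In practice this may be a perfectly valid idea, because SSO (small string optimization) is found in nearly all implementations of the C++ standard library. (See Items 13-16 in Herb Sutter's More Exceptional C++, or Scott Meyers' discussion of commercial std::string implementations in Effective STL.)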
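
A rough sketch of the std::string route, for illustration only. The helper name payload_length is hypothetical, and the comment about SSO is an assumption about common standard library implementations rather than a guarantee.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length of the non-space prefix of a 9-byte field.
inline std::size_t payload_length(const char (&arr)[9]) {
    // Copy the 8 payload bytes; a string this short is expected to fit in the
    // SSO buffer on common implementations, so no heap allocation is assumed.
    const std::string s(arr, 8);
    const std::size_t pos = s.find(' ');   // npos when there is no padding
    return pos == std::string::npos ? s.size() : pos;
}
```

Whether the copy is cheap enough for a tight loop would need to be measured against the plain std::find version.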

Answer 2

You can indeed use binary search to find the first space character (using std::lower_bound(...) in this case):

const char *data= ...;// 8 character string to search

const char *end= std::lower_bound(data, data + 8, ' ', [](char lhs, char rhs)
{
    bool lhs_is_space= lhs==' ';
    bool rhs_is_space= rhs==' ';

    return lhs_is_space < rhs_is_space;
});

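This effectively uses binary search to find the first space character. The basic idea is to treat non-space characters as false and space characters as true, and to assume that all non-space characters come before the space characters. If that holds, the sequence is sorted with respect to this classification, and we can simply find the start (the lower bound) of the run of space characters.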
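
The same "false before true" idea can also be written with std::partition_point, which is a different standard algorithm from the std::lower_bound call in this answer but performs the same kind of binary search. This is only a sketch: the name first_space_by_partition is hypothetical, and it assumes the data really is right-padded (no space ever precedes a non-space character).

```cpp
#include <algorithm>

// Hypothetical helper: binary search for the start of the trailing space run.
// Assumes every non-space character precedes every space (right-padded data).
inline int first_space_by_partition(const char (&data)[9]) {
    const char* first = data;
    const char* last = data + 8;   // search only the 8 payload characters
    const char* it = std::partition_point(first, last,
                                          [](char c) { return c != ' '; });
    return it != last ? static_cast<int>(it - first) : -1;
}
```

If the padding assumption is violated, the result is unspecified, so this is not a drop-in replacement for a linear scan on arbitrary input.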

Answer 3

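Since the spaces are all at the end, you can use an unrolled binary search. However, a regular linear search is surprisingly close in speed, and won't make future developers hate you.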

inline int find_space(char (&data)[9]) {
    if (data[3] == ' ') {
        if (data[1] == ' ') {
            if (data[0] == ' ')
                return 0;
            return 1;
        } else if (data[2] == ' ')
            return 2;
        return 3; 
    }
    if (data[5] == ' ') {
        if (data[4] == ' ')
            return 4;
        return 5;
    } else if (data[7] == ' ') {
        if (data[6] == ' ')
            return 6;
        return 7;
    } else if (data[8] == ' ')
        return 8;
    return -1;
}
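
For comparison, the "regular linear search" mentioned above might look like the following sketch. The name find_space_linear is hypothetical, and only the 8 payload characters are scanned.

```cpp
// Plain linear scan over the 8 payload characters.
// Returns the index of the first space, or -1 if there is none.
inline int find_space_linear(const char (&data)[9]) {
    for (int i = 0; i < 8; ++i) {
        if (data[i] == ' ')
            return i;
    }
    return -1;
}
```

Unlike the unrolled version, this does not rely on the spaces being contiguous at the end, which makes it a convenient reference when checking the faster variants.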

Answer 4

Let the compiler and the optimizer do their job.

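// 'inline' must follow the template header, and the array must be taken
// by reference so that N can be deduced.
template <typename T_CHAR, int N>
inline T_CHAR* find_first_of(T_CHAR (&a)[N], T_CHAR t)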
{
    for (int ii = 0; ii < N; ++ii)
    {
        if (a[ii] == t) { return a+ii; }
    }
    return NULL;
}

Or let the standard library authors do all the heavy lifting for you.

// Requires #include <algorithm> for std::find.
template <typename T_CHAR, int N>
inline T_CHAR* find_first_of(T_CHAR (&a)[N], T_CHAR t)
{
    T_CHAR* ii = std::find(a, a+N, t);
    if (ii == a+N) return NULL;
    return ii;
}
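
One detail worth spelling out: a function parameter declared as an array of N elements decays to a pointer, so the compiler cannot deduce N from it. A sketch that takes the array by reference instead is shown below. The name find_in_array is hypothetical, and the element type and the searched value must deduce to the same type, so a non-const char buffer is assumed.

```cpp
#include <algorithm>

// Sketch: taking the array by reference lets the compiler deduce N.
// Returns a pointer to the first element equal to t, or nullptr if none.
template <typename T, int N>
inline T* find_in_array(T (&a)[N], T t) {
    T* last = a + N;
    T* it = std::find(a, last, t);
    return it == last ? nullptr : it;
}
```

For a buffer declared as char buf[9] = "ABC     ", the call find_in_array(buf, ' ') would return buf + 3.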

Answer 5

Just my 2 cents. I assume all strings have length 8, and that the possible characters are 'A'-'Z', 'a'-'z', '0'-'9' and space. I tried:

//simple
const char *found = std::find(x.data, x.data + 9, ' ');
//binary search
const char *end = std::lower_bound(x.data, x.data + 8, ' ', [](char lhs, char rhs) {

and my optimized version (compiler-dependent: gcc only), shown below. I tested on 64-bit Linux with -O3 -march=native -std=c++0x. Results for 50000000 randomly generated strings:

simple took 0.480000,
optimized took 0.120000,
binary search took 0.600000.

union FixedLenStr {
     unsigned char chars[8];
     uint32_t words[2];
     uint64_t  big_word;
};

static int space_finder(const char *str) 
{   
        FixedLenStr tmp;

        memcpy(tmp.chars, str, 8);

        tmp.big_word &= 0xF0F0F0F0F0F0F0F0ull;
        tmp.big_word >>= 4;
        tmp.big_word = (0x0707070707070707ull - tmp.big_word) * 26;
        tmp.big_word &= 0x8080808080808080ull;      

        return (__builtin_ffsll(tmp.big_word) >> 3) - 1;    
}
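
To make the arithmetic easier to follow, here is a sketch of the same byte-parallel idea without the union. Within the allowed alphabet, the high nibble of each byte is 2 for a space and 3 to 7 for every other character, so (7 - nibble) * 26 reaches 130 (top bit set) only for a space and stays at 104 or below otherwise. The name space_finder_swar is hypothetical, and the sketch assumes a little-endian target, a GCC or Clang compiler (for __builtin_ctzll), and input restricted to the characters listed above.

```cpp
#include <cstdint>
#include <cstring>

// Sketch only. Assumes little-endian byte order, GCC/Clang builtins, and
// input limited to 'A'-'Z', 'a'-'z', '0'-'9' and ' '.
inline int space_finder_swar(const char* str) {
    std::uint64_t w;
    std::memcpy(&w, str, 8);                 // load the 8 payload chars into one word
    w = (w & 0xF0F0F0F0F0F0F0F0ull) >> 4;    // per byte: high nibble (2 for ' ', 3..7 otherwise)
    w = (0x0707070707070707ull - w) * 26;    // per byte: (7 - nibble) * 26, at most 130, so no carries
    w &= 0x8080808080808080ull;              // top bit survives only where the byte was ' '
    if (w == 0)
        return -1;                           // no space in the 8 characters
    return __builtin_ctzll(w) >> 3;          // lowest set bit -> index of the first space
}
```

Characters outside the assumed alphabet can produce false positives or negatives, so this trick is only appropriate when the input is known to be that restricted.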

Fastest way in C++ to find the first space in a right-padded, null-terminated char array of fixed size 9

5 answers: