In search of the holy grail of search and replace in C++

Time: 2015-12-21 18:55:30

Tags: c++ algorithm boost mfc boost-spirit-qi

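Recently I was looking for a way to replace tokens in a string, which is essentially find and replace (although there is at least one other way to approach the problem), and it looks like a quite banal task. I came up with a couple of possible implementations, but none of them was satisfying from a performance point of view. The best achievement was about 50us per iteration. The case was ideal: the string never grew in size, and initially I omitted the requirement to be case insensitive.
Here is the code at Coliru
Results from my machine:
Boost.Spirit symbols result: 3421?=3421
100000 cycles took 6060ms.
Boyer-Moore results: 3421?=3421
100000 cycles took 5959ms.
Boyer Moore Hospool result: 3421?=3421
100000 cycles took 5008ms.
Knuth Morris Pratt result: 3421?=3421
100000 cycles took 12451ms.
Naive STL search and replace result: 3421?=3421
100000 cycles took 5532ms.
Boost replace_all result: 3421?=3421
100000 cycles took 4860ms.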
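
For readers who want to reproduce figures like "about 50us per iteration", here is a minimal timing sketch. It is not the harness used for the numbers above; `average_microseconds` and `per_cycle_us` are hypothetical helper names introduced only for illustration, and a real benchmark should also make sure the compiler cannot optimize the measured call away.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not from the original post): runs `fn` `cycles` times
// and returns the average wall-clock time per call in microseconds.
template <typename Fn>
double average_microseconds(Fn&& fn, std::size_t cycles)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < cycles; ++i)
        fn();
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(cycles);
}

// Converts a "N cycles took T ms" line into microseconds per cycle.
constexpr double per_cycle_us(double total_ms, double cycles)
{
    return total_ms * 1000.0 / cycles;
}
```

Applied to the Boyer Moore Hospool line above, `per_cycle_us(5008, 100000)` gives roughly 50us, which is where the "about 50us per iteration" figure comes from.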

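So the question is: what takes so long in such a simple task? One can say, OK, simple task, go ahead and implement it better. But the reality is that a 15-year-old naive MFC implementation does the task an order of magnitude faster: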

CString FillTokenParams(const CString& input, const std::unordered_map<std::string, std::string>& tokens)
{
    CString tmpInput = input;
    for(const auto& token : tokens)
    {
        int pos = 0;
        while(pos != -1)
        {
            pos = tmpInput.Find(token.first.c_str(), pos);
            if(pos != -1)
            {
                int tokenLength = token.first.size();
                tmpInput.Delete(pos, tokenLength);
                tmpInput.Insert(pos, token.second.c_str());
                pos += 1;
            }
        }
    }

    return tmpInput;
}

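Result:
MFC naive search and replace result: 3421?=3421
100000 cycles took 516ms.
Why does this clumsy code outperform modern C++? Why are the other implementations so slow? Am I missing something fundamental?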

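EDIT001: I've invested in this question; the code was profiled and triple-checked. You may be unsatisfied with this or that, but std::string::replace is not what takes the time. In every STL implementation the search is what takes most of the time, and Boost.Spirit wastes time allocating the tst (the test nodes in the evaluation tree, I guess). I don't expect anyone to point at a line in a function and say "this is your problem" and, poof, the problem is gone. The question is how MFC manages to do the same job 10 times faster.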

EDIT002: I just drilled down into the MFC implementation of Find and wrote a function that mimics it:

namespace mfc
{
std::string::size_type Find(const std::string& input, const std::string& subString, std::string::size_type start)
{
    if(subString.empty())
    {
        return std::string::npos;
    }

    if(start > input.size())  // size_type is unsigned, so a "start < 0" test can never be true
    {
        return std::string::npos;
    }

    auto found = strstr(input.c_str() + start, subString.c_str());
    return ((found == nullptr) ? std::string::npos : std::string::size_type(found - input.c_str()));
}
}

std::string MFCMimicking(const std::string& input, const std::unordered_map<std::string, std::string>& tokens)
{
    auto tmpInput = input;
    for(const auto& token : tokens)
    {
        std::string::size_type pos = 0;  // "auto pos = 0" would deduce int and compare it against npos
        while(pos != std::string::npos)
        {
            pos = mfc::Find(tmpInput, token.first, pos);
            if(pos != std::string::npos)
            {
                auto tokenLength = token.first.size();
                tmpInput.replace(pos, tokenLength, token.second.c_str());
                pos += 1;
            }
        }
    }

    return tmpInput;
}

Result:
MFC mimicking expand result: 3421?=3421
100000 cycles took 411ms.
That means about 4us per call (411ms / 100000 cycles). Go beat the C strstr.

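EDIT003: Compiled and ran with -Ox

MFC mimicking expand result: 3421?=3421
100000 cycles took 660ms.
MFC naive search and replace result: 3421?=3421
100000 cycles took 856ms.
Manual expand result: 3421?=3421
100000 cycles took 1995ms.
Boyer-Moore results: 3421?=3421
100000 cycles took 6911ms.
Boyer Moore Hospool result: 3421?=3421
100000 cycles took 5670ms.
Knuth Morris Pratt result: 3421?=3421
100000 cycles took 13825ms.
Naive STL search and replace result: 3421?=3421
100000 cycles took 9531ms.
Boost replace_all result: 3421?=3421
100000 cycles took 8996ms.

Running with -O2 (as in the original measurements) but with 10k cycles

MFC mimicking expand result: 3421?=3421
10000 cycles took 104ms.
MFC naive search and replace result: 3421?=3421
10000 cycles took 105ms.
Manual expand result: 3421?=3421
10000 cycles took 356ms.
Boyer-Moore results: 3421?=3421
10000 cycles took 1355ms.
Boyer Moore Hospool result: 3421?=3421
10000 cycles took 1101ms.
Knuth Morris Pratt result: 3421?=3421
10000 cycles took 1973ms.
Naive STL search and replace result: 3421?=3421
10000 cycles took 923ms.
Boost replace_all result: 3421?=3421
10000 cycles took 880ms.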

6 answers:

Answer 0 (score: 5)

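So, I have some observations about the Qi version.

I also created an X3 version.

Finally, I wrote a manual expansion function that beats all the other candidates (I expect it to be faster than MFC, because it doesn't bother with repeated deletes/inserts).

Skip to the benchmark charts if you want.

About the Qi version

  1. Yes, the symbol tables suffer from the locality issues of node-based containers. They might not be the best match you can use here.
  2. There's no need to rebuild the symbols on every loop iteration.
  3. Instead of skipping non-symbols character by character, scan until the next symbol: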

    +(bsq::char_ - symbols)
    
    inline std::string spirit_qi(const std::string& input, bsq::symbols<char, std::string> const& symbols)
    {
        std::string retVal;
        retVal.reserve(input.size() * 2);
    
        auto beg = input.cbegin();
        auto end = input.cend();
    
        if(!bsq::parse(beg, end, *(symbols | +(bsq::char_ - symbols)), retVal))
            retVal = input;
    
        return retVal;
    }
    

This is already much faster. But:

Manual loop

In this trivial example, why don't you just do the parsing manually?

    inline std::string manual_expand(const std::string& input, TokenMap const& tokens)
    {
        std::ostringstream builder;
        auto expand = [&](auto const& key) {
            auto match = tokens.find(key);
            if (match == tokens.end())
                builder << "$(" << key << ")";
            else
                builder << match->second;
        };
    
        // builder.str() returns a copy, so reserving on it cannot pre-size the
        // stream buffer; a freshly constructed ostringstream is already empty.
        std::ostreambuf_iterator<char> out(builder);
    
        for(auto f(input.begin()), l(input.end()); f != l;) {
            switch(*f) {
                case '$' : {
                        if (++f==l || *f!='(') {
                            *out++ = '$';
                            break;
                        }
                        else {
                            auto s = ++f;
                            size_t n = 0;
    
                            while (f!=l && *f != ')')
                                ++f, ++n;
    
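                            // key is [s,f) now
                            expand(std::string(s, f)); // iterator pair: never dereferences s when s == l
    
                            if (f!=l)
                                ++f; // skip ')'
                            break; // token handled; do not fall through into default
                        }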
                    }
                default:
                    *out++ = *f++;
            }
        }
        return builder.str();
    }
    

This is vastly superior in performance on my machine.
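
As a variation on the same idea, here is a minimal single-pass sketch that appends straight into a pre-reserved std::string instead of going through a stream. It is not the code that was benchmarked; `expand_tokens` is a hypothetical name, and it assumes the map keys are the bare names (without the `$(` and `)` delimiters) and uses std::unordered_map rather than the flat_map of the benchmark.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical variant of the manual loop: expands $(key) tokens in one pass,
// appending into a std::string whose capacity is reserved up front.
inline std::string expand_tokens(const std::string& input,
                                 const std::unordered_map<std::string, std::string>& tokens)
{
    std::string out;
    out.reserve(input.size() * 2);

    std::string::size_type pos = 0;
    while (pos < input.size()) {
        // Copy everything up to the next "$(" verbatim.
        const auto open = input.find("$(", pos);
        if (open == std::string::npos) {
            out.append(input, pos, std::string::npos);
            break;
        }
        out.append(input, pos, open - pos);

        const auto close = input.find(')', open + 2);
        if (close == std::string::npos) {
            // Unterminated token: keep the remainder unchanged.
            out.append(input, open, std::string::npos);
            break;
        }

        const auto match = tokens.find(input.substr(open + 2, close - open - 2));
        if (match != tokens.end())
            out += match->second;                       // known key: substitute
        else
            out.append(input, open, close - open + 1);  // unknown key: keep "$(key)"
        pos = close + 1;
    }
    return out;
}
```

Every input character is visited once and nothing is ever shifted inside the output, which is the property that separates this family of solutions from the repeated delete/insert approaches.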

Other ideas

You could look at Boost Spirit Lex, potentially with statically generated token tables: http://www.boost.org/doc/libs/1_60_0/libs/spirit/doc/html/spirit/lex/abstracts/lexer_static_model.html. I'm not particularly fond of Lex though.

COMPARISONS:

See the Interactive Chart

Benchmark statistics were collected with Nonius.

Full benchmark code: http://paste.ubuntu.com/14133072/

    #include <boost/container/flat_map.hpp>
    
    #define USE_X3
    #ifdef USE_X3
    #   include <boost/spirit/home/x3.hpp>
    #else
    #   include <boost/spirit/include/qi.hpp>
    #endif
    
    #include <boost/algorithm/string.hpp>
    #include <boost/algorithm/searching/boyer_moore.hpp>
    #include <boost/algorithm/searching/boyer_moore_horspool.hpp>
    #include <boost/algorithm/searching/knuth_morris_pratt.hpp>
    #include <algorithm>
    #include <cassert>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <sstream>
    #include <string>
    #include <unordered_map>
    #include <nonius/benchmark.h++>
    #include <nonius/main.h++>
    
    using TokenMap = boost::container::flat_map<std::string, std::string>;
    
    #ifdef USE_X3
        namespace x3  = boost::spirit::x3;
    
        struct append {
            std::string& out;
            void do_append(char const ch) const                       { out += ch;                      } 
            void do_append(std::string const& s)  const               { out += s;                       } 
            template<typename It>
            void do_append(boost::iterator_range<It> const& r)  const { out.append(r.begin(), r.end()); } 
            template<typename Ctx>
            void operator()(Ctx& ctx) const                           { do_append(_attr(ctx));          } 
        };
    
        inline std::string spirit_x3(const std::string& input, x3::symbols<char const*> const& symbols)
        {
            std::string retVal;
            retVal.reserve(input.size() * 2);
            append appender { retVal };
    
            auto beg = input.cbegin();
            auto end = input.cend();
    
            auto rule = *(symbols[appender] | x3::char_ [appender]);
    
            if(!x3::parse(beg, end, rule))
                retVal = input;
    
            return retVal;
        }
    #else
        namespace bsq = boost::spirit::qi;
    
        inline std::string spirit_qi_old(const std::string& input, TokenMap const& tokens)
        {
            std::string retVal;
            retVal.reserve(input.size() * 2);
            bsq::symbols<char const, char const*> symbols;
            for(const auto& token : tokens) {
                symbols.add(token.first.c_str(), token.second.c_str());
            }
    
            auto beg = input.cbegin();
            auto end = input.cend();
    
            if(!bsq::parse(beg, end, *(symbols | bsq::char_), retVal))
                retVal = input;
    
            return retVal;
        }
    
        inline std::string spirit_qi(const std::string& input, bsq::symbols<char, std::string> const& symbols)
        {
            std::string retVal;
            retVal.reserve(input.size() * 2);
    
            auto beg = input.cbegin();
            auto end = input.cend();
    
            if(!bsq::parse(beg, end, *(symbols | +(bsq::char_ - symbols)), retVal))
                retVal = input;
    
            return retVal;
        }
    #endif
    
    inline std::string manual_expand(const std::string& input, TokenMap const& tokens) {
        std::ostringstream builder;
        auto expand = [&](auto const& key) {
            auto match = tokens.find(key);
    
            if (match == tokens.end())
                builder << "$(" << key << ")";
            else
                builder << match->second;
        };
    
        // builder.str() returns a copy, so reserving on it cannot pre-size the stream buffer.
        std::ostreambuf_iterator<char> out(builder);
    
        for(auto f(input.begin()), l(input.end()); f != l;) {
            switch(*f) {
                case '$' : {
                        if (++f==l || *f!='(') {
                            *out++ = '$';
                            break;
                        }
                        else {
                            auto s = ++f;
                            size_t n = 0;
    
                            while (f!=l && *f != ')')
                                ++f, ++n;
    
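                            // key is [s,f) now
                            expand(std::string(s, f)); // iterator pair: never dereferences s when s == l
    
                            if (f!=l)
                                ++f; // skip ')'
                            break; // token handled; do not fall through into default
                        }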
                    }
                default:
                    *out++ = *f++;
            }
        }
        return builder.str();
    }
    
    inline std::string boost_replace_all(const std::string& input, TokenMap const& tokens)
    {
        std::string retVal(input);
        retVal.reserve(input.size() * 2);
    
        for(const auto& token : tokens)
        {
            boost::replace_all(retVal, token.first, token.second);
        }
        return retVal;
    }
    
    inline void naive_stl(std::string& input, TokenMap const& tokens)
    {
        input.reserve(input.size() * 2);
        for(const auto& token : tokens)
        {
            auto next = std::search(input.cbegin(), input.cend(), token.first.begin(), token.first.end());
            while(next != input.cend())
            {
                input.replace(next, next + token.first.size(), token.second);
                next = std::search(input.cbegin(), input.cend(), token.first.begin(), token.first.end());
            }
        }
    }
    
    inline void boyer_more(std::string& input, TokenMap const& tokens)
    {
        input.reserve(input.size() * 2);
        for(const auto& token : tokens)
        {
            auto next =
                boost::algorithm::boyer_moore_search(input.cbegin(), input.cend(), token.first.begin(), token.first.end());
            while(next != input.cend())
            {
                input.replace(next, next + token.first.size(), token.second);
                next = boost::algorithm::boyer_moore_search(input.cbegin(), input.cend(), token.first.begin(),
                                                            token.first.end());
            }
        }
    }
    
    inline void bmh_search(std::string& input, TokenMap const& tokens)
    {
        input.reserve(input.size() * 2);
        for(const auto& token : tokens)
        {
            auto next = boost::algorithm::boyer_moore_horspool_search(input.cbegin(), input.cend(), token.first.begin(),
                                                                      token.first.end());
            while(next != input.cend())
            {
                input.replace(next, next + token.first.size(), token.second);
                // Use the same algorithm as the initial search (was boyer_moore_search).
                next = boost::algorithm::boyer_moore_horspool_search(input.cbegin(), input.cend(), token.first.begin(),
                                                                     token.first.end());
            }
        }
    }
    
    inline void kmp_search(std::string& input, TokenMap const& tokens)
    {
        input.reserve(input.size() * 2);
        for(const auto& token : tokens)
        {
            auto next = boost::algorithm::knuth_morris_pratt_search(input.cbegin(), input.cend(), token.first.begin(),
                                                                    token.first.end());
            while(next != input.cend())
            {
                input.replace(next, next + token.first.size(), token.second);
                // Use the same algorithm as the initial search (was boyer_moore_search).
                next = boost::algorithm::knuth_morris_pratt_search(input.cbegin(), input.cend(), token.first.begin(),
                                                                   token.first.end());
            }
        }
    }
    
    namespace testdata {
        std::string const expected =
            "Five and Seven said nothing, but looked at Two. Two began in a low voice, 'Why the fact is, you see, Miss, "
            "this here ought to have been a red rose-tree, and we put a white one in by mistake; and if the Queen was to "
            "find it out, we should all have our heads cut off, you know. So you see, Miss, we're doing our best, afore "
            "she comes, to—' At this moment Five, who had been anxiously looking across the garden, called out 'The Queen! "
            "The Queen!' and the three gardeners instantly threw themselves flat upon their faces. There was a sound of "
            "many footsteps, and Alice looked round, eager to see the Queen.First came ten soldiers carrying clubs; these "
            "were all shaped like the three gardeners, oblong and flat, with their hands and feet at the corners: next the "
            "ten courtiers; these were ornamented all over with diamonds, and walked two and two, as the soldiers did. "
            "After these came the royal children; there were ten of them, and the little dears came jumping merrily along "
            "hand in hand, in couples: they were all ornamented with hearts. Next came the guests, mostly Kings and "
            "Queens, and among them Alice recognised the White Rabbit: it was talking in a hurried nervous manner, smiling "
            "at everything that was said, and went by without noticing her. Then followed the Knave of Hearts, carrying "
            "the King's crown on a crimson velvet cushion; and, last of all this grand procession, came THE KING AND QUEEN "
            "OF HEARTS.Alice was rather doubtful whether she ought not to lie down on her face like the three gardeners, "
            "but she could not remember ever having heard of such a rule at processions; 'and besides, what would be the "
            "use of a procession,' thought she, 'if people had all to lie down upon their faces, so that they couldn't see "
            "it?' So she stood still where she was, and waited.When the procession came opposite to Alice, they all "
            "stopped and looked at her, and the Queen said severely 'Who is this?' She said it to the Knave of Hearts, who "
            "only bowed and smiled in reply.'Idiot!' said the Queen, tossing her head impatiently; and, turning to Alice, "
            "she went on, 'What's your name, child?''My name is Alice, so please your Majesty,' said Alice very politely; "
            "but she added, to herself, 'Why, they're only a pack of cards, after all. I needn't be afraid of them!''And "
            "who are these?' said the Queen, pointing to the three gardeners who were lying round the rosetree; for, you "
            "see, as they were lying on their faces, and the pattern on their backs was the same as the rest of the pack, "
            "she could not tell whether they were gardeners, or soldiers, or courtiers, or three of her own children.'How "
            "should I know?' said Alice, surprised at her own courage. 'It's no business of mine.'The Queen turned crimson "
            "with fury, and, after glaring at her for a moment like a wild beast, screamed 'Off with her head! "
            "Off—''Nonsense!' said Alice, very loudly and decidedly, and the Queen was silent.The King laid his hand upon "
            "her arm, and timidly said 'Consider, my dear: she is only a child!'The Queen turned angrily away from him, "
            "and said to the Knave 'Turn them over!'The Knave did so, very carefully, with one foot.'Get up!' said the "
            "Queen, in a shrill, loud voice, and the three gardeners instantly jumped up, and began bowing to the King, "
            "the Queen, the royal children, and everybody else.'Leave off that!' screamed the Queen. 'You make me giddy.' "
            "And then, turning to the rose-tree, she went on, 'What have you been doing here?'";
        std::string const inputWithtokens =
            "Five and Seven said nothing, but looked at $(Two). $(Two) began in a low voice, 'Why the fact is, you see, "
            "Miss, "
            "this here ought to have been a red rose-tree, and we put a white one in by mistake; and if the Queen was to "
            "find it out, we should all have our $(heads) cut off, you know. So you see, Miss, we're doing our best, afore "
            "she comes, to—' At this moment Five, who had been anxiously looking across the garden, called out 'The Queen! "
            "The Queen!' and the three gardeners instantly threw themselves flat upon their faces. There was a sound of "
            "many footsteps, and Alice looked round, eager to see the $(Queen).First came ten soldiers carrying clubs; "
            "these "
            "were all shaped like the three gardeners, oblong and flat, with their hands and feet at the corners: next the "
            "ten courtiers; these were ornamented all over with $(diamonds), and walked two and two, as the soldiers did. "
            "After these came the royal children; there were ten of them, and the little dears came jumping merrily along "
            "hand in hand, in couples: they were all ornamented with hearts. Next came the guests, mostly Kings and "
            "Queens, and among them Alice recognised the White Rabbit: it was talking in a hurried nervous manner, smiling "
            "at everything that was said, and went by without noticing her. Then followed the Knave of Hearts, carrying "
            "the King's crown on a crimson velvet cushion; and, last of all this grand procession, came THE KING AND QUEEN "
            "OF HEARTS.Alice was rather doubtful whether she ought not to lie down on her face like the three gardeners, "
            "but she could not remember ever having heard of such a rule at processions; 'and besides, what would be the "
            "use of a procession,' thought she, 'if people had all to lie down upon their faces, so that they couldn't see "
            "it?' So she stood still where she was, and waited.When the procession came opposite to Alice, they all "
            "stopped and looked at her, and the $(Queen) said severely 'Who is this?' She said it to the Knave of Hearts, "
            "who "
            "only bowed and smiled in reply.'Idiot!' said the Queen, tossing her head impatiently; and, turning to Alice, "
            "she went on, 'What's your name, child?''My name is Alice, so please your Majesty,' said Alice very politely; "
            "but she added, to herself, 'Why, they're only a pack of cards, after all. I needn't be afraid of them!''And "
            "who are these?' said the $(Queen), pointing to the three gardeners who were lying round the rosetree; for, "
            "you "
            "see, as they were lying on their faces, and the $(pattern) on their backs was the same as the rest of the "
            "pack, "
            "she could not tell whether they were gardeners, or soldiers, or courtiers, or three of her own children.'How "
            "should I know?' said Alice, surprised at her own courage. 'It's no business of mine.'The Queen turned crimson "
            "with fury, and, after glaring at her for a moment like a wild beast, screamed 'Off with her head! "
            "Off—''Nonsense!' said $(Alice), very loudly and decidedly, and the Queen was silent.The $(King) laid his hand "
            "upon "
            "her arm, and timidly said 'Consider, my dear: she is only a child!'The $(Queen) turned angrily away from him, "
            "and said to the $(Knave) 'Turn them over!'The $(Knave) did so, very carefully, with one foot.'Get up!' said "
            "the "
            "Queen, in a shrill, loud voice, and the three gardeners instantly jumped up, and began bowing to the King, "
            "the Queen, the royal children, and everybody else.'Leave off that!' screamed the Queen. 'You make me giddy.' "
            "And then, turning to the rose-tree, she went on, 'What have you been doing here?'";
    
        static TokenMap const raw_tokens {
            {"Two", "Two"},           {"heads", "heads"},
            {"diamonds", "diamonds"}, {"Queen", "Queen"},
            {"pattern", "pattern"},   {"Alice", "Alice"},
            {"King", "King"},         {"Knave", "Knave"},
            {"Why", "Why"},           {"glaring", "glaring"},
            {"name", "name"},         {"know", "know"},
            {"Idiot", "Idiot"},       {"children", "children"},
            {"Nonsense", "Nonsense"}, {"procession", "procession"},
        };
    
        static TokenMap const tokens {
            {"$(Two)", "Two"},           {"$(heads)", "heads"},
            {"$(diamonds)", "diamonds"}, {"$(Queen)", "Queen"},
            {"$(pattern)", "pattern"},   {"$(Alice)", "Alice"},
            {"$(King)", "King"},         {"$(Knave)", "Knave"},
            {"$(Why)", "Why"},           {"$(glaring)", "glaring"},
            {"$(name)", "name"},         {"$(know)", "know"},
            {"$(Idiot)", "Idiot"},       {"$(children)", "children"},
            {"$(Nonsense)", "Nonsense"}, {"$(procession)", "procession"},
        };
    
    }
    
    NONIUS_BENCHMARK("manual_expand", [](nonius::chronometer cm)     {
        std::string const tmp = testdata::inputWithtokens;
        auto& tokens = testdata::raw_tokens;
    
        std::string result;
        cm.measure([&](int) {
            result = manual_expand(tmp, tokens);
        });
        assert(result == testdata::expected);
    })
    
    #ifdef USE_X3
    NONIUS_BENCHMARK("spirit_x3", [](nonius::chronometer cm) {
        auto const symbols = [&] {
            x3::symbols<char const*> symbols;
            for(const auto& token : testdata::tokens) {
                symbols.add(token.first.c_str(), token.second.c_str());
            }
            return symbols;
        }();
    
        std::string result;
        cm.measure([&](int) {
                result = spirit_x3(testdata::inputWithtokens, symbols);
            });
        //std::cout << "====\n" << result << "\n====\n";
        assert(testdata::expected == result);
    })
    #else
    NONIUS_BENCHMARK("spirit_qi", [](nonius::chronometer cm) {
        auto const symbols = [&] {
            bsq::symbols<char, std::string> symbols;
            for(const auto& token : testdata::tokens) {
                symbols.add(token.first.c_str(), token.second.c_str());
            }
            return symbols;
        }();
    
        std::string result;
        cm.measure([&](int) {
                result = spirit_qi(testdata::inputWithtokens, symbols);
            });
        assert(testdata::expected == result);
    })
    
    NONIUS_BENCHMARK("spirit_qi_old", [](nonius::chronometer cm) {
        std::string result;
        cm.measure([&](int) {
                result = spirit_qi_old(testdata::inputWithtokens, testdata::tokens);
            });
        assert(testdata::expected == result);
    })
    #endif
    
    NONIUS_BENCHMARK("boyer_more", [](nonius::chronometer cm) {
        cm.measure([&](int) {
            std::string tmp = testdata::inputWithtokens;
            boyer_more(tmp, testdata::tokens);
            assert(tmp == testdata::expected);
        });
    })
    
    NONIUS_BENCHMARK("bmh_search", [](nonius::chronometer cm) {
        cm.measure([&](int) {
            std::string tmp = testdata::inputWithtokens;
            bmh_search(tmp, testdata::tokens);
            assert(tmp == testdata::expected);
        });
    })
    
    NONIUS_BENCHMARK("kmp_search", [](nonius::chronometer cm) {
        cm.measure([&](int) {
            std::string tmp = testdata::inputWithtokens;
            kmp_search(tmp, testdata::tokens);
            assert(tmp == testdata::expected);
        });
    })
    
    NONIUS_BENCHMARK("naive_stl", [](nonius::chronometer cm) {
        cm.measure([&](int) {
                std::string tmp = testdata::inputWithtokens;
                naive_stl(tmp, testdata::tokens);
                assert(tmp == testdata::expected);
            });
    })
    
    NONIUS_BENCHMARK("boost_replace_all", [](nonius::chronometer cm)     {
        std::string const tmp = testdata::inputWithtokens;
    
        std::string result;
        cm.measure([&](int) {
            result = boost_replace_all(testdata::inputWithtokens, testdata::tokens);
        });
        assert(result == testdata::expected);
    })
    

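Answer 1 (score: 4)

EDIT2, for MFCMimicking: From your code it is obvious why the MFC version is faster: it does not search the entire string on every replacement the way some of your other versions do (I still can't explain boost::spirit). Once it makes a replacement, it resumes searching from the point of replacement rather than from the beginning of the string, so of course it is faster.

EDIT: After doing some more research and looking at (Algorithm to find multiple string matches), it seems that using good single-string matching algorithms to find multiple search terms is the actual problem here. Probably your best bet is to use an appropriate algorithm (a few are mentioned in that question).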
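
To make the "one pass for all search terms" idea concrete, here is a minimal sketch built on a byte-wise trie with longest-match lookup at each position. This is a deliberately simple stand-in and not one of the algorithms from the linked question: it has no failure links as Aho-Corasick does, so its worst case is O(input length x longest token). `TokenTrie` and its members are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical byte-wise trie: all tokens are matched in a single left-to-right
// pass, instead of one full scan of the text per token.
class TokenTrie {
public:
    void add(const std::string& token, std::string replacement)
    {
        if (token.empty()) return;  // an empty token would match everywhere
        Node* node = &root_;
        for (unsigned char c : token) {
            auto& child = node->next[c];
            if (!child) child.reset(new Node);
            node = child.get();
        }
        node->terminal = true;
        node->replacement = std::move(replacement);
    }

    std::string replace_all(const std::string& input) const
    {
        std::string out;
        out.reserve(input.size());
        std::size_t pos = 0;
        while (pos < input.size()) {
            // Walk the trie from pos and remember the longest token that ends here.
            const Node* node = &root_;
            const Node* best = nullptr;
            std::size_t best_len = 0;
            for (std::size_t i = pos; i < input.size(); ++i) {
                node = node->next[static_cast<unsigned char>(input[i])].get();
                if (!node) break;
                if (node->terminal) { best = node; best_len = i - pos + 1; }
            }
            if (best) { out += best->replacement; pos += best_len; }
            else      { out += input[pos++]; }
        }
        return out;
    }

private:
    struct Node {
        std::array<std::unique_ptr<Node>, 256> next;  // one slot per byte value
        bool terminal = false;                        // true if a token ends at this node
        std::string replacement;
    };
    Node root_;
};
```

The 256-slot nodes trade memory for a branch-free child lookup; for large token sets a more compact node layout or a real Aho-Corasick automaton would likely be the better choice.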

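As for why MFC is faster? I'd suggest distilling that down into a different question, "Why are delete and insert so much faster on CString than on std::string" or something like that, and make sure you tag it C++ and MFC so that people with the right expertise can help (I have experience with standard C++, but I can't help with whatever optimizations VC++ applies to CString).

Original answer: OK, because of the huge volume of code I only looked at expandTokens3, but I assume all versions have the same problem. Your code has two potentially significant performance problems:

  • You search the entire string every time you do a replacement. If you have ten variables to replace in a string, it will take on the order of ten times longer than needed.

  • You execute each replacement in place in the input string rather than building up the result string from each piece. This can result in a memory allocation and a copy for each replacement you do, again possibly increasing the runtime significantly.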
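
A minimal sketch of what avoiding both problems could look like for a single token: the search resumes right after the previous match, and the result is built by appending pieces to a fresh string. `replace_copy_all` is a hypothetical name, not something from the question's code.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `input` with every occurrence of
// `from` replaced by `to`. Text that was already scanned is never rescanned,
// and nothing is shifted around inside an existing buffer.
inline std::string replace_copy_all(const std::string& input,
                                    const std::string& from,
                                    const std::string& to)
{
    if (from.empty())
        return input;  // nothing sensible to search for

    std::string out;
    out.reserve(input.size());

    std::string::size_type pos = 0;
    for (;;) {
        const auto hit = input.find(from, pos);
        if (hit == std::string::npos) {
            out.append(input, pos, std::string::npos);  // copy the tail
            break;
        }
        out.append(input, pos, hit - pos);  // copy the text before the match
        out += to;                          // then the replacement
        pos = hit + from.size();            // resume after the match
    }
    return out;
}
```

Because the search continues after the match in the input, a replacement that contains the search term (for example "X" replaced by "XX") terminates normally, whereas a loop that restarts from the beginning of the modified string would never finish.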

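Answer 2 (score: 2)

"So the question is: what takes so long in such a simple task? One can say, OK, simple task, go ahead and implement it better. But the reality is that a 15-year-old naive MFC implementation does the task an order of magnitude faster"

The answer is actually simple.

First, I compiled your code on my MacBook Pro using Apple clang 7.0: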

$ cc --version
Apple LLVM version 7.0.0 (clang-700.1.76)
Target: x86_64-apple-darwin15.2.0
Thread model: posix

The results seem to match the OP's...

Boost.Spirit symbols result: 3425?=3425
10000 cycles took 8906ms.
Boyer-Moore results:3425?=3425
10000 cycles took 2891ms.
Boyer Moore Hospool result:3425?=3425
10000 cycles took 2392ms.
Knuth Morris Pratt result: 3425?=3425
10000 cycles took 4363ms.
Naive STL search and replace result: 3425?=3425
10000 cycles took 4333ms.
Boost replace_all result:3425?=3425
10000 cycles took 23284ms.
MFCMimicking result:3425?=3425
10000 cycles took 426ms.    <-- seemingly outstanding, no?

Then I added the -O3 flag:

Boost.Spirit symbols result: 3425?=3425
10000 cycles took 675ms.
Boyer-Moore results:3425?=3425
10000 cycles took 788ms.
Boyer Moore Hospool result:3425?=3425
10000 cycles took 623ms.
Knuth Morris Pratt result: 3425?=3425
10000 cycles took 1623ms.

Naive STL search and replace result: 3425?=3425
10000 cycles took 562ms.                    <-- pretty good!!!

Boost replace_all result:3425?=3425
10000 cycles took 748ms.
MFCMimicking result:3425?=3425
10000 cycles took 431ms.                    <-- awesome but not as outstanding as it was!

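Now the results are in the same order of magnitude as the MFC CString results.

Why?

Because when you compile against Boost and/or the STL, you are expanding templates, and the library code takes on the same optimisation settings as your compilation unit.

When you link against MFC, you are linking against a shared library that was compiled with optimisations turned on.

When you use strstr, you are calling into the C library, which is precompiled, optimised and, in some parts, hand-written. Of course it's going to be fast!

Solved :)

"10000 cycles, not 100000, different machine..."

For reference, here are the results for the 100,000-cycle version, running on the laptop on battery power, with full optimisation (-O3):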

Boost.Spirit symbols result: 3425?=3425
100000 cycles took 6712ms.
Boyer-Moore results:3425?=3425
100000 cycles took 7923ms.
Boyer Moore Hospool result:3425?=3425
100000 cycles took 6091ms.
Knuth Morris Pratt result: 3425?=3425
100000 cycles took 16330ms.

Naive STL search and replace result: 3425?=3425
100000 cycles took 6719ms.

Boost replace_all result:3425?=3425
100000 cycles took 7353ms.

MFCMimicking result:3425?=3425
100000 cycles took 4076ms.

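Answer 3 (score: 1)

OK, this is going to be a long story. Just to remind you of the questions asked:

  1. Why is search and replace in C++ (with the various approaches) so slow?
  2. Why is the MFC search and replace so fast?

Surprisingly, both questions have the same answer: because of the C++ overhead. Yep. Our shiny modern C++ has an overhead which we mostly dismiss for the sake of flexibility and elegance.

However, when it comes to sub-microsecond resolution (not that C++ is incapable of doing things at nanosecond resolution), the overhead becomes more prominent.

Let me showcase this with the same code I posted in the question, but made more coherent with regard to what is done in each function.

Live On Coliru

It uses the aforementioned Nonius (thanks to @sehe), and the interactive results are here.

Conclusions

There are two outstanding results:

• the MFC mimicking function and
• my own manual replace

These functions are at least an order of magnitude faster than the rest, so what is the difference?

All the "slow" functions are written in C++, while the fast ones are written in C (not pure C; I was too lazy to deal with malloc/realloc of the output buffer when the output grows in size). Well, I guess it is clear that sometimes there is no choice but to resort to pure C. Personally I am against using C, for security reasons and because of the lack of type safety. In addition, it simply requires more expertise and attention to write high-quality C code.

I won't mark this as the answer for a while, waiting for comments on this conclusion.

I want to thank everyone who actively participated in the discussion, raised ideas and pointed out inconsistencies in my example.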

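Answer 4 (score: 1)

Just an update. I ran the original STL code (the one using search) against the MFC-inspired one. With optimization (-O2) the STL-based version gives 228ms, while the MFC-like one gives 285ms. Without optimization I get something like 7284ms versus 310ms. I did this on a 2016 MacBook Pro with an i7-6700HQ CPU @ 2.60GHz. So basically, the code that uses strstr could not be optimized, while the STL code was heavily optimized.

Then I ran the final version of the naiveSTL code, which uses find instead of search, and it gave me 28ms. So it is definitely the winner. I added the code below in case @kreuzerkrieg's link becomes invalid one day.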
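
For orientation only, a find-based version along the lines described above might look like the following sketch. This is an assumption about the general shape, not the code this answer measured; `find_and_replace` is a hypothetical name, and the 28ms figure should not be attributed to it.

```cpp
#include <string>
#include <unordered_map>

// Assumed shape of a find-based in-place replacement (illustrative only).
// For each token, the search resumes right after the text just inserted,
// so no part of the string is scanned twice for the same token.
inline std::string find_and_replace(std::string text,
                                    const std::unordered_map<std::string, std::string>& tokens)
{
    for (const auto& token : tokens) {
        if (token.first.empty())
            continue;  // an empty token would match at every position
        std::string::size_type pos = 0;
        while ((pos = text.find(token.first, pos)) != std::string::npos) {
            text.replace(pos, token.first.size(), token.second);
            pos += token.second.size();  // continue after the replacement
        }
    }
    return text;
}
```

Note that std::string::find searches the character buffer directly, while the std::search calls in the question go through a generic iterator interface; that difference is one plausible reason for the gap this answer reports.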

Answer 5 (score: 0)

Your questionable use of std::string::replace is so pointlessly slow that nothing else in the code matters.