有效地查找通配符条目

时间:2017-07-25 14:18:14

标签: c++ c++11

我有一个包含字符串作为键的地图;那些字符串类似于通配符。

密钥最后可以有*,这意味着在执行查找时,以此密钥作为前缀的字符串应与此密钥匹配。

如何有效地检索此类地图中最接近的匹配条目?

我尝试以自定义方式对地图条目进行排序,然后使用lower_bound,但该排序不会产生正确的结果:

#include <map>
#include <string>
#include <iostream>
#include <algorithm>

struct Compare {
    bool operator()(const std::string& lhs, const std::string& rhs) const
    {
        if (lhs.size() < rhs.size()) {
            return true;
        }

        if (lhs.size() > rhs.size()) {
            return false;
        }

        bool isWildcardlhsAtEnd = (!lhs.empty() && lhs.back() == '*');
        bool isWildcardrhsAtEnd = (!rhs.empty() && rhs.back() == '*');

        if (isWildcardlhsAtEnd && isWildcardrhsAtEnd) {
            return lhs < rhs;
        }
        auto lhSubString = lhs.substr(0, lhs.size() - 1);
        auto rhsSubString = rhs.substr(0, rhs.size() - 1);

        if (isWildcardlhsAtEnd || isWildcardrhsAtEnd) {
            if (lhSubString == rhsSubString) {
                return !isWildcardlhsAtEnd;
            }
            else {
                return lhSubString < rhsSubString;
            }
        }

        return lhs < rhs;
    }
};

template <typename Map>
void lookup(const Map& map, const std::string& key, int expected)
{
    auto it = map.lower_bound(key);
    if (it != map.end()) {
        std::cout << "found " << it->first << " for " << key << "; ";
        std::cout << "expected: " << expected << " got: " << it->second << std::endl;
    }
    else {
        std::cout << "did not find a match for " << key << std::endl;
    }
}

int main()
{
    std::map<std::string, int, Compare> map = {
        { "bar", 1 },
        { "bar*", 2 },
        { "foo1", 3 },
        { "bar1", 4 },
        { "bar1*", 5 },
        { "foo1*", 6 },
        { "bar12", 7 },
        { "bar12*", 8 },
        { "foo12", 9 },
        { "bar123", 10 },
        { "b*", 11 },
        { "f*", 12 },
        { "b", 13 },
        { "f", 14 }
    };

    std::cout << "sorted map \n------" << std::endl;
    std::for_each(map.begin(), map.end(), [](const auto& e) { std::cout << e.first << std::endl; });
    std::cout << "-------" << std::endl;

    lookup(map, "foo1", 3);
    lookup(map, "foo123", 6);
    lookup(map, "foo", 12);
    lookup(map, "bar1234", 8);
}

这将生成以下输出,该输出演示了错误的查找:

sorted map 
------
b
f
b*
f*
bar
bar1
bar*
foo1
bar12
bar1*
foo12
foo1*
bar123
bar12*
-------
found foo1 for foo1; expected: 3 got: 3
did not find a match for foo123
found bar1 for foo; expected: 12 got: 4
did not find a match for bar1234

live example

如果有必要,我也愿意使用其他数据结构。

1 个答案:

答案 0 :(得分:0)

如果您将精确搜索和通配符搜索分开,则自然顺序可以正常使用字符串。这段代码似乎产生了预期的结果(我认为),并且效率很高。当然,单独的地图可以更方便地包裹。

#include <map>
#include <string>
#include <iostream>
#include <algorithm>
template <typename Map>
void lookup(const Map& exact ,const Map& wilds, const std::string& key, int expected)
{
    auto it = exact.find(key);

    if (it == exact.end()) {    // if not exact match   
        it = wilds.lower_bound(key);    // do best match 
        it--;
     }

        std::cout << "found " << it->first << " for " << key << "; ";
        std::cout << "expected: " << expected << " got: " << it->second << std::endl;
}

int main()
{
    std::map<std::string, int> wilds = {
        { "bar*", 2 },
        { "bar1*", 5 },
        { "foo1*", 6 },
        { "bar12*", 8 },
        { "b*", 11 },
        { "f*", 12 }
    };
    std::map<std::string, int> exact = {
        { "bar", 1 },
        { "foo1", 3 },
        { "bar1", 4 },
        { "bar12", 7 },
        { "foo12", 9 },
        { "bar123", 10 },
        { "b", 13 },
        { "f", 14 }
    };
    lookup(exact , wilds, "foo1", 3);
    lookup(exact , wilds,"foo123", 6);
    lookup(exact , wilds,"foo", 12);
    lookup(exact , wilds,"bar1234", 8);
}