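How to find the indices of matching elements of sorted containers?

Time: 2019-07-03 17:38:23

Tags: c++

I am trying to get the indices, within one container, of the elements that match those of another. Both containers are sorted in ascending order. Is there an algorithm, or a combination of algorithms, that writes the indices of the matching elements of sorted containers into another container?

I have already coded an algorithm, but I wonder whether this has been done before in the STL in some way I haven't thought of?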

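I would like the algorithm to have a running complexity comparable to the one I propose, which I believe is linear: at most m + n steps, since every iteration advances at least one of the two iterators.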

#include <iterator>
#include <iostream>

template <typename It, typename Index_it>
void get_indices(It selected_it, It selected_it_end, It subitems_it, It subitems_it_end, Index_it indices_it)
{
    auto reference_it = selected_it;
    while (selected_it != selected_it_end && subitems_it != subitems_it_end) {
        if (*selected_it == *subitems_it) {
            *indices_it++ = std::distance(reference_it, selected_it);
            ++selected_it;
            ++subitems_it;
        }
        else if (*selected_it < *subitems_it) {
            ++selected_it;
        }
        else {
            ++subitems_it;
        }
    }
}

int main()
{
    int items[] = { 1, 3, 6, 8, 13, 17 };
    int subitems[] = { 3, 6, 17 };
    int indices[std::size(subitems)] = {0}; // std::size requires C++17
    get_indices(std::begin(items), std::end(items)
        , std::begin(subitems), std::end(subitems)
        , std::begin(indices));
    for (auto i : indices) {
        std::cout << i << ", ";
    }
    return 0;
}

5 answers:

Answer 0 (score: 0)

We can use find_if to simplify the implementation of the function:

template<class SourceIt, class SelectIt, class IndexIt>
void get_indicies(SourceIt begin, SourceIt end, SelectIt sbegin, SelectIt send, IndexIt dest) {
    auto scan = begin; 

    for(; sbegin != send; ++sbegin) {
        auto&& key = *sbegin; 
        scan = std::find_if(scan, end, [&](auto&& obj) { return obj >= key; }); 
        if(scan == end) break;
        for(; scan != end && *scan == key; ++scan) {
            *dest = std::distance(begin, scan); 
            ++dest; 
        }
    }
}

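This doesn't make it much shorter, but the code now reads a bit more cleanly. You keep scanning until you find something greater than or equal to the key, and then, for as long as the source matches the key, you copy the indices to the destination.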
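Because the source range is sorted, the linear find_if scan could also be swapped for std::lower_bound. The following is only a sketch of that variant, not part of the answer above: the names get_indices_lower_bound and indices_of are made up for illustration, and it assumes the source range is sorted in ascending order and that the iterators are at least forward iterators (the binary search only pays off with random-access iterators).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical variant of the function above: std::lower_bound replaces std::find_if.
template <class SourceIt, class SelectIt, class IndexIt>
void get_indices_lower_bound(SourceIt begin, SourceIt end,
                             SelectIt sbegin, SelectIt send, IndexIt dest)
{
    // Precondition (assumption): the source range is sorted in ascending order.
    assert(std::is_sorted(begin, end));

    SourceIt scan = begin;
    for (; sbegin != send && scan != end; ++sbegin) {
        // First element in [scan, end) that is not less than the current key.
        scan = std::lower_bound(scan, end, *sbegin);
        // Emit the index of every source element equal to the key.
        for (; scan != end && *scan == *sbegin; ++scan) {
            *dest = std::distance(begin, scan);
            ++dest;
        }
    }
}

// Convenience wrapper for vectors (also hypothetical).
std::vector<std::ptrdiff_t> indices_of(const std::vector<int>& items,
                                       const std::vector<int>& subitems)
{
    std::vector<std::ptrdiff_t> indices;
    get_indices_lower_bound(items.begin(), items.end(),
                            subitems.begin(), subitems.end(),
                            std::back_inserter(indices));
    return indices;
}
```

With random-access iterators each key then costs a logarithmic number of comparisons, which may help when the selection is much smaller than the source; when both ranges have similar sizes, the plain linear scan is likely just as good.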

Answer 1 (score: 0)

Maybe I misunderstood the question, but the algorithm library has a function for this: std::set_intersection.

It does what you want in a single call. See:

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>

int main()
{
    // Input values
    std::vector<int> items{ 1,3,6,8,13,17 };
    std::vector<int> subitems{ 3,6,17 };

    // Result
    std::vector<int> result;

    // Do the work. One-liner
    std::set_intersection(items.begin(), items.end(), subitems.begin(), subitems.end(), std::back_inserter(result));

    // Debug output: show result
    std::copy(result.begin(), result.end(), std::ostream_iterator<int>(std::cout, " "));
    return 0;
}

If I have misunderstood, please let me know and I will look for another solution.

EDIT:

I did indeed misunderstand: you want the indices. Then maybe something like this?

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
using Iter = std::vector<int>::iterator;

int main()
{
    // Input values
    std::vector<int> items{ 1,3,6,8,13,17 };
    std::vector<int> subitems{ 3,6,17 };

    // Result
    std::vector<int> indices{};
    Iter it;

    // Do the work.
    std::for_each(subitems.begin(), subitems.end(), [&](int i) {it = std::find(items.begin(), items.end(), i); if (it != items.end()) indices.push_back(std::distance(items.begin(),it));});

    // Debug output: show result
    std::copy(indices.begin(), indices.end(), std::ostream_iterator<int>(std::cout, " "));
    return 0;
}

Unfortunately, that is a very long "one-liner".

I need to think about this some more.
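One possible way to stay with std::set_intersection and still end up with indices is to intersect a range of positions with the sub-items, comparing positions by the value they refer to. This is a sketch under assumptions rather than anything from the answer above: Index, ByItemValue and matching_indices are hypothetical names, and the approach relies on std::set_intersection copying the matching elements from the first range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical wrapper so that a position is a distinct type from a value.
struct Index { std::size_t pos; };

// Hypothetical comparator: orders positions and plain values by item value.
struct ByItemValue {
    const std::vector<int>* items;
    int value(Index i) const { return (*items)[i.pos]; }
    int value(int v) const { return v; }
    template <class A, class B>
    bool operator()(const A& a, const B& b) const { return value(a) < value(b); }
};

std::vector<std::size_t> matching_indices(const std::vector<int>& items,
                                          const std::vector<int>& subitems)
{
    // Assumption: both inputs are sorted in ascending order.
    assert(std::is_sorted(items.begin(), items.end()));
    assert(std::is_sorted(subitems.begin(), subitems.end()));

    // One Index per element of items: 0, 1, 2, ...
    std::vector<Index> positions(items.size());
    for (std::size_t i = 0; i < positions.size(); ++i) positions[i].pos = i;

    // Matching elements are copied from the first range, i.e. the positions.
    std::vector<Index> matched;
    std::set_intersection(positions.begin(), positions.end(),
                          subitems.begin(), subitems.end(),
                          std::back_inserter(matched), ByItemValue{&items});

    std::vector<std::size_t> indices;
    for (Index m : matched) indices.push_back(m.pos);
    return indices;
}
```

The price is an extra vector of positions as large as items, so the hand-written loop from the question is probably still the leaner choice.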

Answer 2 (score: 0)

This is possible with std::set_intersection by defining an assignment_iterator class and an assignment helper:

#include <iterator>
#include <iostream>
#include <algorithm>
#include <vector>
#include <new>      // placement new in the VC++ workaround
#include <utility>  // std::forward

template <typename Transform>
class assignment_iterator
{
    Transform transform;

public:
    using iterator_category = std::output_iterator_tag;
    using value_type        = void;
    using difference_type   = void;
    using pointer           = void;
    using reference         = void;

    assignment_iterator(Transform transform)
        : transform(transform)
    {}

    // For some reason VC++ is assigning the iterator inside of std::copy().
    // Not needed for other compilers.
    #ifdef _MSC_VER
    assignment_iterator& operator=(assignment_iterator const& copy)
    {
        transform.~Transform();
        new (&transform) Transform(copy.transform);
        return *this;
    }
    #endif

    template <typename T>
    constexpr assignment_iterator& operator=(T& value) {
        transform(value);
        return *this;
    }

    constexpr assignment_iterator& operator* (   ) { return *this; }
    constexpr assignment_iterator& operator++(   ) { return *this; }
    constexpr assignment_iterator& operator++(int) { return *this; }
};

template <typename Transform>
assignment_iterator<Transform> assignment(Transform&& transform)
{
    return { std::forward<Transform>(transform) };
}

int main()
{
    int items[] = { 1, 3, 6, 8, 13, 17 };
    int subitems[] = { 3, 6, 17 };
    std::vector<int> indices;
    std::set_intersection(std::begin(items), std::end(items)
        , std::begin(subitems), std::end(subitems)
        , assignment([&items, &indices](int& item) {
            return indices.push_back(&item - &*std::begin(items));
        })
    );

    std::copy(indices.begin(), indices.end()
        , assignment([&indices](int& index) {
            std::cout << index;
            if (&index != &std::end(indices)[-1])
              std::cout <<  ", ";
        })
    );
    return 0;
}

Demo

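It is more code, but perhaps assignment is a more generic way of doing other things that currently require a specific implementation such as back_inserter or ostream_iterator, and so it may mean less code in the long run (for example, the other use above with std::copy)?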

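According to the documentation here, this approach should always work:

  Elements will be copied from the first range to the destination range.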
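The guarantee quoted above can be illustrated with a small experiment. The sketch below uses hypothetical names (Tagged, ByKey, intersection_sources): elements carry a tag naming the range they came from, the comparator looks at the key only, and the tags of the output show which range the elements were copied from.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical element type: a key plus a tag naming its range of origin.
struct Tagged { int key; char source; };

// Compare by key only, so elements from both ranges can be equivalent.
struct ByKey {
    bool operator()(const Tagged& a, const Tagged& b) const { return a.key < b.key; }
};

std::string intersection_sources()
{
    std::vector<Tagged> first  = { {1, 'A'}, {3, 'A'}, {6, 'A'} };
    std::vector<Tagged> second = { {3, 'B'}, {6, 'B'}, {7, 'B'} };
    assert(std::is_sorted(first.begin(), first.end(), ByKey{}));
    assert(std::is_sorted(second.begin(), second.end(), ByKey{}));

    std::vector<Tagged> out;
    std::set_intersection(first.begin(), first.end(),
                          second.begin(), second.end(),
                          std::back_inserter(out), ByKey{});

    // Collect the tags of the elements that ended up in the output.
    std::string sources;
    for (const Tagged& t : out) sources += t.source;
    return sources; // "AA": both matches were taken from the first range
}
```

Note that the pointer subtraction in the answer additionally assumes that the first range is stored contiguously, as it is for a built-in array or a std::vector.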

Answer 3 (score: 0)

The answer is yes, but it will come with C++20.

You can use ranges for this purpose.

First make a view with the predicate of your choice:

auto result = items | ranges::view::filter(predicate);

Then take the iterator's base back to the original array: for example, result.begin().base() gives you an iterator into the original array at the position where result begins.

See godbolt.

Answer 4 (score: -1)

You can use std::find and std::distance to find the index of a match and then put it into a container.

#include <vector>
#include <algorithm>
#include <iterator>

int main ()
{
   std::vector<int> v = {1,2,3,4,5,6,7};
   std::vector<int> matchIndexes;
   std::vector<int>::iterator match = std::find(v.begin(), v.end(), 5);
   // Only record an index if the value was actually found.
   if (match != v.end()) {
      int index = static_cast<int>(std::distance(v.begin(), match));
      matchIndexes.push_back(index);
   }

   return 0;
}

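To match several elements, std::search can be used in a similar way, as long as the elements to match form a contiguous run in the container.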
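As a rough illustration of the std::search idea, the sketch below returns the index at which a contiguous run of values starts. The helper name find_run and the -1 "not found" convention are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the first occurrence of needle as a
// contiguous run inside haystack, or -1 if there is no such run.
std::ptrdiff_t find_run(const std::vector<int>& haystack,
                        const std::vector<int>& needle)
{
    assert(!needle.empty()); // assumption: searching for nothing is a caller error
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end());
    if (it == haystack.end()) return -1;
    return std::distance(haystack.begin(), it);
}
```

For v = {1,2,3,4,5,6,7}, searching for {3,4,5} yields index 2, whereas {3,5} is not found because those values are not adjacent in v. For scattered matches such as the question's {3, 6, 17}, a per-element search or the two-pointer loop is still needed.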