Question

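I have K objects (K is small, e.g. 2 or 5), and I need to iterate over them N times in random order, where N may be large. The iteration has to happen in a foreach loop, so I need to provide an iterator.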

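So far I have created a std::vector holding my K objects, replicated as needed so that the vector has size N, and I use the begin() and end() that the vector provides. I randomize the vector with std::shuffle(), which takes 20% of my running time. I think it would be better (and more elegant anyway) to write a custom iterator that returns one of my objects in random order without creating the helper vector of size N. But how can this be done?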

Answer 1

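Obviously your iterator must:

Store a pointer to the original vector or array: m_pSource
Store the number of requested items (to be able to stop): m_nOutputCount
Use a random number generator (see the <random> header): m_generator
Treat some iterator state as the end iterator: m_nOutputCount == 0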

I made an example for type int:

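#include <cstddef>   // std::ptrdiff_t
#include <iostream>
#include <iterator>  // std::input_iterator_tag
#include <random>
#include <vector>

// std::iterator is deprecated since C++17, so the member types are declared directly.
// operator* returns by value, so this models an input iterator.
class RandomIterator
{
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = int;
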
    //Creates "end" iterator
    RandomIterator() : m_pSource(nullptr), m_nOutputCount(0), m_nCurValue(0) {}

    //Creates random "start" iterator
    RandomIterator(const std::vector<int> &source, int nOutputCount) :
        m_pSource(&source), m_nOutputCount(nOutputCount + 1), 
        m_distribution(0, source.size() - 1)
    {
        operator++(); //make new random value
    }

    int operator* () const
    {
        return m_nCurValue;
    }

    RandomIterator& operator++() // pre-increment returns a reference, not a copy
    {
        if (m_nOutputCount == 0)
            return *this;
        --m_nOutputCount;

        static std::default_random_engine generator;
        static bool bWasGeneratorInitialized = false;
        if (!bWasGeneratorInitialized)
        {
            std::random_device rd; //expensive calls
            generator.seed(rd());
            bWasGeneratorInitialized = true;
        }       

        m_nCurValue = m_pSource->at(m_distribution(generator));
        return *this;
    }

    RandomIterator operator++(int)
    {   //postincrement
        RandomIterator tmp = *this;
        ++*this;
        return tmp;
    }

    bool operator== (const RandomIterator& other) const
    {
        if (other.m_nOutputCount == 0)
            return m_nOutputCount == 0; //"end" iterator
        // same source and same number of values still to produce
        return m_pSource == other.m_pSource && m_nOutputCount == other.m_nOutputCount;
    }

    bool operator!= (const RandomIterator& other) const
    {
        return !(*this == other);
    }
private:
    const std::vector<int> *m_pSource; 
    int m_nOutputCount;
    int m_nCurValue;

    std::uniform_int_distribution<std::vector<int>::size_type> m_distribution;
};

int main()
{
    std::vector<int> arrTest{ 1, 2, 3, 4, 5 };

    std::cout << "Original =";
    for (auto it = arrTest.cbegin(); it != arrTest.cend(); ++it)
        std::cout << " " << *it;
    std::cout << std::endl;

    RandomIterator rndEnd;

    std::cout << "Random =";
    for (RandomIterator it(arrTest, 15); it != rndEnd; ++it)
        std::cout << " " << *it;

    std::cout << std::endl;
}

The output looks like this (the second line varies from run to run):

Original = 1 2 3 4 5
Random = 1 4 1 3 2 4 5 4 2 3 4 3 1 3 4

You can easily turn this into a template and make it accept any random-access iterator.
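As a rough illustration of that generalization, here is a minimal sketch rather than a definitive implementation. The names RandomPicks and random_picks are hypothetical and do not come from the answer. It assumes a non-empty range of random-access iterators, takes an explicit seed so that runs can be reproduced, and wraps the iterator in a small range type so it can be used directly in a range-based for loop. Only the operations that such a loop needs are provided.

```cpp
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical range type: yields `count` uniformly random picks (with repetition)
// from [first, last). Assumes the source range is non-empty and outlives the picks.
template <typename RandomIt, typename Engine = std::mt19937>
class RandomPicks
{
public:
    class iterator
    {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = typename std::iterator_traits<RandomIt>::value_type;
        using difference_type = std::ptrdiff_t;
        using pointer = const value_type*;
        using reference = const value_type&;

        iterator(const RandomPicks* owner, std::size_t remaining)
            : m_owner(owner), m_remaining(remaining)
        {
            if (m_remaining > 0)
                pick();
        }

        reference operator*() const { return *m_current; }

        iterator& operator++()
        {
            // Draw a new element only while picks are still left.
            if (m_remaining > 0 && --m_remaining > 0)
                pick();
            return *this;
        }

        // Iterators compare equal when the same number of picks is left,
        // so every exhausted iterator equals end().
        bool operator==(const iterator& other) const { return m_remaining == other.m_remaining; }
        bool operator!=(const iterator& other) const { return !(*this == other); }

    private:
        void pick() { m_current = m_owner->m_first + m_owner->m_dist(m_owner->m_engine); }

        const RandomPicks* m_owner;
        std::size_t m_remaining;
        RandomIt m_current{};
    };

    RandomPicks(RandomIt first, RandomIt last, std::size_t count,
                unsigned seed = std::random_device{}())
        : m_first(first), m_count(count), m_engine(seed), m_dist(0, (last - first) - 1)
    {
    }

    iterator begin() const { return iterator(this, m_count); }
    iterator end() const { return iterator(this, 0); }

private:
    RandomIt m_first;
    std::size_t m_count;
    mutable Engine m_engine;  // mutable: drawing a number changes the engine state
    mutable std::uniform_int_distribution<std::ptrdiff_t> m_dist;
};

// Hypothetical helper that deduces the iterator type.
template <typename RandomIt>
RandomPicks<RandomIt> random_picks(RandomIt first, RandomIt last, std::size_t count,
                                   unsigned seed = std::random_device{}())
{
    return RandomPicks<RandomIt>(first, last, count, seed);
}
```

With this in place, a loop such as `for (int x : random_picks(v.begin(), v.end(), 15)) { ... }` visits 15 random elements of v without building a vector of size 15. The engine lives in the range object rather than in a function-local static, which keeps independent ranges from sharing hidden state.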

Answer 2

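I just want to add to Dmitriy's answer: reading your question, it seems you want the items not to repeat within each pass over your freshly created and shuffled collection, whereas Dmitriy's answer does produce repeats. So both iterators are useful.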

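#include <algorithm>  // std::shuffle
#include <cstddef>    // std::size_t, std::ptrdiff_t
#include <iterator>   // std::forward_iterator_tag
#include <numeric>    // std::iota
#include <vector>

template <typename T>
struct RandomIterator
{
    // Member types declared directly; inheriting from std::iterator is deprecated since C++17.
    using iterator_category = std::forward_iterator_tag;
    using value_type = typename T::value_type;
    using difference_type = std::ptrdiff_t;
    using pointer = const value_type*;
    using reference = const value_type&;

    // Creates the "end" iterator
    RandomIterator() : Data(nullptr), Position(0)
    {
    }

    // Creates a "begin" iterator that visits every element of source once, in shuffled order
    template <typename G>
    RandomIterator(const T &source, G& g) : Data(&source), Order(source.size()), Position(0)
    {
        std::iota(Order.begin(), Order.end(), std::size_t(0));
        std::shuffle(Order.begin(), Order.end(), g);
    }

    reference operator* () const noexcept
    {
        return (*Data)[Order[Position]];
    }

    RandomIterator<T>& operator++() noexcept
    {
        ++Position;
        return *this;
    }

    bool operator== (const RandomIterator<T>& other) const noexcept
    {
        if (Data == nullptr && other.Data == nullptr)
        {
            return true;
        }
        // A shuffled iterator equals the "end" iterator once every index has been visited.
        return (Position == Order.size()) && (other.Data == nullptr);
    }

    bool operator!= (const RandomIterator<T>& other) const noexcept
    {
        return !(*this == other);
    }
private:
    const T *Data;
    std::vector<std::size_t> Order; // shuffled indices into *Data
    std::size_t Position;           // index into Order; unlike a stored vector iterator,
                                    // it stays valid when this iterator is copied
};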

// Not noexcept: the constructor allocates the Order vector, which can throw.
template <typename T, typename G>
RandomIterator<T> random_begin(const T& v, G& g)
{
    return RandomIterator<T>(v, g);
}

template <typename T>
RandomIterator<T> random_end(const T& v) noexcept
{
    return RandomIterator<T>();
}

The complete code is at http://coliru.stacked-crooked.com/a/df6ce482bbcbafcf or https://github.com/xunilrj/sandbox/blob/master/sources/random_iterator/source/random_iterator.cpp

Implementing custom iterators can be very tricky, so I tried to follow some tutorials, but please let me know if I missed something:

http://web.stanford.edu/class/cs107l/handouts/04-Custom-Iterators.pdf
https://codereview.stackexchange.com/questions/74609/custom-iterator-for-a-linked-list-class
Operator overloading

I think the performance is satisfactory. On Coliru:

<size>:<time for 10 iterations>
1:0.000126582
10:3.5179e-05
100:0.000185914
1000:0.00160409
10000:0.0161338
100000:0.180089
1000000:2.28161

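Of course, the price is allocating a whole vector for the order, which has the same size as the original vector. If for some reason you have to iterate randomly very often, an improvement would be to pre-allocate the Order vector and either let the iterator use that pre-allocated vector or give the iterator some form of reset().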
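The pre-allocation idea could look roughly like the following sketch. It is an assumption about how such a design might be laid out, not the code from the links above. The name ShuffledView is hypothetical, and the container is assumed to support size() and operator[] and to outlive the view. The index buffer is allocated once in the constructor, and reset() only reshuffles it in place.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: owns the index buffer so that it is allocated once
// and merely reshuffled before each new pass.
template <typename Container, typename Engine = std::mt19937>
class ShuffledView
{
public:
    class iterator
    {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = typename Container::value_type;
        using difference_type = std::ptrdiff_t;
        using pointer = const value_type*;
        using reference = const value_type&;

        iterator(const ShuffledView* view, std::size_t pos) : m_view(view), m_pos(pos) {}

        reference operator*() const { return (*m_view->m_data)[m_view->m_order[m_pos]]; }
        iterator& operator++() { ++m_pos; return *this; }
        bool operator==(const iterator& o) const { return m_pos == o.m_pos; }
        bool operator!=(const iterator& o) const { return m_pos != o.m_pos; }

    private:
        const ShuffledView* m_view;
        std::size_t m_pos;  // position in the shuffled index buffer
    };

    explicit ShuffledView(const Container& data, unsigned seed = std::random_device{}())
        : m_data(&data), m_order(data.size()), m_engine(seed)
    {
        std::iota(m_order.begin(), m_order.end(), std::size_t(0));
        reset();
    }

    // Reshuffles the existing index buffer in place; nothing is allocated here.
    void reset() { std::shuffle(m_order.begin(), m_order.end(), m_engine); }

    iterator begin() const { return iterator(this, 0); }
    iterator end() const { return iterator(this, m_order.size()); }

private:
    const Container* m_data;
    std::vector<std::size_t> m_order;
    Engine m_engine;
};
```

A caller would construct one ShuffledView, loop over it with a range-based for, and call reset() before the next pass. Iterators obtained before a reset() keep their position but see the new order, so a pass should not straddle a reset().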
