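How can I efficiently copy objects (or a range of objects) from vector A into vector B, where vector B already contains some objects identical to those in vector A, so that no object copied from vector A is already listed in vector B?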
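I store a graph as a vector of edges in std::vector<MinTreeEdge> minTreeInput.
I have a minimum spanning tree created from this graph, stored in std::vector<MinTreeEdge> minTreeOutput.
I am trying to randomly add a certain number of edges back into minTreeOutput. To do this, I want to copy elements from minTreeInput back into minTreeOutput until the latter contains the required number of edges. Of course, each edge object that is copied over must not already be stored in minTreeOutput. There can't be duplicate edges in this graph.
Below is what I have come up with so far. It works, but it is really long, and I know the loop will have to run many times depending on the graph and the tree. I would like to know how to do this properly: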
// Edge class
struct MinTreeEdge
{
// For std::unique() between objects
bool operator==(MinTreeEdge const &rhs) const noexcept
{
return lhs == rhs.lhs;
}
int lhs;
int node1ID;
int node2ID;
int weight;
......
};
......
// The usage
int currentSize = minTreeOutput.size();
int targetSize = currentSize + numberOfEdgesToReturn;
int sizeDistance = targetSize - currentSize;
while(sizeDistance != 0)
{
//Probably really inefficient
for(std::vector<MinTreeEdge>::iterator it = minTreeInput.begin(); it != minTreeInput.begin()+sizeDistance; ++it)
minTreeOutput.push_back(*it);
std::vector<MinTreeEdge>::iterator mto_it;
mto_it = std::unique (minTreeOutput.begin(), minTreeOutput.end());
currentSize = minTreeOutput.size();
sizeDistance = targetSize - currentSize;
}
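Alternatively, is there a way to just list all the edges in minTreeInput (the graph) that are not in minTreeOutput (the tree), without having to check each individual element of the former against the latter?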
Answer 0 (score: 5)
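"How can I efficiently copy objects (or a range of objects) from vector A into vector B, where vector B already contains some objects identical to those in vector A, so that no object copied from vector A is already listed in vector B?"
If I understand the question correctly, it can be paraphrased as "how do I create the set union of two vectors?".
Answer: std::set_union
Note that for this to work, both vectors need to be sorted. This is for efficiency reasons, as you have already touched upon.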
#include <vector>
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility> // std::move
struct MinTreeEdge
{
// For std::unique() between objects
bool operator==(MinTreeEdge const &rhs) const noexcept
{
return lhs == rhs.lhs;
}
int lhs;
int node1ID;
int node2ID;
int weight;
};
struct lower_lhs
{
bool operator()(const MinTreeEdge& l, const MinTreeEdge& r) const noexcept
{
return l.lhs < r.lhs;
}
};
std::vector<MinTreeEdge> merge(std::vector<MinTreeEdge> a,
std::vector<MinTreeEdge> b)
{
// let's pessimistically assume that the inputs are not sorted
// we could simply assert that they are if the caller is aware of
// the requirement
std::sort(a.begin(), a.end(), lower_lhs());
std::sort(b.begin(), b.end(), lower_lhs());
// alternatively...
// assert(std::is_sorted(a.begin(), a.end(), lower_lhs()));
// assert(std::is_sorted(b.begin(), b.end(), lower_lhs()));
// optional step if the inputs are not already `unique`
a.erase(std::unique(a.begin(), a.end()), a.end());
b.erase(std::unique(b.begin(), b.end()), b.end());
std::vector<MinTreeEdge> result;
result.reserve(a.size() + b.size());
std::set_union(a.begin(), a.end(),
b.begin(), b.end(),
std::back_inserter(result),
lower_lhs());
return result;
}
int main()
{
// example use case
auto a = std::vector<MinTreeEdge>{};
auto b = std::vector<MinTreeEdge>{};
b = merge(std::move(a), std::move(b));
}
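To make the behaviour of std::set_union concrete, here is a minimal sketch on a few hand-made edges. Edge, ById and union_ids are hypothetical names introduced only for illustration (Edge is a cut-down stand-in for MinTreeEdge, assuming lhs uniquely identifies an edge); they are not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical, simplified stand-in for MinTreeEdge.
// Assumption: `lhs` is a unique edge id.
struct Edge {
    int lhs;
    int weight;
};

// Orders edges by id, mirroring lower_lhs above.
struct ById {
    bool operator()(const Edge& l, const Edge& r) const noexcept {
        return l.lhs < r.lhs;
    }
};

// Returns the ids found in the union of two id-sorted edge lists.
std::vector<int> union_ids(const std::vector<Edge>& a,
                           const std::vector<Edge>& b)
{
    // std::set_union requires both ranges to be sorted by the comparator.
    assert(std::is_sorted(a.begin(), a.end(), ById()));
    assert(std::is_sorted(b.begin(), b.end(), ById()));

    std::vector<Edge> merged;
    merged.reserve(a.size() + b.size());
    std::set_union(a.begin(), a.end(),
                   b.begin(), b.end(),
                   std::back_inserter(merged),
                   ById());

    std::vector<int> ids;
    ids.reserve(merged.size());
    for (const Edge& e : merged) ids.push_back(e.lhs);
    return ids;
}
```

Calling union_ids with edges whose ids are {1, 2, 4} and {2, 3} should yield {1, 2, 3, 4}. When both ranges contain an equivalent element, std::set_union takes it from the first range, so pass the vector whose copies you want to keep as the first argument.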
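There has been some mention of sets as a way to get this done. It is fair to say that if MinTreeEdge is expensive to copy, then we could expect to see a performance benefit from using an unordered_set. However, if the objects are expensive to copy, then we would probably want to store them in our temporary set by reference.
I would probably do it like this: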
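#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>
// utility class which converts unary and binary operations on
// a reference_wrapper into unary and binary operations on the
// referred-to objects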
template<class unary, class binary>
struct reference_as_object
{
template<class U>
decltype(auto) operator()(const std::reference_wrapper<U>& l) const {
return _unary(l.get());
}
template<class U, class V>
decltype(auto) operator()(const std::reference_wrapper<U>& l,
const std::reference_wrapper<V>& r) const {
return _binary(l.get(), r.get());
}
unary _unary;
binary _binary;
};
// utility to help prevent typos when defining a set of references
template<class K, class H, class C> using unordered_reference_set =
std::unordered_set<
std::reference_wrapper<K>,
reference_as_object<H, C>,
reference_as_object<H, C>
>;
// define unary and binary operations for our set. This way we can
// avoid polluting MinTreeEdge with artificial relational operators
struct mte_hash
{
std::size_t operator()(const MinTreeEdge& mte) const
{
return std::hash<int>()(mte.lhs);
}
};
struct mte_equal
{
bool operator()(MinTreeEdge const& l, MinTreeEdge const& r) const
{
return l.lhs == r.lhs;
}
};
// merge function. arguments by value since we will be moving
// *expensive to copy* objects out of them, and the vectors themselves
// can be *moved* into our function very cheaply
std::vector<MinTreeEdge> merge2(std::vector<MinTreeEdge> a,
std::vector<MinTreeEdge> b)
{
using temp_map_type = unordered_reference_set<MinTreeEdge, mte_hash, mte_equal>;
// build a set of references to existing objects in b
temp_map_type tmap;
tmap.reserve(b.capacity());
// b first, since the requirements mentioned 'already in B'
for (auto& ob : b) { tmap.insert(ob); }
// now add missing references in a
for (auto& oa : a) { tmap.insert(oa); }
// now build the result, moving objects from a and b as required
std::vector<MinTreeEdge> result;
result.reserve(tmap.size());
for (auto r : tmap) {
result.push_back(std::move(r.get()));
}
return result;
// a and b now hold elements that are valid but in an unspecified state.
// The elements that were not moved from are the duplicates we don't need.
// In summary, they are of no use to us, so we drop them.
}
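When edges are cheap to copy and a single integer identifies an edge, the same hashing idea can be sketched more simply by remembering only the ids already present. The following is a rough sketch under those assumptions, not the answer's method: Edge and append_missing are hypothetical names, and lhs is assumed to be a unique edge id. It also stops once a target size is reached, which is what the question asks for.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for MinTreeEdge.
// Assumption: `lhs` uniquely identifies an edge.
struct Edge {
    int lhs;
    int node1ID;
    int node2ID;
    int weight;
};

// Appends edges from `input` whose id is not yet in `output`, until
// `output` holds `targetSize` edges or `input` is exhausted.
// `input` and `output` must be distinct vectors.
void append_missing(const std::vector<Edge>& input,
                    std::vector<Edge>& output,
                    std::size_t targetSize)
{
    std::unordered_set<int> seen;
    seen.reserve(output.size() + input.size());
    for (const Edge& e : output) seen.insert(e.lhs);

    for (const Edge& e : input) {
        if (output.size() >= targetSize) break;
        // insert() reports through `.second` whether the id was new
        if (seen.insert(e.lhs).second) output.push_back(e);
    }
}
```

Each edge is examined once, so the expected cost is linear in the combined size of the two vectors. The edges are taken in the order they appear in input; if a random selection is wanted, input could be shuffled beforehand.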
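Let's say we want to stick with the vector approach (we almost always should), but MinTreeEdge is somewhat expensive to copy. Say it uses the pimpl idiom for internal polymorphism, which inevitably means a memory allocation on every copy. But let's say it is cheap to move. Let's also imagine that we cannot expect the caller to sort or uniquify the data before sending it to us.
We can still achieve good efficiency with standard algorithms and vectors: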
std::vector<MinTreeEdge> merge(std::vector<MinTreeEdge> a,
std::vector<MinTreeEdge> b)
{
// sorts a range if not already sorted
// @return a reference to the range
auto maybe_sort = [] (auto& c) -> decltype(auto)
{
auto begin = std::begin(c);
auto end = std::end(c);
if (not std::is_sorted(begin, end, lower_lhs()))
std::sort(begin, end, lower_lhs());
return c;
};
// uniqueify a range, returning the new 'end' of
// valid data
// @pre c is sorted
// @return result of std::unique(...)
auto unique = [](auto& c) -> decltype(auto)
{
auto begin = std::begin(c);
auto end = std::end(c);
return std::unique(begin, end);
};
// turn an iterator into a move-iterator
auto mm = [](auto iter) { return std::make_move_iterator(iter); };
std::vector<MinTreeEdge> result;
result.reserve(a.size() + b.size());
// create a set_union from two input containers.
// @post a and b shall be in a valid but undefined state
std::set_union(mm(a.begin()), mm(unique(maybe_sort(a))),
mm(b.begin()), mm(unique(maybe_sort(b))),
std::back_inserter(result),
lower_lhs());
return result;
}
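If one also provides a free function void swap(MinTreeEdge& l, MinTreeEdge& r) noexcept, then building the result requires exactly N moves, where N is the size of the result set (the sort and unique passes additionally rearrange elements in place using swaps and moves). Because a move in a pimpl class is just a pointer swap, this algorithm remains efficient.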
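To illustrate why moves and swaps stay cheap for such a type, here is a minimal sketch of a pimpl-style edge. PimplEdge and its Impl are hypothetical and only approximate what a real MinTreeEdge might look like; the point is that copying allocates while moving and swapping only exchange a pointer.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical pimpl-style edge: all state lives behind one heap pointer.
class PimplEdge {
    struct Impl { int id; int weight; };
    std::unique_ptr<Impl> impl_;

public:
    PimplEdge(int id, int weight) : impl_(new Impl{id, weight}) {}

    // Swapping exchanges the two pointers and cannot throw.
    friend void swap(PimplEdge& l, PimplEdge& r) noexcept {
        l.impl_.swap(r.impl_);
    }

    // Copying allocates a fresh Impl: this is the expensive path.
    PimplEdge(const PimplEdge& other) {
        if (other.impl_) impl_.reset(new Impl(*other.impl_));
    }
    PimplEdge& operator=(const PimplEdge& other) {
        PimplEdge tmp(other);   // copy, then swap into place
        swap(*this, tmp);
        return *this;
    }

    // Moving only transfers the pointer; no allocation takes place.
    PimplEdge(PimplEdge&&) noexcept = default;
    PimplEdge& operator=(PimplEdge&&) noexcept = default;

    // False for an object that has been moved from.
    bool valid() const noexcept { return impl_ != nullptr; }

    int id() const {
        assert(valid());
        return impl_->id;
    }
};

static_assert(std::is_nothrow_move_constructible<PimplEdge>::value,
              "moves must not throw");
```

With a type like this, the move iterators in the function above hand each surviving element to the result by transferring one pointer, and the moved-from elements left behind in a and b are empty shells.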
Answer 1 (score: 1)
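Since the output vector must not contain duplicates, one way to avoid storing duplicates is to change the output container to std::set<MinTreeEdge> instead of std::vector<MinTreeEdge>. The reason is that std::set does not store duplicates, so you do not have to write code to check for them yourself.
First, you need to define an operator < for the MinTreeEdge class: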
struct MinTreeEdge
{
// For std::unique() between objects
bool operator==(MinTreeEdge const &rhs) const noexcept
{
return lhs == rhs.lhs;
}
// For ordering (and duplicate detection) inside std::set
bool operator<(MinTreeEdge const &rhs) const noexcept
{
return lhs < rhs.lhs;
}
//...
};
Once you have done that, the while loop can be replaced with the following:
#include <set>
#include <vector>
#include <iterator>
#include <algorithm>
//...
std::vector<MinTreeEdge> minTreeInput;
//...
std::set<MinTreeEdge> minTreeOutput;
//...
std::copy(minTreeInput.begin(), minTreeInput.end(),
std::inserter(minTreeOutput, minTreeOutput.begin()));
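There is no need to call std::unique at all, since std::set does the duplicate checking.
If the output container has to remain a std::vector, you can still do the above using a temporary std::set and then copy the std::set into the output vector: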
std::vector<MinTreeEdge> minTreeInput;
std::vector<MinTreeEdge> minTreeOutput;
//...
std::set<MinTreeEdge> tempSet;
std::copy(minTreeInput.begin(), minTreeInput.end(),
std::inserter(tempSet, tempSet.begin()));
std::copy(tempSet.begin(), tempSet.end(),std::back_inserter(minTreeOutput));
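One caveat with the temporary-set version: if minTreeOutput already holds the tree's edges, appending the contents of the set to it will store those edges a second time. A possible variation, sketched here under the assumption that lhs uniquely identifies an edge (Edge and merge_unique are hypothetical names), seeds the set with the existing output first and then rebuilds the vector from the set.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for MinTreeEdge, ordered by its id.
// Assumption: `lhs` uniquely identifies an edge.
struct Edge {
    int lhs;
    int weight;
    bool operator<(const Edge& rhs) const noexcept { return lhs < rhs.lhs; }
};

// Returns the edges of `output` plus every edge of `input` that is not
// already present, with no duplicates, sorted by id.
std::vector<Edge> merge_unique(const std::vector<Edge>& output,
                               const std::vector<Edge>& input)
{
    // Seed the set with the edges we already have...
    std::set<Edge> tempSet(output.begin(), output.end());
    // ...then add the rest; std::set ignores ids it already holds.
    tempSet.insert(input.begin(), input.end());
    return std::vector<Edge>(tempSet.begin(), tempSet.end());
}
```

A call such as minTreeOutput = merge_unique(minTreeOutput, minTreeInput); would then replace the vector's contents. Because std::set keeps the first element inserted for a given key, the copies that were already in the output are the ones that survive.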
Answer 2 (score: 0)
You can use something like the following:
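#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>
struct MinTreeEdge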
{
bool operator<(MinTreeEdge const &rhs) const noexcept
{
return id < rhs.id;
}
int id;
int node1ID;
int node2ID;
int weight;
};
std::vector<MinTreeEdge> CreateRandomGraph(const std::vector<MinTreeEdge>& minSpanningTree,
const std::vector<MinTreeEdge>& wholeTree,
std::mt19937& rndEng,
std::size_t expectedSize)
{
assert(std::is_sorted(minSpanningTree.begin(), minSpanningTree.end()));
assert(std::is_sorted(wholeTree.begin(), wholeTree.end()));
assert(minSpanningTree.size() <= expectedSize);
assert(expectedSize <= wholeTree.size());
std::vector<MinTreeEdge> res;
std::set_difference(wholeTree.begin(), wholeTree.end(),
minSpanningTree.begin(), minSpanningTree.end(),
std::back_inserter(res));
std::shuffle(res.begin(), res.end(), rndEng);
res.resize(expectedSize - minSpanningTree.size());
res.insert(res.end(), minSpanningTree.begin(), minSpanningTree.end());
// std::sort(res.begin(), res.end());
return res;
}