Question

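Is it possible to perform a multiway (> 2) stable partition in Thrust? Either a stable partition or a stable partition copy would be equally interesting. Currently, I can only use the two-way thrust::stable_partition_copy for this purpose. It is clear how to use it to split a sequence into three parts with two predicates and two calls to thrust::stable_partition_copy. But I am sure it is technically possible to implement a multiway stable partition.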

I can imagine the following multiway stable partition copy (pseudocode):

using F = float;

thrust::device_vector< F > triangles{N * 3};
// fill triangles here

thrust::device_vector< F > A{N}, B{N}, C{N};
auto vertices_begin = thrust::make_tuple(A.begin(), B.begin(), C.begin());

using U = unsigned int;
auto selector = [] __host__ __device__ (U i) -> U { return i % 3; };

thrust::multiway_stable_partition_copy(p, triangles.cbegin(), triangles.cend(), selector, vertices_begin);

A.begin(), B.begin(), and C.begin() should each be advanced independently.
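To make the intended semantics concrete, here is a minimal sequential reference sketch in plain C++ (host only, no Thrust). The name multiway_stable_partition_copy_ref is hypothetical, and it assumes the selector receives the element's index, as the i % 3 selector in the pseudocode suggests. It is not a Thrust API and says nothing about how a parallel version would be implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential reference for a multiway stable partition copy.
// selector(i) maps the index of an input element to an output bucket
// in [0, outputs.size()). This is an assumption about the pseudocode above.
template <typename T, typename Selector>
void multiway_stable_partition_copy_ref(const std::vector<T>& input,
                                        Selector selector,
                                        std::vector<std::vector<T>>& outputs)
{
    for (std::size_t i = 0; i < input.size(); ++i) {
        const std::size_t bucket = selector(i);
        assert(bucket < outputs.size());
        // Each bucket grows independently of the others, and elements are
        // appended in input order, so the copy is stable within each bucket.
        outputs[bucket].push_back(input[i]);
    }
}
```

With three buckets and the selector i % 3, a flat array of triangle vertices is split into the first, second, and third vertices of each triangle, in their original order.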

Also, I can imagine a hypothetical dispatch iterator that would do the same thing (and, I think, would be more useful).

Answer 1

From what I know of Thrust's internals, the algorithm cannot easily be adapted to what you envision.

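A simple approach is to extend your theoretical two-pass, three-way partition to M-1 passes using a smart binary (true/false) predicate, something like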

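template<typename T>
struct divider
{
   int pass;
   __host__ __device__ divider(int p) : pass(p) { }
   // Maps val to the index of the subset it belongs to (body elided in the original).
   __host__ __device__ int classify(const T &val) const { .... }
   // True if val belongs to subset number `pass` or an earlier one.
   __host__ __device__ bool operator()(const T &val) const { return !(classify(val) > pass); }
};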

which maps a given input to one of M possible subsets and returns true if the input is in the Nth subset or an earlier one, followed by a loop

auto start = input.begin();
for(int i=0; i<(M-1); ++i) {
   divider<T> pred(i);
   // Partition the remaining tail; result[i] marks the end of subset i.
   result[i] = thrust::stable_partition(
                         thrust::device,
                         start,
                         input.end(),
                         pred);
   start = result[i];
}

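[Note: all of this code was written in a browser on a tablet while floating on a boat in the Baltic Sea. Obviously it has never been compiled or run.]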

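This is certainly the most space-efficient approach, since at most len(input) temporary storage is required, whereas a hypothetical single-pass implementation would require M * len(input) storage, which quickly becomes impractical for large M.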
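The M-1 pass loop can be tried out on the host with std::stable_partition, which takes the same (first, last, predicate) arguments as thrust::stable_partition minus the execution policy. The sketch below is a plain C++ stand-in, not Thrust code; the helper name multiway_stable_partition and the classify callback are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: reorders `data` in place into M stable groups using
// M-1 two-way stable partitions. classify(x) must return the group index
// of x in [0, M). Returns the M-1 boundaries between consecutive groups.
template <typename T, typename Classify>
std::vector<typename std::vector<T>::iterator>
multiway_stable_partition(std::vector<T>& data, int M, Classify classify)
{
    assert(M >= 1);
    std::vector<typename std::vector<T>::iterator> boundaries;
    auto start = data.begin();
    for (int pass = 0; pass < M - 1; ++pass) {
        // Everything before `start` already belongs to groups < pass, so this
        // pass moves group `pass` to the front of the remaining tail.
        start = std::stable_partition(
            start, data.end(),
            [&](const T& v) { return !(classify(v) > pass); });
        boundaries.push_back(start);
    }
    return boundaries;
}
```

Each pass only touches the tail that has not been placed yet, so the work shrinks as groups are peeled off, but the input is still traversed up to M-1 times.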
