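I want to enumerate all vectors of length N in which each element takes a value in [0 ... K] and the elements sum to SUM.
I solved this with a recursive function, but when I rewrote it in CUDA C I got a message saying that CUDA C does not support recursive functions. I then rewrote the function without recursion, but that version returns a boolean, which CUDA C does not support either, because the main global function must be void and cannot call other functions. I am out of ideas now; can anyone help?
The recursive function is as follows: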
private static void computeVectors(int[] n, int sum, int k, int k1, int i) {
    if (sum == 0) {
        printVectors(n, n.length);
    } else if (i < n.length) {
        for (int j = k; j >= 0; j--) {
            if (j <= k1) {
                n[i] = j;
                computeVectors(n, sum - j, sum - j, k1, i + 1);
            }
        }
    }
}

private static void printVectors(int p[], int n) {
    for (int i = 0; i < n; i++) {
        System.out.print(p[i] + " ");
    }
    System.out.println();
}

public static void main(String[] args) {
    // TODO code application logic here
    computeVectors(new int[4], 5, 3, 3, 0);
}
The output of this example is:
3200 3110 3101 3020 3011 3002 2300 2210 2201 2120 2111 2102 2030 2021 2012 2003 1310 1301 1220 1211 1202 1130 1121 1112 1103 1031 1022 1013 0320 0311 0302 0230 0221 0212 0203 0131 0122 0113 0032 0023
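Answer 0 (score: 2)
CUDA supports recursive __device__ functions on devices with Compute Capability (CC) 2.0 and later. You need to verify that your GPU has CC 2.0 or higher and compile with -arch=sm_20 (or higher).
__global__ kernel functions can be launched from other kernels using CUDA Dynamic Parallelism, which requires CC >= 3.5.
In any case, for performance reasons you will probably want to write a non-recursive version.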
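One way to get a non-recursive formulation that fits a void entry point is to treat each candidate vector as an N-digit number in base K+1 and keep only the candidates whose digits sum to SUM. The sketch below is in Java to match the rest of the thread; the class and method names (IndexedEnumeration, decode, enumerate) are made up for illustration. Because every index is decoded independently, the loop body could plausibly become the body of a kernel with one thread per index, but that port is an assumption and is not shown here. This brute-force filter visits all (K+1)^N candidates, so it is only reasonable for small N and K.

```java
import java.util.ArrayList;
import java.util.List;

class IndexedEnumeration {

    // Writes idx as an out.length-digit number in base (k + 1) into out[],
    // most significant digit first, and returns the sum of the digits.
    static int decode(long idx, int k, int[] out) {
        int digitSum = 0;
        for (int pos = out.length - 1; pos >= 0; pos--) {
            out[pos] = (int) (idx % (k + 1));
            idx /= (k + 1);
            digitSum += out[pos];
        }
        return digitSum;
    }

    // Collects every vector of length n with entries in [0, k] that sums to sum.
    // Indices are visited from high to low, so results come out in descending order.
    static List<int[]> enumerate(int n, int k, int sum) {
        long total = 1;
        for (int i = 0; i < n; i++) {
            total *= (k + 1);
        }
        List<int[]> result = new ArrayList<>();
        int[] scratch = new int[n];
        for (long idx = total - 1; idx >= 0; idx--) {
            if (decode(idx, k, scratch) == sum) {
                result.add(scratch.clone());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] v : enumerate(4, 3, 5)) {
            StringBuilder sb = new StringBuilder();
            for (int x : v) {
                sb.append(x);
            }
            System.out.println(sb);
        }
    }
}
```

For N = 4, K = 3, SUM = 5 this prints 40 vectors, starting with 3200 and ending with 0023, in the same order as the output listed in the question.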
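Answer 1 (score: 1)
Here is a non-recursive version. The basic idea is that we want the combinations of size sum chosen with replacement from {0,...,N-1}, where each element is chosen at most k times. The number of times an element is chosen then gives the value of that entry in the resulting vector. Thinking in terms of stars and bars, we have sum stars and N-1 bars. The bars separate the stars into bins, and the number of stars in bin i is the value of entry i (which means that adjacent bars can have at most k stars between them). Moving the bars from left to right as needed yields the output of your example in reverse order.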
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class Combinations {

    public static void main(String... ignore) {
        int n = 4;
        int sum = 5;
        int k = 3;
        Integer[] set = new Integer[n];
        fillIncreasing(set, 0, 0, n);
        computeVectors(set, sum, k);
    }

    // Sets array[from], array[from+1], ..., array[to-1] to first, first+1, ...
    private static void fillIncreasing(Integer[] array, int from, int first, int to) {
        for (int i = from; i < to; i++) {
            array[i] = i - from + first;
        }
    }

    // Prints one vector per combination; entry e counts how often value e was chosen.
    public static void computeVectors(Integer[] set, int size, int maxChoose) {
        int[] vectorToPrint = new int[set.length];
        for (List<Integer> vector : combinationsWithReplacement(set, size, maxChoose)) {
            Arrays.fill(vectorToPrint, 0);
            for (Integer entry : vector) {
                vectorToPrint[entry]++;
            }
            System.out.println("vector: " + Arrays.toString(vectorToPrint));
        }
    }

    public static <T> Iterable<List<T>> combinationsWithReplacement(final T[] set, final int size, final int maxChoose) {
        if (set.length < 2) {
            throw new IllegalArgumentException();
        }
        return new Iterable<List<T>>() {
            public Iterator<List<T>> iterator() {
                return new Iterator<List<T>>() {
                    // barSpots[0] and the last entry act as fixed walls; the
                    // entries in between are the positions of the movable bars.
                    Integer[] barSpots = new Integer[set.length + 1];

                    {
                        fillIncreasing(barSpots, 0, 0, barSpots.length - 1);
                        barSpots[barSpots.length - 1] = size + set.length;
                        // Skip ahead to the first arrangement that respects maxChoose.
                        while (hasNext() && !readyToReturn()) {
                            advance();
                        }
                    }

                    // True if no bin between adjacent bars holds more than maxChoose stars.
                    private boolean readyToReturn() {
                        if (!hasNext() || set.length * maxChoose < size) {
                            return false;
                        }
                        for (int i = 1; i < barSpots.length; i++) {
                            if (barSpots[i] > maxChoose + barSpots[i - 1] + 1) {
                                return false;
                            }
                        }
                        return true;
                    }

                    // Moves the rightmost bar that still has room one step to the
                    // right and packs all bars after it directly behind it.
                    private void advance() {
                        int biggestThatCanMove = barSpots.length - 2;
                        while (biggestThatCanMove >= 0
                                && (barSpots[biggestThatCanMove] + 1 >= barSpots[biggestThatCanMove + 1])) {
                            biggestThatCanMove--;
                        }
                        fillIncreasing(barSpots, biggestThatCanMove,
                                barSpots[biggestThatCanMove] + 1,
                                barSpots.length - 1);
                    }

                    // The left wall only moves once every arrangement has been visited.
                    public boolean hasNext() {
                        return barSpots[0] == 0;
                    }

                    public List<T> next() {
                        List<T> toRet = new ArrayList<T>();
                        for (int i = 0; i + 1 < barSpots.length; i++) {
                            // Number of stars between bar i and bar i+1.
                            int times = barSpots[i + 1] - barSpots[i] - 1;
                            for (boolean ignore : new boolean[times]) {
                                toRet.add(set[i]);
                            }
                        }
                        do {
                            advance();
                        } while (hasNext() && !readyToReturn());
                        return toRet;
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
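To make the bars-to-vector mapping concrete, here is a minimal sketch that decodes a single bar arrangement in isolation. It assumes the same layout as the barSpots array above (a fixed wall at position 0, a fixed wall at sum + N, and the movable bars in between); the class BarsToVector and the method toVector are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;

class BarsToVector {

    // bars[0] and bars[bars.length - 1] are fixed walls. The number of free
    // positions strictly between bars[i] and bars[i + 1] is the value of entry i.
    static int[] toVector(int[] bars) {
        int[] vector = new int[bars.length - 1];
        for (int i = 0; i + 1 < bars.length; i++) {
            vector[i] = bars[i + 1] - bars[i] - 1;
        }
        return vector;
    }

    public static void main(String[] args) {
        // N = 4, sum = 5: walls at 0 and sum + N = 9, three movable bars between them.
        System.out.println(Arrays.toString(toVector(new int[] {0, 4, 7, 8, 9}))); // [3, 2, 0, 0]
        System.out.println(Arrays.toString(toVector(new int[] {0, 1, 2, 5, 9}))); // [0, 0, 2, 3]
    }
}
```

The second arrangement has all movable bars as far left as the limit k = 3 allows, which corresponds to 0023, the last vector in the question's output.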