Matrix of sequence combinations in R

Time: 2018-06-12 17:02:20

Tags: r combinations sequence rowsum

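I am creating a matrix for 5 variables such that each variable takes values from seq(from = 0, to = 1, length.out = 500) and rowSums(data) = 1.

In other words, I would like to know how to create a matrix that shows all possible combinations of these numbers in which each row sums to 1.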

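3 answers:

Answer 0: (score: 2)

Here is an iterative solution using loops. It gives you all possible permutations of numbers that add up to 1, where every number is a multiple of the step N. The idea is to start with all numbers from 0 to 1 in steps of N, then, for each one, add in a new column every number that keeps the running sum at or below 1. Rinse and repeat, except in the last iteration, where you only add the single number that completes the row sum.

As people pointed out in the comments, if you want N = 1/499*, this will give you a very large matrix. I noticed that for N = 1/200 it already took about 2 to 3 minutes, so N = 1/499 would probably take too long.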

* seq(from = 0, to = 1, length.out = 500)

is the same as
seq(from = 0, to = 1, by = 1/499)
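
The column-by-column idea described in this answer can be sketched as follows. This is a minimal illustration under stated assumptions, not the answerer's original code: the function name compositions_grid and its arguments n (number of steps, so N = 1/n) and k (number of columns) are hypothetical. It works on integer counts and divides by n only at the end, which keeps rounding error small. Because order matters here, the number of rows is choose(n + k - 1, k - 1), which for n = 499 and k = 5 is about 2.6 billion, so the sketch is only practical for small n.

```r
# Hypothetical helper: every row of k grid values (multiples of 1/n) summing to 1.
compositions_grid <- function(n, k) {
  stopifnot(n >= 0, k >= 2)
  # First column: every integer count from 0 to n.
  rows <- matrix(0:n, ncol = 1)
  # Middle columns: for each partial row, append every count that keeps the sum <= n.
  for (j in seq_len(k - 2)) {
    remaining <- n - rowSums(rows)
    idx <- rep(seq_len(nrow(rows)), times = remaining + 1)
    new_col <- sequence(remaining + 1) - 1
    rows <- cbind(rows[idx, , drop = FALSE], new_col)
  }
  # Last column: the single count that completes each row to n.
  rows <- cbind(rows, n - rowSums(rows))
  dimnames(rows) <- NULL
  # Convert counts back to values between 0 and 1.
  rows / n
}

# Small usage example: step 1/4, three columns.
small <- compositions_grid(n = 4, k = 3)
nrow(small) == choose(4 + 3 - 1, 3 - 1)
```

Building each new column with rep() and sequence() avoids growing the matrix one row at a time, but memory still scales with the number of rows.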

Answer 1: (score: 1)

If I understand you correctly, this should at least get you on the right track.

# Parameters
len_vec <- 500  # vector length (number of rows)
num_col <- 5    # number of columns

# Draw uniform random values between 0 and 1 to fill the matrix
values <- runif(len_vec * num_col)

# Create the matrix, filling it row by row
mat <- matrix(values, ncol = num_col, byrow = TRUE)

# Round the matrix so that it contains only 0s and 1s
mat <- round(mat)

# Calculate the sum of each row
apply(mat, 1, sum)

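Answer 2: (score: 1)

This is exactly what the partitions package is for. Essentially, the OP is looking for all possible combinations of 5 integers that sum to 499. This is easily achieved with restrictedparts: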

library(partitions)
system.time(combsOne <- t(as.matrix(restrictedparts(499, 5))) / 499)
 user  system elapsed 
1.635   0.867   2.502 
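
The integer reformulation rests on the grid seq(from = 0, to = 1, length.out = 500) being the integers 0 to 499 divided by 499. A quick base R check of that, as a sketch (the variable names grid, ints and max_dev are illustrative and not part of the answer):

```r
grid <- seq(from = 0, to = 1, length.out = 500)

# Scaling by 499 should recover the integers 0..499 up to rounding error.
ints <- round(grid * 499)
max_dev <- max(abs(grid * 499 - ints))

# TRUE if the scaled grid is exactly the integer sequence.
identical(ints, as.numeric(0:499))
```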


head(combsOne)
         [,1]        [,2] [,3] [,4] [,5]
[1,] 1.000000 0.000000000    0    0    0
[2,] 0.997996 0.002004008    0    0    0
[3,] 0.995992 0.004008016    0    0    0
[4,] 0.993988 0.006012024    0    0    0
[5,] 0.991984 0.008016032    0    0    0
[6,] 0.989980 0.010020040    0    0    0

tail(combsOne)
                 [,1]      [,2]      [,3]      [,4]      [,5]
[22849595,] 0.2024048 0.2004008 0.2004008 0.2004008 0.1963928
[22849596,] 0.2064128 0.1983968 0.1983968 0.1983968 0.1983968
[22849597,] 0.2044088 0.2004008 0.1983968 0.1983968 0.1983968
[22849598,] 0.2024048 0.2024048 0.1983968 0.1983968 0.1983968
[22849599,] 0.2024048 0.2004008 0.2004008 0.1983968 0.1983968
[22849600,] 0.2004008 0.2004008 0.2004008 0.2004008 0.1983968

Since we are dealing with floating-point numbers, we cannot get exact equality, but we can get within machine precision:

all(rowSums(combsOne) == 1)
[1] FALSE

all((rowSums(combsOne) - 1) < .Machine$double.eps)
[1] TRUE
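
Note that the comparison above is one-sided: it only shows that no row sum exceeds 1 by machine epsilon or more. A two-sided check with an explicit tolerance is a common alternative. The sketch below uses a small stand-in matrix rather than combsOne, and the tolerance 1e-12 is an assumption chosen for illustration:

```r
# Small stand-in for combsOne: integer parts summing to 499, scaled to sum to 1.
parts <- rbind(c(499,   0,   0,   0,  0),
               c(100, 100, 100, 100, 99),
               c(250, 200,  40,   8,  1))
small <- parts / 499

# Two-sided comparison against 1 with an explicit tolerance (assumed value).
tol <- 1e-12
within_tol <- all(abs(rowSums(small) - 1) < tol)
within_tol

# all.equal() performs a similar comparison with its own default tolerance.
isTRUE(all.equal(rowSums(small), rep(1, nrow(small))))
```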

The result has more than 22 million rows:

nrow(combsOne)
[1] 22849600