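I want to create 1000 random lists of 1652 genes each from a universe of 19,000 genes. I decided to sample with replacement because the universe is not that large. The only condition is that the lists may share genes with one another (because of the replacement), but no single list may contain the same gene more than once, so every gene is unique within its list. Any suggestions?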
Ex: Universe = letters[1:26]
Desired output:
[[1]]  [[2]]  [[3]]  [[...]]
a      b      f
b      c      a
c      d      b
f      z      j
h      j      o
I want to avoid cases like the following:
[[1]]  [[...]]
a
a
b
c
c
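Because the universe is not that large, I cannot set REPLACE = F. If I set REPLACE = T, duplicated elements appear within a list, which is exactly what I am trying to avoid in my analysis.
Thanks in advance,
E.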
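Answer 0 (score: 4)
This code draws 5 samples of 10 elements each from Universe, without replacement inside a sample. Because every call to sample() is independent, the same element can still appear in several samples. I think this is what you want: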
Universe = letters[1:26]
replicate(5, sample(Universe, 10, replace = FALSE))
[,1] [,2] [,3] [,4] [,5]
[1,] "j" "l" "k" "c" "j"
[2,] "g" "i" "c" "t" "g"
[3,] "z" "u" "m" "u" "e"
[4,] "a" "b" "t" "e" "q"
[5,] "q" "d" "j" "k" "m"
[6,] "r" "a" "l" "l" "x"
[7,] "e" "g" "r" "i" "f"
[8,] "l" "w" "o" "g" "u"
[9,] "b" "y" "b" "x" "c"
[10,] "u" "j" "x" "a" "b"
Answer 1 (score: 3)
Not sure what "REPLACE = T" means, but random.sample may do what you want:
>>> import random
>>> import string
>>> universe = string.ascii_lowercase
>>> random.sample(universe, 5)
['z', 'n', 'p', 'u', 's']
Using numbers as the universe:
>>> universe = range(19000)
>>> result = [random.sample(universe, 1652) for x in range(1000)]
This runs in under a second. If you want to rule out duplicate lists (very unlikely in the first place), you can use a set:
>>> result = set()
>>> while len(result) < 1000:
... result.add(tuple(random.sample(universe, 1652)))
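One caveat with the set of tuples: two samples holding the same genes in a different order count as different tuples. If order should not matter, a frozenset can serve as the deduplication key instead. Below is a minimal sketch of that idea. The function name draw_gene_lists and its parameters are hypothetical, not from the question, and it assumes Python 3.8+ for math.comb.

```python
import math
import random


def draw_gene_lists(universe, list_size, n_lists, seed=None):
    """Draw n_lists samples of list_size items each.

    No item repeats inside a sample; items may repeat across samples.
    No two samples contain exactly the same set of items.
    (Hypothetical helper, written for illustration only.)
    """
    pool = list(universe)
    # Guard against an endless loop: there are only C(n, k) distinct samples.
    if n_lists > math.comb(len(pool), list_size):
        raise ValueError("not enough distinct samples exist")

    rng = random.Random(seed)  # seeded generator so runs can be reproduced
    seen = set()               # frozensets: order-insensitive duplicate check
    lists = []
    while len(lists) < n_lists:
        candidate = rng.sample(pool, list_size)  # sampling without replacement
        key = frozenset(candidate)
        if key not in seen:
            seen.add(key)
            lists.append(candidate)
    return lists


gene_lists = draw_gene_lists(range(19000), 1652, 1000, seed=42)

# Each list holds 1652 distinct genes.
assert all(len(set(genes)) == 1652 for genes in gene_lists)
```

With 19,000 genes and lists of 1652, a repeated list is so improbable that the duplicate check almost never triggers; it matters mainly for small universes such as the 26 letters.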