I want to create an array containing 100 strings like this one:
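import numpy as np

# nucleotides and pfmNew are defined elsewhere; column pfmNew[:, i]
# holds the probabilities used for position i
sequence = []
for i in range(0, 16):
    # draw one nucleotide for position i (returns a length-1 array)
    sequence.append(np.random.choice(nucleotides, 1, p=pfmNew[:, i]))
# flatten the list of length-1 arrays into a flat list of characters
sequence = [val for sublist in sequence for val in sublist]
sequence = "".join(sequence)
print(sequence)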
This is the output I get:
TCGTTCACAGTGACAT
Now I want to do this 100 times and put the results in an array like this:
['TCGTTCACAGTGACAT', 'next string', ...]
Answer 0 (score: 1)
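You're almost there. You can improve the skeleton code you provided by setting the size parameter of np.random.choice to 16, which samples 16 nucleotides in a single call. Then you just repeat that 100 times.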
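import numpy as np

nucleotides = list('ACGT')
sequence = []
# Note: `i` is not set by this loop, so pfmNew[:, i] reuses one column
# (whatever `i` was last set to) as the probabilities for all 16 draws.
for _ in range(100):
    sequence.append(''.join(np.random.choice(nucleotides, 16, p=pfmNew[:, i])))
# Or you can replace the loop with a list comprehension:
# sequence = [''.join(np.random.choice(nucleotides, 16, p=pfmNew[:, i])) for _ in range(100)]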
# Take a look at the first 10:
sequence[:10]
['TTCACTACCCGCAAAC', 'CTCCTGATACAGATCG', 'CTTGACGATGCTCCGA', 'ATGACCAATGAAGCCG', 'TTGCCGACGTCGATTG', 'ATATTCTTGCGCAGGT', 'CTTAGCCCATCACCCC', 'GGGTTTCCGCCTCGTA', 'ACGTCAAGTGCAGTGC', 'GGTAGATCCGAAACGC']
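If each of the 16 positions should keep its own column of probabilities, as in the question's loop, one possible variation is to make one call per position and ask for 100 draws in each call. The following is only a sketch under stated assumptions: `pfm_demo` is a made-up stand-in for the asker's `pfmNew` (assumed here to be a 4 x 16 matrix whose columns sum to 1, with rows in the order A, C, G, T), `sample_sequences` is a hypothetical helper name, and it uses `np.random.default_rng` rather than the `np.random.choice` call from the answer.

```python
import numpy as np

nucleotides = list("ACGT")

# Hypothetical stand-in for the asker's pfmNew: a 4 x 16 matrix whose
# columns each sum to 1 (rows assumed to follow the order of `nucleotides`).
rng = np.random.default_rng(0)
pfm_demo = rng.random((4, 16))
pfm_demo /= pfm_demo.sum(axis=0)


def sample_sequences(pfm, n_sequences, rng):
    """Draw n_sequences strings, sampling position i from column pfm[:, i]."""
    n_positions = pfm.shape[1]
    # One call per position; size=n_sequences draws that position
    # for every string at once.
    columns = [
        rng.choice(nucleotides, size=n_sequences, p=pfm[:, i])
        for i in range(n_positions)
    ]
    # columns[i][k] is position i of string k; zip(*columns) regroups
    # the characters by string.
    return ["".join(chars) for chars in zip(*columns)]


sequences = sample_sequences(pfm_demo, 100, rng)
print(sequences[:3])
```

With the real matrix, the call would presumably be sample_sequences(pfmNew, 100, rng), provided pfmNew has the shape assumed above.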