Question

f = ''
da = ['A', 'T', 'G', 'C', ' ']
fnn = []
print(fnn)

con = 0

x = input('Corrupted: ')
nx = list(x)

for nx in nx:
    if nx[con] in da:
        f = f + str(nx[con])
    else:
        pass

fn = f.split()

print(fn)
print(fn[0])

for i in fn:
    if fn[i] not in fnn:
        fnn = fnn.extend(fn[i])
    else:
        pass

print(fnn)

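This script is meant to read input, strip out every character other than A, C, G and T, and then remove any repeated sequences. I'm really struggling to get it to remove the repeated sequences. What do I need to do? What exactly am I doing wrong? Is there a faster way to do this?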

Answer 1

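Your code does several strange things:

nx = list(x) - why convert x to a list? You can iterate over a string directly.
for nx in nx (as already pointed out)
if nx[con] in da - what is this meant to achieve? What is con?
str(nx[con]) - nx[0] is already a string
else: pass - there is no need for an else branch if all it does is pass
extend modifies a list in place, so there is no need to write my_list = my_list.extend(...) (in fact you will lose the list that way, because extend returns None)
fnn.extend(fn[i]) - if the earlier part of the code worked, fn[i] would be a string; you probably don't want to extend a list with a string.

Try this and see what it does: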

x = ['cat', 'dog']
x.extend('mouse')
print(x)
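For comparison, here is a minimal sketch of how extend and append differ, and of what extend returns. The names animals, pets and result are made up for illustration and are not from the question.

```python
# Illustrative names only; none of these appear in the question.
animals = ['cat', 'dog']

# extend() iterates over its argument, and iterating a string yields characters.
result = animals.extend('mouse')
print(animals)  # ['cat', 'dog', 'm', 'o', 'u', 's', 'e']
print(result)   # None: extend() changes the list in place

# append() adds its argument as one element.
pets = ['cat', 'dog']
pets.append('mouse')
print(pets)     # ['cat', 'dog', 'mouse']
```

So to add a whole sequence string to a list, append is usually the call you want.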

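I think what you want is something like this. Note the use of descriptive variable names to help the reader understand what the code does.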

permitted_characters = 'ATGC '
corrupted = input('Corrupted: ')

# Remove characters that are not permitted and split string into sequences
sequences = ''.join(c for c in corrupted if c in permitted_characters).split()

# Remove repeated sequences
unique_sequences = []
for sequence in sequences:
    if sequence not in unique_sequences:
        unique_sequences.append(sequence)
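On the "faster way" part of the question: testing membership in a list rescans the list every time, which can get slow when there are many sequences. One possible alternative is sketched below, assuming Python 3.7 or later, where dicts keep insertion order. The function name clean_sequences and the sample input are hypothetical, chosen only for illustration.

```python
def clean_sequences(corrupted, permitted_characters='ATGC '):
    """Drop unpermitted characters, split on whitespace, remove repeats."""
    filtered = ''.join(c for c in corrupted if c in permitted_characters)
    # dict keys are unique and keep first-seen order (Python 3.7+),
    # so this removes repeats without rescanning a list each time.
    return list(dict.fromkeys(filtered.split()))


print(clean_sequences('AxTG ATyG CCz GGA CC'))  # ['ATG', 'CC', 'GGA']
```

If the order of the sequences does not matter, set(filtered.split()) would also work, but it does not preserve the order in which sequences first appear.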

Answer 2

for nx in nx:

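First, the loop variable overwrites the list, so inside the loop nx[con] indexes the current character rather than the list.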
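A small sketch of what that rebinding does, using a made-up input string. The name chars is hypothetical and only keeps a second reference to the list so it can still be inspected afterwards.

```python
chars = list('AxT')  # hypothetical sample input
nx = chars           # same name as in the question

for nx in nx:
    # nx is now a single character, not the list.
    # nx[0] only works because indexing a one-character
    # string at 0 returns that same character.
    print(repr(nx), repr(nx[0]))

print(nx)     # T: the name no longer refers to the list
print(chars)  # ['A', 'x', 'T']: the list itself is unchanged
```

Using a different name for the loop variable, for example for char in nx, avoids the problem.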

Add parts to the list only if they have not already been added
