lst = ['>CW500.8 \n', 'ATGCTATCATTA\n', '>CW500.9 \n', 'ATGCTATCATTA\n', '>CW500.10 \n', 'ATGCTATCATTT\n', '\n', '$$$\n', '\n', '>WT \n', 'GTGCTATCATTA '] #Fastq formatted file
orgs = []
seqlist1 = []
seqstring = ""
for line in lst:
    if line.startswith(">"):
        if seqstring != "":
            seqlist1.append(seqstring) #makes the sequence a list
            seqstring = ""
        orgs.append(line.rstrip("\n")) #makes >indv's keys
    else:
        seqstring += line.rstrip("\n") #adds the seq string to the list
seqlist1.append(seqstring) #must do this or your last line is lost
Output : ['ATGCTATCATTA', 'ATGCTATCATTA', 'ATGCTATCATTT$$$', 'GTGCTATCATTA ']
I need to change the output so that the reads after '$$$' are added to a new list. So I modified the code above:
orgs = []
seqlist1 = []
seqlist2 = []
seqstring = ""
for line in lst:
    if line.startswith(">"):
        if seqstring != "":
            seqlist1.append(seqstring) #makes the sequence a list
            seqstring = ""
        orgs.append(line.rstrip("\n")) #makes >indv's keys
    else:
        seqstring += line.rstrip("\n") #adds the seq string to the list
    elif line.startswith("$$$"):
        seqlist2.append(seqstring)
seqlist1.append(seqstring)#must do this or your last line is lost
seqlist2.append(seqstring)
print seqlist1
print seqlist2
Output: File "/tmp/execpad-49ffac3cc5b6/source-49ffac3cc5b6", line 15
elif line.startswith("$$$"):
^
SyntaxError: invalid syntax
Expected Output:
['ATGCTATCATTA', 'ATGCTATCATTA', 'ATGCTATCATTT']
['GTGCTATCATTA']
Can anyone explain where I went wrong, and how I can further modify the code to get the desired output?
Answer 0 (score: 0)
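Your elif comes after the else, so it has no corresponding if in front of it. In an if/elif/else chain, every elif has to appear before the else. Python's syntax does not allow it anywhere else.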
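One way to get the expected output is to move the '$$$' check ahead of the else and use it to switch which list receives finished sequences. The following is a minimal sketch under that assumption, not the asker's final code: the function name split_records, the marker parameter, and the variable names are made up for illustration, and it uses strip() instead of rstrip("\n") so the trailing space on the last read is dropped.

```python
def split_records(lines, marker="$$$"):
    """Return (headers, before, after): sequences before and after the marker line."""
    headers = []
    before, after = [], []
    current = before          # list that finished sequences are appended to
    seq = ""
    for line in lines:
        text = line.strip()   # drop newlines and stray spaces
        if text.startswith(">"):
            if seq:           # a header closes the previous sequence
                current.append(seq)
                seq = ""
            headers.append(text)
        elif text.startswith(marker):
            if seq:           # the marker also closes the previous sequence
                current.append(seq)
                seq = ""
            current = after   # everything from here on goes to the second list
        else:
            seq += text       # sequence line (blank lines add nothing)
    if seq:                   # do not lose the last sequence
        current.append(seq)
    return headers, before, after


lst = ['>CW500.8 \n', 'ATGCTATCATTA\n', '>CW500.9 \n', 'ATGCTATCATTA\n',
       '>CW500.10 \n', 'ATGCTATCATTT\n', '\n', '$$$\n', '\n', '>WT \n',
       'GTGCTATCATTA ']
headers, seqlist1, seqlist2 = split_records(lst)
print(seqlist1)  # ['ATGCTATCATTA', 'ATGCTATCATTA', 'ATGCTATCATTT']
print(seqlist2)  # ['GTGCTATCATTA']
```

Because the sequence is flushed both at a header and at the marker, the '$$$' text never ends up glued to the preceding read, which is what happened in the first version of the loop.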
Answer 1 (score: 0)
lst = ['>CW500.8 \n', 'ATGCTATCATTA\n', '>CW500.9 \n', 'ATGCTATCATTA\n', '>CW500.10 \n', 'ATGCTATCATTT\n', '\n', '$$$\n', '\n', '>WT \n', 'GTGCTATCATTA '] #Fastq formatted file
def word_lister(lines):
    new_list = []  # collected sequence lines
    for line in lines:
        stripped = line.strip()
        replaced = stripped.replace(" ", "")
        # keep only purely alphabetic lines (skips headers, blanks and '$$$')
        if replaced.isalpha():
            new_list.append(replaced)
    # this is to print it
    for new in new_list:
        print(new)
    # this will return a list
    return new_list
word_lister(lst)
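I wrapped it in a function. It takes your list and strips or removes the whitespace, then uses .isalpha() to check whether each line consists only of letters. I also append the output to a new list so it can be returned, or you can just use the printed output.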
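As written, this approach returns all four reads in one list and does not separate the ones after '$$$'. A possible adaptation is sketched below; the name split_on_marker and its marker parameter are hypothetical, and it assumes each sequence sits on a single line (a read wrapped over several lines would come out as several entries).

```python
def split_on_marker(lines, marker="$$$"):
    """Group purely alphabetic lines, starting a new group at each marker line."""
    groups = [[]]
    for line in lines:
        cleaned = line.strip().replace(" ", "")
        if cleaned == marker:
            groups.append([])           # start collecting into a new list
        elif cleaned.isalpha():
            groups[-1].append(cleaned)  # headers and blank lines fail isalpha()
    return groups


sample = ['>CW500.8 \n', 'ATGCTATCATTA\n', '>CW500.9 \n', 'ATGCTATCATTA\n',
          '>CW500.10 \n', 'ATGCTATCATTT\n', '\n', '$$$\n', '\n', '>WT \n',
          'GTGCTATCATTA ']
groups = split_on_marker(sample)
for group in groups:
    print(group)
```

With the sample list from the question this gives two groups, one for the reads before the marker and one for the read after it.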