I have data that looks like this:
Name Nm1    *    *
Ind1     AACTCAGCTCACG
Ind2     GTCATCGCTACGA
Ind3     CTTCAAACTGACT
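For each position marked by an asterisk in the "Name" row, I need to grab the letter at that position in every sequence and print it together with the index of the asterisk.
So the result would be: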
Ind1, 12, T
Ind2, 12, A
Ind3, 12, C
Ind1, 17, T
Ind2, 17, T
Ind3, 17, T
I am trying to use enumerate() to retrieve the positions of the asterisks; my idea is that I can then use those indices to grab the letters.
import sys
import csv
input = open(sys.argv[1], 'r')
Output = open(sys.argv[1]+"_processed", 'w')
indlist = (["Individual_1,", "Individual_2,", "Individual_3,"])
with (input) as searchfile:
    for line in searchfile:
        if '*' in line:
            LocusID = line[2:13]
            LocusIDstr = LocusID.strip()
            hit = line
            for i, x in enumerate(hit):
                if x=='*':
                    position = i
                    print position
            for item in indlist:
                Output.write("%s%s%s\n" % (item, LocusIDstr, position))
Output.close()
If enumerate() outputs, for example:
12
17
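how do I access each index individually?
Also, when I print position I get the numbers I want, but when I write to the file only the last position is written. Why is that?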
---------------- EDIT -----------------
Following the suggestions below, I have edited my code to make it simpler (for me) to understand.
import sys
import csv
input = open(sys.argv[1], 'r')
Output = open(sys.argv[1]+"_FGT_Data", 'w')
indlist = (["Individual_1,", "Individual_2,", "Individual_3,"])
with (input) as searchfile:
    for line in searchfile:
        if '*' in line:
            LocusID = line[2:13]
            LocusIDstr = LocusID.strip()
            print LocusIDstr
            hit = line
            for i, x in enumerate(hit):
                if x=='*':
                    position = i
                    #print position
input = open(sys.argv[1], 'r')
with (input) as searchfile:
    for line in searchfile:
        if line [0] == ">":
            print line[position], position
with (Output) as writefile:
    for item in indlist:
        writefile.write("%s%s%s\n" % (item, LocusIDstr, position))
Output.close()
But I still have no way to access each index individually.
Answer 0 (score: 2)
Edit: Changed to use the file you gave me in the comments. If you created this file yourself, consider using columns next time.
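import os
import sys

read_file = sys.argv[1]
# Build "<name>_processed<ext>"; splitext also copes with a missing extension.
base, ext = os.path.splitext(read_file)
write_file = "%s_processed%s" % (base, ext)

indexes = []         # offsets of the asterisks, relative to the sequence
lines_to_write = []

# The first line holds the asterisks. Subtracting 11 converts a line offset
# into a sequence offset (presumably the sequence starts 11 characters in).
with open(read_file, 'r') as getindex:
    first_line = getindex.readline()
    for i, x in enumerate(first_line):
        if x == '*':
            indexes.append(i - 11)

# Sequence lines start with ">" and have the form "<name> <sequence>".
with open(read_file, 'r') as getsnps:
    for line in getsnps:
        if line.startswith(">"):
            name, sequence = line.split(" ")[:2]
            for z in indexes:
                # Report the position 1-based, followed by the letter found there.
                lines_to_write.append("%s\t%s\t%s" % (name, z + 1, sequence[z]))

with open(write_file, "w") as out:
    out.write("\n".join(lines_to_write))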
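
As for why only the last position ended up in the file: in the question's code, position is a single variable that is overwritten each time an asterisk is found, and the write appears to happen only after the enumerate() loop has finished, so just the final value survives. Collecting the positions in a list keeps all of them and lets you access each one individually. Below is a minimal sketch of that idea applied to the sample data from the question, written for Python 3. The function names star_positions and letters_at are made up for illustration, and the column spacing of the sample lines is an assumption chosen so that the asterisks fall at offsets 12 and 17 as in the expected output.

```python
def star_positions(header):
    """Return the 0-based offset of every '*' in the header line."""
    return [i for i, ch in enumerate(header) if ch == '*']


def letters_at(header, rows):
    """Return (name, position, letter) for each starred position and each row."""
    results = []
    for pos in star_positions(header):
        for row in rows:
            name = row.split()[0]  # first whitespace-separated field
            results.append((name, pos, row[pos]))
    return results


# Sample data from the question; the exact spacing is assumed.
header = "Name Nm1    *    *"
rows = [
    "Ind1     AACTCAGCTCACG",
    "Ind2     GTCATCGCTACGA",
    "Ind3     CTTCAAACTGACT",
]

for name, pos, letter in letters_at(header, rows):
    print("%s, %d, %s" % (name, pos, letter))
```

Because the positions live in a list, positions[0], positions[1] and so on are each available, and looping over the list writes one line per asterisk instead of only the last one.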