'1' 'PS' 'at5g38660' 'Symbols: APE1 | APE1' T
'1.1' 'PS.lightreaction' '' ''
'1.1.1' 'PS.lightreaction.photosystem II' '' ''
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' 'at1g15820' 'Symbols: LHCB6, ' T
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' 'at1g29910' 'Symbols: CAB3' T
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' 'at1g29920' 'Symbols: CAB2' T
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' 'at1g29930' 'Symbols: CAB1' T
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' 'at1g76570' 'chlorophyll A-B' T
'1.1.1.2' 'PS.lightreaction.photosystem II.PSII' 'at1g03600' 'photosystem II' T
'1.1.1.2' 'PS.lightreaction.photosystem II.PSII' 'at1g05385' 'photosystem II 11 kDa' T
'1.1.1.2' 'PS.lightreaction.photosystem II.PSII' 'at1g06680' 'Symbols: PSBP-1SII-P' T
Second file:
at5g38660 2356766_1
at1g15820 3043768_9
at1g29930 2325825_1
at1g76570 2921847_3
at1g03600 2368346_5
at1g05385 2321872_2
Expected output:
'1' 'PS' '2356766_1' 'Symbols: APE1 | APE1' T
'1.1' 'PS.lightreaction' '' ''
'1.1.1' 'PS.lightreaction.photosystem II' '' ''
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' '3043768_9' 'Symbols: LHCB6, ' T
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' '' ''
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' '' ''
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' '2325825_1' 'Symbols: CAB1' T
'1.1.1.1' 'PS.lightreaction.photosystem II.LHC-II' '2921847_3' 'chlorophyll A-B' T
'1.1.1.2' 'PS.lightreaction.photosystem II.PSII' '2368346_5' 'photosystem II' T
'1.1.1.2' 'PS.lightreaction.photosystem II.PSII' '2321872_2' 'photosystem II 11 kDa' T
'1.1.1.2' 'PS.lightreaction.photosystem II.PSII' '' ''
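Here I want to replace 'at5g38660' in the first file with 2356766_1, and so on.
If there is a match, all columns of the first file stay the same except the third, which is replaced by the value from the second file.
If there is no match, I want to print the first and second columns of the first file and leave the third, fourth and fifth columns blank (as shown in the expected output).
Code I tried: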
import sys
ara_red_file = open(sys.argv[1]).readlines()
ara_map_file = open(sys.argv[2]).readlines()
for line in ara_red_file:
    split_line = line.split('\t')
    ara_id1 = split_line[0]
    redbean_id = split_line[1]
    for lines in ara_map_file:
        split_line = lines.split('\t')
        bincode = split_line[0]
        name = split_line[1]
        ara_id2 = split_line[2]
        description = split_line[3]
        code_type = split_line[4]
        if ara_id2 == ara_id1:
            print(bincode+'\t'+name+'\t'+"'"+redbean_id+"'"+'\t'+description+'\t'+code_type)
        elif ara_id2 != ara_id1:
            print(bincode+'\t'+name+'\t'+"''"+'\t'+"''")
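The problem I'm having is that I don't get the expected output: it doesn't seem to check the first condition and only prints the output of the final elif branch.
Answer 0 (score: 0)
I suggest you first build a dictionary from the second file and make sure it is stored correctly, then run the for loop you already have, taking care that the comparison is done correctly.
Creating the dictionary from the file: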
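import sys

# Map each id in column 1 of the second file to the replacement id in column 2.
dict = {}
ara_red_file = open(sys.argv[1]).readlines()
for line in ara_red_file:
    split_line = line.rstrip('\n').split('\t')  # drop the newline before splitting
    dict[split_line[0]] = split_line[1]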
So the dictionary ends up looking something like this:
# Commas added at the end of each element; values are strings, as read from the file
dict = {
    'at5g38660': '12345',
    'at5g386x2': '45678',
    'at5g386x3': '123',
    'at5g386x4': '5555',
}
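A slightly more defensive way to build the same dictionary is sketched below. It assumes the second file holds one id pair per line, separated by tabs or spaces; the function name load_mapping and the sample lines are only illustrative and are not part of the original code.

```python
def load_mapping(lines):
    """Build {old_id: new_id} from whitespace-separated lines."""
    mapping = {}
    for line in lines:
        fields = line.split()  # splits on tabs or spaces and drops the newline
        if len(fields) >= 2:   # skip blank or malformed lines
            mapping[fields[0]] = fields[1]
    return mapping


# Sample lines standing in for the contents of the second file.
sample = ["at5g38660\t2356766_1\n", "at1g15820 3043768_9\n", "\n"]
mapping = load_mapping(sample)
print(mapping["at5g38660"])
```

With a real file you would pass the open file object instead of the sample list, since iterating over a file yields its lines.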
So now you have to check whether the dict contains a key equal to the third column of the first file, like this:
ara_map_file = open(sys.argv[2]).readlines()
for lines in ara_map_file:
    split_line = lines.rstrip('\n').split('\t')  # drop the newline before splitting
    bincode = split_line[0]
    name = split_line[1]
    ara_id2 = split_line[2]
    description = split_line[3]
    code_type = split_line[4]
    if ara_id2 in dict:  # CHANGED HERE: look the id up in the dict
        print(bincode+'\t'+name+'\t'+"'"+dict[ara_id2]+"'"+'\t'+description+'\t'+code_type)
    else:
        print(bincode+'\t'+name+'\t'+"''"+'\t'+"''")
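For reference, the nested loops in the question print one line for every pair of (mapping line, row), so almost every printed line comes from the non-matching branch. Two further details may also prevent a match, though both are assumptions about the real files: if the ids in the first file are literally wrapped in single quotes, as the sample rows suggest, they will never equal the unquoted ids in the second file, and rows such as '1.1' appear to have only four columns. A minimal sketch that handles both cases could look like this (strip_quotes and remap_rows are hypothetical helper names, and the sample rows are copied from the question):

```python
def strip_quotes(field):
    """Remove one pair of surrounding single quotes, if present."""
    if len(field) >= 2 and field[0] == "'" and field[-1] == "'":
        return field[1:-1]
    return field


def remap_rows(rows, id_to_new):
    """Yield one output line per input row.

    On a match, column 3 is replaced and the remaining columns are kept.
    Otherwise only columns 1 and 2 are kept, followed by two empty fields.
    """
    for row in rows:
        cols = row.rstrip("\n").split("\t")
        if len(cols) < 3:
            continue  # not a data row
        new_id = id_to_new.get(strip_quotes(cols[2]))
        if new_id is not None:
            yield "\t".join(cols[:2] + ["'" + new_id + "'"] + cols[3:])
        else:
            yield "\t".join(cols[:2] + ["''", "''"])


# Sample rows standing in for the first file (tab-separated, quoted fields).
rows = [
    "'1'\t'PS'\t'at5g38660'\t'Symbols: APE1 | APE1'\tT\n",
    "'1.1'\t'PS.lightreaction'\t''\t''\n",
    "'1.1.1.1'\t'PS.lightreaction.photosystem II.LHC-II'\t'at1g29910'\t'Symbols: CAB3'\tT\n",
]
id_to_new = {"at5g38660": "2356766_1"}
result = list(remap_rows(rows, id_to_new))
for out_line in result:
    print(out_line)
```

Each row is looked up once in the dictionary, so every input row produces exactly one output line, in the original order.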