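How to loop through multiple lists and print an item only if it exists in all of them

Time: 2019-11-18 17:36:51

Tags: python

I am trying to loop through multiple lists and find out whether an item exists in the other lists. This is my code: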

with open("Busca1.txt", "r") as f, open("CELLO1.txt","r") as f1, open("PSORT.txt","r") as f2, open("results","w+") as of :
    file_in = f.readlines()
    file_in1 = f1.readlines()
    file_in2 = f2.readlines()
    file_in3 = f3.readlines()
    for line in file_in:
        temp = line.split()
        ID_busca = temp[1]
    for line in file_in1:
        temp2 = line.split()
        ID_cello = temp2[1]
    for line in file_in2:
        temp3 = line.split()
        ID_psort = temp3[1]

        all = [i for i in ID_busca if i in ID_cello + ID_psort]
        print all

This is what I get:

['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']
['Y', 'P', '_', '2', '0', '7', '6', '8', '4', '.', '1']


So each item is split into individual characters, and the same item seems to be printed several times.

Here are samples of the files:

Busca1.txt:

C:extracellular YP_207690.1
C:plasma YP_207698.1
C:extracellular YP_207699.1
C:extracellular YP_207700.1
C:extracellular YP_207701.1
C:extracellular YP_207704.1
C:extracellular YP_207706.1
C:extracellular YP_207716.1
C:extracellular YP_207717.1
C:extracellular YP_207719.1
C:plasma YP_207722.1
C:plasma YP_207728.1
C:plasma YP_207729.1
C:extracellular YP_207731.1

CELLO1.txt:

OuterMembrane YP_008914846.1 opacity
Periplasmic YP_008914847.1 hypothetical
OuterMembrane YP_008914851.1 opacity
OuterMembrane YP_008914852.1 opacity
OuterMembrane YP_008914853.1 opacity
OuterMembrane YP_008914854.1 opacity
OuterMembrane YP_008914855.1 opacity
OuterMembrane YP_008914857.1 opacity
OuterMembrane YP_008914859.1 opacity
OuterMembrane YP_008914860.1 opacity
Periplasmic YP_008994831.1 hypothetical
Periplasmic YP_009115479.1 DNA
Extracellular YP_009115480.1 bacterioferritin-associated
OuterMembrane YP_009115486.1 pilus
InnerMembrane YP_009115487.1 hypothetical
InnerMembrane YP_009115488.1 membrane
Periplasmic YP_009115490.1 pilin
Periplasmic YP_009179204.1 hypothetical
Periplasmic YP_207190.2 leucine--tRNA

PSORT.txt:

SeqID: YP_008914846.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914847.1 hypothetical protein NGO0146a [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914848.1 hypothetical protein NGO0250a [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914849.1 hypothetical protein NGO0590a [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914851.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914852.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914853.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914854.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914855.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914857.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914859.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008914860.1 opacity protein [Neisseria gonorrhoeae FA 1090]
SeqID: YP_008994831.1 hypothetical protein NGO1621a [Neisseria gonorrhoeae FA 1090]
SeqID: YP_009115480.1 bacterioferritin-associated ferredoxin [Neisseria gonorrhoeae FA 1090

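Can anyone help me get code that does what I need? I want to print a YP_********* code if it is present in all of the lists.

Thanks

1 Answer:

Answer 0: (score: 0)

You need to fix your logic. I'm not sure how you tried to get this to work. First, you refer to a file, f3, that does not exist; you do have an alias, of, that you may have meant to use. Second, the output you give does not come from the code you posted. Please make sure your problem statement is accurate.

As for the intended operation, look at the form of your loop: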

for line in file_in:
    temp = line.split()
    ID_busca = temp[1]

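You go through each line of the file, split it, extract the ID, and then overwrite the previous ID with the latest one. When you leave this loop, ID_busca is a plain string holding only the *last* ID. By the time you reach the end of the second loop, you have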

ID_busca = "YP_207731.1"
ID_cello = "YP_207190.2"
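The difference between overwriting and accumulating can be seen in a small sketch. It uses Python 3 syntax and three lines copied from the Busca1.txt sample in the question; the names last_id and busca_ids are only illustrative.

```python
# Three lines copied from the Busca1.txt sample in the question.
lines = [
    "C:extracellular YP_207690.1",
    "C:plasma YP_207698.1",
    "C:extracellular YP_207731.1",
]

# Overwriting: each pass replaces the previous value,
# so only the ID from the final line survives.
for line in lines:
    last_id = line.split()[1]

# Accumulating: each pass appends to a list,
# so every ID is kept.
busca_ids = []
for line in lines:
    busca_ids.append(line.split()[1])

print(last_id)    # a single string
print(busca_ids)  # a list with one entry per line
```

The first loop mirrors the structure of the posted code; the second is what a later membership test would need.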

Now you iterate through PSORT, extracting each ID in turn. Let's look at the first one:

ID_psort = "YP_008914846.1"

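Now your list comprehension steps through each character of ID_busca to see whether that character appears in the concatenation of the other two strings.

all = [i for i in ID_busca if i in "YP_207190.2YP_008914846.1"]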
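To see the character-by-character behaviour in isolation, the three string values above can be plugged in directly. This is a sketch in Python 3 syntax; the result is named matches rather than all, since all is also the name of a built-in function.

```python
ID_busca = "YP_207731.1"
ID_cello = "YP_207190.2"
ID_psort = "YP_008914846.1"

# Iterating over a string yields one character at a time, and
# `ch in some_string` tests whether that character occurs anywhere in it.
matches = [ch for ch in ID_busca if ch in ID_cello + ID_psort]

print(matches)
# ['Y', 'P', '_', '2', '0', '7', '7', '1', '.', '1']
```

Only the character 3 is dropped, because it occurs in neither of the other two IDs. The test never compares one whole ID against another.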

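Learn to write only a little code at a time. Don't try to search for IDs until you know you have a list of IDs to search. Use print statements. Don't write any more code until everything you have written so far is tested and works as expected. The code you posted is wrestling with at least four errors at once.

If you want a list of IDs, *make* a list of IDs: look up online tutorials on lists and their append and extend methods. Also look up sets; I suspect the easiest approach is to build three sets of IDs and simply take their intersection.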
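A minimal sketch of the set-based approach, in Python 3 syntax. The helper name extract_ids is hypothetical, and it assumes the ID is always the second whitespace-separated field, as in the three sample files. The sample data is in memory so the sketch runs on its own; note that the first Busca line is made up so that one ID is common to all three sources, since the excerpts in the question share none.

```python
def extract_ids(lines, column=1):
    """Collect the field at index `column` from each whitespace-split line."""
    ids = set()
    for line in lines:
        fields = line.split()
        if len(fields) > column:  # ignore blank or short lines
            ids.add(fields[column])
    return ids


# In-memory samples. The first Busca line is hypothetical (not from the
# question's file); the CELLO and PSORT lines follow the posted samples.
busca = ["C:extracellular YP_008914846.1", "C:plasma YP_207698.1"]
cello = ["OuterMembrane YP_008914846.1 opacity",
         "Periplasmic YP_008914847.1 hypothetical"]
psort = ["SeqID: YP_008914846.1 opacity protein",
         "SeqID: YP_008914847.1 hypothetical protein"]

# An ID is kept only if it occurs in all three sets.
common = extract_ids(busca) & extract_ids(cello) & extract_ids(psort)

for seq_id in sorted(common):
    print(seq_id)
```

Because an open file object is itself an iterable of lines, the same helper could be given a file opened inside a with block in place of each in-memory list.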