Adding a list in a Python elif statement

Time: 2018-08-17 13:54:12

Tags: python-2.7

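I have two data files (datafile1 and datafile2). I want to add some information from datafile2 to datafile1, but only if certain requirements are met, and then write all of the information to a new file.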

Here is an example of datafile1 (I changed the tabs to make it easier to view):

#OTU    S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14 S15 S16 S17 S18 Seq
OTU49   0   0   0   0   0   16  0   0   0   0   0   0   1   0   0   0   0   0   catat
OTU171  5   2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   gattt
OTU803  0   0   0   0   0   0   0   0   0   0   0   6   0   0   0   0   0   0   aactt
OTU2519 0   0   0   0   0   0   0   6   0   0   0   0   0   0   0   0   0   0   aattt

Here is an example of datafile2:

#GInumber   OTU     Accssn      Ident   Len M   Gap Qs  Qe  Ss  Se  evalue      bit phylum      class       order           family          genus       species
1366104624  OTU49   MG926900    82.911  158 23  4   2   157 18  173 2.17e-29    139 Arthropoda  Insecta     Hymenoptera     Braconidae      Leiophron   NA
342734543   OTU171  JN305047    95.513  156 7   0   2   157 23  178 9.63e-63    250 Arthropoda  Insecta     Lepidoptera     Limacodidae     Euphobetron Euphobetron cupreitincta
290756623   OTU803  GU580785    96.753  154 5   0   4   157 10  163 5.75e-65    257 Arthropoda  Insecta     Lepidoptera     Geometridae     Apocheima   Apocheima pilosaria
296792336   OTU2519 GU688553    98.039  153 3   0   1   153 18  170 9.56e-68    267 Arthropoda  Insecta     Lepidoptera     Geometridae     Operophtera     Operophtera brumata

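For each row of datafile1, I want to find the row in datafile2 with the same OTU and always add GInumber, Accssn, Ident, Len, M, Gap, Qs, Qe, Ss, Se, evalue, bit, phylum, and class from datafile2. Depending on the range Ident falls in, I also want to add order, family, genus, and species according to the following conditions: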

Case #1: Ident > 98.0, add order, family, genus, and species
Case #2: Ident between 96.5 and 98.0, add order, family, "NA", "NA"
Case #3: Ident between 95.0 and 96.5, add order, "NA", "NA", "NA"
Case #4: Ident < 95.0, add "NA", "NA", "NA", "NA"
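
The four cases can be sketched as a single function. This is only an illustration: the function name and constant names are hypothetical, and because the cases do not say which side each boundary value belongs to, the sketch assumes inclusive lower bounds (`>=`), which is what the script further down uses.

```python
# Thresholds taken from the cases above; the constant names are hypothetical.
ORDER_LEVEL = 95.0
FAMILY_LEVEL = 96.5
SPECIES_LEVEL = 98.0


def taxonomy_for_identity(ident, order, family, genus, species):
    """Return [order, family, genus, species], masking unsupported ranks as "NA".

    Assumes inclusive lower bounds, since the cases leave the boundaries open.
    """
    if ident >= SPECIES_LEVEL:
        return [order, family, genus, species]
    if ident >= FAMILY_LEVEL:
        return [order, family, "NA", "NA"]
    if ident >= ORDER_LEVEL:
        return [order, "NA", "NA", "NA"]
    return ["NA"] * 4
```

Because each branch returns early, every later test only needs a lower bound; the upper bound is implied by the branches already ruled out.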

The desired output is:

#OTU    S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14 S15 S16 S17 S18 Seq     GInumber    Accssn      Ident   Len M   Gap Qs  Qe  Ss  Se  evalue      bit phylum      class       order           family          genus       species
OTU49   0   0   0   0   0   16  0   0   0   0   0   0   1   0   0   0   0   0   catat   1366104624  MG926900    82.911  158 23  4   2   157 18  173 2.17e-29    139 Arthropoda  Insecta     NA  NA  NA  NA
OTU171  5   2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   gattt   342734543   JN305047    95.513  156 7   0   2   157 23  178 9.63e-63    250 Arthropoda  Insecta     Lepidoptera     NA  NA  NA
OTU803  0   0   0   0   0   0   0   0   0   0   0   6   0   0   0   0   0   0   aactt   290756623   GU580785    96.753  154 5   0   4   157 10  163 5.75e-65    257 Arthropoda  Insecta     Lepidoptera     Geometridae     NA  NA
OTU2519 0   0   0   0   0   0   0   6   0   0   0   0   0   0   0   0   0   0   aattt   296792336   GU688553    98.039  153 3   0   1   153 18  170 9.56e-68    267 Arthropoda  Insecta     Lepidoptera     Geometridae     Operophtera     Operophtera brumata 

I wrote this script:

import csv

#Files
besthit_taxonomy_unique_file = "datafile2.txt"
OTUtablefile = "datafile1.txt"
outputfile = "outputfile.txt"

#Settings
OrderLevel = float(95.0)
FamilyLevel = float(96.5)
SpeciesLevel = float(98.0)


#Importing the OTU table, which is tab delimited
OTUtable = list(csv.reader(open(OTUtablefile, 'rU'), delimiter='\t'))
headerOTUs = OTUtable.pop(0)

#Importing the best hit taxonomy table, which is tab delimited
taxonomytable = list(csv.reader(open(besthit_taxonomy_unique_file, 'rU'), delimiter='\t')) 
headertax = taxonomytable.pop(0)
headertax.pop(1)

#Getting the header info
totalheader  = headerOTUs + headertax


#Merging and assigning the taxonomy at the appropriate level
outputtable = []
NAs = 4 * ["NA"]  #This is a list of NAs so that I can add the appropriate number, depending on the Identity
for item in OTUtable:
    OTU = item #Just to prevent issues with the list of lists
    OTUIDtable = OTU[0]
    print OTUIDtable
    for thing in taxonomytable:
        row = thing #Just to prevent issues with the list of lists
        OTUIDtax = row[1]
        if OTUIDtable == OTUIDtax:
            OTU.append(row[0])
            OTU += row[2:15]
            PercentID = float(row[3])
            if PercentID >= SpeciesLevel:   
                OTU += row[15:]
            elif FamilyLevel <= PercentID < SpeciesLevel:
                OTU += row[15:17]
                OTU += NAs[:2]
            elif OrderLevel <= PercentID < FamilyLevel:
                print row[15]
                OTU += row[15]
                OTU += NAs[:3]
            else:
                OTU += NAs  
    outputtable.append(OTU)             

#Writing the output file                    
f1 = open(outputfile, 'w')
for item in totalheader[0:-1]:
    f1.write(str(item) + '\t')
f1.write(str(totalheader[-1]) + '\n')   
for row in outputtable:
    currentrow = row
    for item in currentrow[0:-1]:
        f1.write(str(item) + '\t')
    f1.write(str(currentrow[-1]) + '\n')    

For the most part the output is correct, except for Case #3 (Ident between 95.0 and 96.5), where the script writes the order entry with a tab between every letter.

Here is an example of the output:

#OTU    S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 S11 S12 S13 S14 S15 S16 S17 S18 Seq     GInumber    Accssn      Ident   Len M   Gap Qs  Qe  Ss  Se  evalue      bit phylum      class       order           family          genus       species
OTU49   0   0   0   0   0   16  0   0   0   0   0   0   1   0   0   0   0   0   catat   1366104624  MG926900    82.911  158 23  4   2   157 18  173 2.17e-29    139 Arthropoda  Insecta     NA  NA  NA  NA
OTU171  5   2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   gattt   342734543   JN305047    95.513  156 7   0   2   157 23  178 9.63e-63    250 Arthropoda  Insecta     L   e   p   i   d   o   p   t   e   r   a       NA  NA  NA
OTU803  0   0   0   0   0   0   0   0   0   0   0   6   0   0   0   0   0   0   aactt   290756623   GU580785    96.753  154 5   0   4   157 10  163 5.75e-65    257 Arthropoda  Insecta     Lepidoptera     Geometridae     NA  NA
OTU2519 0   0   0   0   0   0   0   6   0   0   0   0   0   0   0   0   0   0   aattt   296792336   GU688553    98.039  153 3   0   1   153 18  170 9.56e-68    267 Arthropoda  Insecta     Lepidoptera     Geometridae     Operophtera     Operophtera brumata 

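I just can't figure out what is going wrong. In the other cases, order seems to contain the right information, but in this one case the order value seems to be stored as a list of lists. However, the output printed to the screen is: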

OTU171
Lepidoptera

which does not seem to indicate a list of lists...
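
One likely reading of this symptom, shown in isolation from the script: the Case #3 branch does `OTU += row[15]`, whereas the other branches add slices such as `row[15:17]`. `row[15]` is a single string, and `+=` on a list extends it with every element of the right-hand iterable, which for a string means one character at a time. That would also fit the screen output: `print row[15]` shows a normal `Lepidoptera` because the value is a plain string, not a list of lists. The variable names below are made up for the demonstration.

```python
# A stand-in for the taxonomy columns of one row (hypothetical, shortened).
row = ["Lepidoptera", "Limacodidae"]

as_string = ["OTU171"]
as_string += row[0]       # row[0] is a str, so each character becomes its own item

as_slice = ["OTU171"]
as_slice += row[0:1]      # a one-element list, so the whole name is added once

as_append = ["OTU171"]
as_append.append(row[0])  # append adds the string as a single item
```

Under this reading, replacing `OTU += row[15]` with `OTU.append(row[15])` or `OTU += row[15:16]` should keep the order name in a single column.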

I would appreciate any insight. If anyone has ideas for making my code more pythonic, I would also be grateful.
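
As one possible direction for a more pythonic version, here is a minimal sketch, not a drop-in replacement. It assumes each OTU appears at most once in datafile2 (as the variable name `besthit_taxonomy_unique_file` suggests), that both tables have already been read into lists of rows with their headers removed, and that the column layout matches the datafile2 example. The function name `merge_tables` is hypothetical.

```python
ORDER_LEVEL, FAMILY_LEVEL, SPECIES_LEVEL = 95.0, 96.5, 98.0


def merge_tables(otu_rows, tax_rows):
    """Join OTU rows with taxonomy rows on the OTU id (headers already removed)."""
    # Index the taxonomy table once by OTU id (column 1) instead of
    # rescanning the whole table for every OTU row.
    tax_by_otu = {row[1]: row for row in tax_rows}
    merged = []
    for otu in otu_rows:
        tax = tax_by_otu.get(otu[0])
        if tax is None:
            # No matching hit: keep the OTU row unchanged, as the nested loop does.
            merged.append(list(otu))
            continue
        ident = float(tax[3])
        # Number of ranks (order, family, genus, species) to keep.
        if ident >= SPECIES_LEVEL:
            keep = 4
        elif ident >= FAMILY_LEVEL:
            keep = 2
        elif ident >= ORDER_LEVEL:
            keep = 1
        else:
            keep = 0
        # Slices always yield lists, so a single rank is never split into letters.
        ranks = tax[15:15 + keep] + ["NA"] * (4 - keep)
        merged.append(list(otu) + [tax[0]] + tax[2:15] + ranks)
    return merged
```

The merged rows could then be written with `csv.writer(f, delimiter='\t')` using `writerow` for the header and `writerows` for the data, which removes the manual tab and newline handling. Note that `csv.writer` ends lines with `\r\n` unless `lineterminator='\n'` is passed.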

Andreanna

0 answers:

No answers