Printing on the same line in Python

Time: 2014-10-05 15:38:02

Tags: python python-2.5

I am very new to Python and I need your help.

I have a file like this:

>chr14_Gap_2
ACCGCGATGAAAGAGTCGGTGGTGGGCTCGTTCCGACGCGCATCCCCTGGAAGTCCTGCTCAATCAGGTGCCGGATGAAGGTGGT
GCTCCTCCAGGGGGCAGCAGCTTCTGCGCGTACAGCTGCCACAGCCCCTAGGACACCGTCTGGAAGAGCTCCGGCTCCTTCTTG
acacccaggactgatctcctttaggatggactggctggatcttcttgcagtccaaggggctctcaagagt
………..
>chr14_Gap_3
ACCGCGATGAAAGAGTCGGTGGTGGGCTCGTTCCGACGCGCATCCCCTGGAAGTCCTGCTCAATCAGGTGCCGGATGAAGGTGGT
GCTCCTCCAGGGGGCAGCAGCTTCTGCGCGTACAGCTGCCACAGCCCCTAGGACACCGTCTGGAAGAGCTCCGGCTCCTTCTTG
acacccaggactgatctcctttaggatggactggctggatcttcttgcagtccaaggggctctcaagagt
………..

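One string is the label and the next string is the DNA sequence. I want to count the number of N letters and the number of lowercase letters, and express them as a percentage. I wrote the following script, but I have a problem with the printing.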

#!/usr/bin/python

import sys


if len(sys.argv) != 2:
    print "Usage: If you want to run this python script you have to put the fasta file that includes the desert area's sequences as argument"
    sys.exit(1)

# This script reads the sequences of the desert areas (fasta files) and calculates the percentage of the Ns and the repeats.

fasta_file = sys.argv[1]
f = open(fasta_file, 'r')

content = f.readlines()
x = len(content)
#print x
for i in range(0, len(content)):
    if (i % 2 == 0):
        content[i].strip()
        name = content[i].split(">")[1]
        print name,  # the trailing "," stops print from adding a newline
    else:
        content[i].strip()
        numberOfN = content[i].count('N')
        #print numberOfN
        allChar = len(content[i])
        lowerChars = sum(1 for c in content[i] if c.islower())
        Ns_persentage = 100 * (numberOfN/float(allChar))
        lower_persentage = 100 * (lowerChars/float(allChar))
        waste = Ns_persentage + lower_persentage
        print ("The waste persentage is: %s" % (round(waste)))
        #print ("The persentage of Ns is: %s and the persentage of repeats is: %s" % (Ns_persentage,lower_persentage))

#print (name + waste)

The problem is that it prints the label on the first line and the waste variable on the second line, like this:

chr10_Gap_18759
The waste persentage is: 52.0

How can I print them on the same line, separated by a tab?

For example:

chr10_Gap_18759      52.0 
chr10_Gap_19000      78.0 
…….

Thank you very much.

3 answers:

Answer 0 (score: 1)

You can print it like this:

print name, "\t", round(waste)
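One caveat, offered as a sketch rather than as part of the answer above: the output only stays on one line if name no longer carries the newline that was read from the file. str.strip() returns a new string and leaves the original untouched, so its result has to be assigned. The header text and the waste value below are illustrative.

```python
# Illustrative header line, as readlines() would return it (newline included).
raw_header = ">chr10_Gap_18759\n"

raw_header.strip()                    # result discarded: raw_header is unchanged
untouched = raw_header.split(">")[1]  # still ends with "\n"

header = raw_header.strip()           # keep the stripped copy instead
name = header.split(">")[1]

waste = 52.0                          # illustrative percentage
row = "%s\t%s" % (name, waste)        # label and value joined by a single tab
print(row)
```

Building the row as one string also avoids the extra spaces that a comma-separated print statement inserts around the tab.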

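If you are using Python 2.x, I would make some changes to your code. Python's argparse module (part of the standard library from Python 2.7 onwards) manages the arguments passed on the command line. I would do something like this: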

#!/usr/bin/python

import argparse

# Read the fasta file name from the command line
parser = argparse.ArgumentParser()
parser.add_argument("fasta_file", help="The fasta file to be processed", type=str)
args = parser.parse_args()

f = open(args.fasta_file, "r")
content = f.readlines()
f.close()

x = len(content)
for i in range(x):
    line = content[i].strip()
    if (i % 2 == 0):
        # At the first header there is no previous result yet, so name is
        # undefined and the print is skipped; afterwards it prints as you wish
        try:
            print name, "\t", round(waste)
        except NameError:
            pass
        name = line.split(">")[1]
    else:
        numberOfN = line.count('N')
        allChar = len(line)
        lowerChars = sum(1 for c in line if c.islower())
        Ns_persentage = 100 * (numberOfN/float(allChar))
        lower_persentage = 100 * (lowerChars/float(allChar))
        waste = Ns_persentage + lower_persentage
# To print the last case you need to do it outside the loop
print name, "\t", round(waste)

You can also print it as in the other answer, with print("{}\t{}".format(name, round(waste))).

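I am not sure about relying on i%2. Note that if a sequence is wrapped over several lines, header and sequence lines no longer alternate, so the parity test stops matching the headers and you will not get the name of the next sequence correctly. I would check whether the line starts with ">", store the name in that case, and sum the characters of the lines that follow.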
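A minimal sketch of that idea, assuming records may span several lines. The helper name waste_per_record and the sample data are hypothetical, not taken from the question; the code is written to run under both Python 2 and Python 3.

```python
def waste_per_record(lines):
    """Return [(name, waste_percent)] for FASTA-style lines.

    Hypothetical helper: a line starting with '>' opens a new record and
    every following sequence line is added to that record's counts.
    """
    results = []
    name = None
    wasted = 0   # count of 'N' plus lowercase letters
    total = 0    # count of all sequence characters
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None and total > 0:
                results.append((name, 100.0 * wasted / total))
            name = line[1:]
            wasted = 0
            total = 0
        else:
            wasted += line.count("N") + sum(1 for c in line if c.islower())
            total += len(line)
    # The last record has no following header, so close it here.
    if name is not None and total > 0:
        results.append((name, 100.0 * wasted / total))
    return results


# Illustrative input: the first record is wrapped over two lines.
sample = [">gap_a\n", "ACGT\n", "NNac\n", ">gap_b\n", "ACGTACGT\n"]
for name, waste in waste_per_record(sample):
    print("%s\t%.1f" % (name, waste))
```

Accumulating counts per record instead of per line means the percentage is computed once for the whole sequence, however it is wrapped.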

Answer 1 (score: 0)

Don't print name when (i%2 == 0); just save it in a variable and print it together with the percentage on the next iteration:

 print("{0}\t{1}".format(name, round(waste)))
  

This method of string formatting (new in version 2.6) is the new standard in Python 3, and should be preferred in new code to the % formatting described in String Formatting Operations.
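For comparison, a small sketch of both styles producing the same tab-separated row, plus a padded variant for aligned columns. The label and value are illustrative, and the column widths (20 and 6) are arbitrary choices for the example.

```python
name = "chr10_Gap_18759"   # illustrative label
waste = 52.0               # illustrative percentage

# Older % operator and str.format() give the same text here.
old_style = "%s\t%.1f" % (name, waste)
new_style = "{0}\t{1:.1f}".format(name, waste)

# Format specs can also pad to fixed widths instead of using a tab:
# label left-aligned in 20 columns, value right-aligned in 6.
aligned = "{0:<20}{1:>6.1f}".format(name, waste)

print(old_style)
print(new_style)
print(aligned)
```

The explicit field numbers {0} and {1} keep the example usable on Python 2.6, where the automatic {} numbering is not available.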

Answer 2 (score: 0)

I fixed the indentation and the redundancy:

#!/usr/bin/python
"""
This script reads the sequences of the desert areas (fasta files) and calculates the percentage of the Ns and the repeats.
2014-10-05 v1.0 by Vasilis
2014-10-05 v1.1 by Llopis
2015-02-27 v1.2 by Cees Timmerman
"""

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("fasta_file", help="The fasta file to be processed.", type=str)
args = parser.parse_args()

with open(args.fasta_file, "r") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing on line[0]
        if line.startswith('>'):
            name = line.split(">")[1]
            print name,
        else:
            numberOfN = line.count('N')
            allChar = len(line)
            lowerChars = sum(1 for c in line if c.islower())
            Ns_percentage = 100 * (numberOfN/float(allChar))
            lower_percentage = 100 * (lowerChars/float(allChar))
            waste = Ns_percentage + lower_percentage

            print "\t", round(waste)  # Note: https://docs.python.org/2/library/functions.html#round

Fed with:

>chr14_Gap_2
ACCGCGATGAAAGAGTCGGTGGTGGGCTCGTTCCGACGCGCATCCCCTGGAAGTCCTGCTCAATCAGGTGCCGGATGAAGGTGGTGCTCCTCCAGGGGGCAGCAGCTTCTGCGCGTACAGCTGCCACAGCCCCTAGGACACCGTCTGGAAGAGCTCCGGCTCCTTCTTGacacccaggactgatctcctttaggatggactggctggatcttcttgcagtccaaggggctctcaagagt
>chr14_Gap_3
ACCGCGATGAAAGAGTCGGTGGTGGGCTCGTTCCGACGCGCATCCCCTGGAAGTCCTGCTCAATCAGGTGCCGGATGAAGGTGGTGCTCCTCCAGGGGGCAGCAGCTTCTGCGCGTACAGCTGCCACAGCCCCTAGGACACCGTCTGGAAGAGCTCCGGCTCCTTCTTGacacccaggactgatctcctttaggatggactggctggatcttcttgcagtccaaggggctctcaagagt

Gives:

C:\Python27\python.exe -u "dna.py" fasta.txt
Process started >>>
chr14_Gap_2     29.0
chr14_Gap_3     29.0
<<< Process finished. (Exit code 0)

Using my favourite Python IDE: Notepad++ with the NppExec plugin.
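A note on the 29.0 shown in the output, as a hedged aside: under Python 2, round() called with one argument returns a float, which is why the value prints as 29.0; under Python 3 the same call returns an int and would print as 29. Formatting the rounded value with a fixed number of decimals gives the same text on both. The counts below are illustrative, chosen to roughly match the sample record.

```python
# Illustrative counts for one record (assumed, not read from a file).
lower_count = 70
n_count = 0
total = 239

waste = 100.0 * (n_count + lower_count) / total

# "%.1f" renders an int or a float the same way, so the text does not
# depend on which type round() returns.
text = "%.1f" % round(waste)
print("chr14_Gap_2\t%s" % text)
```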