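I'm trying to scan through a file. Whenever a line starts with "SeqID", I want to look at the line 21 lines below it; if that line starts with anything other than "Cytoplasmic", I want to write both the SeqID line and that line to an output file.
So far, I have this: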
import sys
import argparse
import operator
import re
import itertools

def main(argv):
    parser = argparse.ArgumentParser(description='find a location')
    parser.add_argument('infile', help='file to process')
    parser.add_argument('outfile', help='file to produce')
    args = parser.parse_args()
    tag = "SeqID:"
    tag2 = "Cytoplasmic"
    with open(args.infile, "r") as f, open(args.outfile, "w+") as of:
        file_in = f.readlines()
        for line in file_in:
            if line.startswith(tag) and line[21:] != "Cytoplasmic":
                of.write(line)

if __name__ == "__main__":
    main(sys.argv)
Here is an example of the input file:
SeqID: YP_008914846.1 opacity protein [Neisseria gonorrhoeae FA 1090]
Analysis Report:
CMSVM- Unknown [No details]
CytoSVM- Unknown [No details]
ECSVM- Unknown [No details]
ModHMM- Unknown [No internal helices found]
Motif- Unknown [No motifs found]
OMPMotif- Unknown [No motifs found]
OMSVM- OuterMembrane [No details]
PPSVM- Unknown [No details]
Profile- Unknown [No matches to profiles found]
SCL-BLAST- OuterMembrane [matched 60392864: Opacity protein opA54 precursor]
SCL-BLASTe- Unknown [No matches against database]
Signal- Unknown [No signal peptide detected]
Localisation Scores:
OuterMembrane 10.00
Extracellular 0.00
Periplasmic 0.00
Cytoplasmic 0.00
CytoplasmicMembrane 0.00
Final Prediction:
OuterMembrane 10.00
-------------------------------------------------------------------------------
SeqID: YP_008914847.1 hypothetical protein NGO0146a [Neisseria gonorrhoeae FA 1090]
Analysis Report:
CMSVM- Unknown [No details]
CytoSVM- Unknown [No details]
ECSVM- Unknown [No details]
ModHMM- Unknown [No internal helices found]
Motif- Unknown [No motifs found]
OMPMotif- Unknown [No motifs found]
OMSVM- Unknown [No details]
PPSVM- Unknown [No details]
Profile- Unknown [No matches to profiles found]
SCL-BLAST- Unknown [No matches against database]
SCL-BLASTe- Unknown [No matches against database]
Signal- Unknown [No signal peptide detected]
Localization Scores:
CytoplasmicMembrane 2.00
Cytoplasmic 2.00
OuterMembrane 2.00
Periplasmic 2.00
Extracellular 2.00
Final Prediction:
Unknown
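Answer 0 (score: 1)
My Python is a bit rusty, so please bear with me. I hope I have inferred the desired output correctly; if not, please leave a comment.
This assumes that the samples in your sequencing experiment are always separated by an offset of 3 lines of arbitrary content, and that each sample has 22 lines.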
Init()
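The fixed-layout assumption described above can be sketched roughly as follows. This is only an illustration, not the answer's original code: the function name `filter_fixed_records` and the parameters `record_len`, `gap`, and `skip_prefix` are hypothetical, and the defaults (22-line records, 3 separator lines) are taken from the assumption stated in this answer. The sample in the question shows a single dashed separator line, so `gap` may need adjusting.

```python
from typing import Iterator, List, Tuple


def filter_fixed_records(
    lines: List[str],
    record_len: int = 22,   # assumed number of lines per record
    gap: int = 3,           # assumed number of separator lines between records
    skip_prefix: str = "Cytoplasmic",
) -> Iterator[Tuple[str, str]]:
    """Yield (header, final_line) for records whose last line does not
    start with skip_prefix, assuming every record has the same length."""
    stride = record_len + gap
    for start in range(0, len(lines), stride):
        record = lines[start:start + record_len]
        if len(record) < record_len:
            break  # incomplete trailing record, nothing more to read
        header, final = record[0], record[-1]
        if not final.strip().startswith(skip_prefix):
            yield header, final
```

With `record_len=22`, the last line of each record is exactly 21 lines below its SeqID header, which matches the offset the question asks about.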
Answer 1 (score: 1)
You could try something like the following:
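# Replaces the `with` block inside main() from the question;
# `args` and `tag` are the names defined there.
with open(args.infile, "r") as f, open(args.outfile, "w") as of:
    file_in = f.readlines()
    for i, line in enumerate(file_in):
        # Keep a record only if the line 21 below its SeqID header exists
        # and does not start with "Cytoplasmic".
        if (line.startswith(tag)
                and i + 21 < len(file_in)
                and not file_in[i + 21].strip().startswith("Cytoplasmic")):
            of.write(line)
            of.write(file_in[i + 21])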
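If the 21-line offset ever varies between records, a variant that looks for the "Final Prediction:" label shown in the sample input, rather than counting lines, may be more robust. The sketch below is a different technique from the one above and rests on two assumptions: that every record contains that label, and that the prediction is the first non-blank line after it. The names `final_predictions` and `non_cytoplasmic` are made up for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple


def final_predictions(
    lines: Iterable[str],
    tag: str = "SeqID:",
    label: str = "Final Prediction:",
) -> Iterator[Tuple[str, str]]:
    """Yield (header, prediction) pairs, where prediction is the first
    non-blank line following the label within each record."""
    header: Optional[str] = None
    expect_prediction = False
    for raw in lines:
        line = raw.strip()
        if line.startswith(tag):
            header = line              # a new record begins
            expect_prediction = False
        elif line.startswith(label):
            expect_prediction = True   # the next non-blank line is the prediction
        elif expect_prediction and line:
            if header is not None:
                yield header, line
            expect_prediction = False


def non_cytoplasmic(
    lines: Iterable[str], skip_prefix: str = "Cytoplasmic"
) -> Iterator[Tuple[str, str]]:
    """Keep only records whose prediction does not start with skip_prefix."""
    for header, prediction in final_predictions(lines):
        if not prediction.startswith(skip_prefix):
            yield header, prediction
```

Note that a prefix test on "Cytoplasmic" also excludes "CytoplasmicMembrane" predictions, in both this sketch and the snippet above. If those should be kept, compare the first whitespace-separated token for equality instead.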