I am trying to use Python to extract some information from a file. Part of the file looks like this:
1. [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array
(Submitter supplied) Affymetrix submissions are replicated on the GeneChip Human Genome U133 Plus 2.0 Array. more...
Organism: Homo sapiens
527 DataSets 4123 Series 54 Related Platforms 115874 Samples
FTP download: GEO ftp://ftp.ncbi.nlm.nih.gov/geo/platforms/GPLnnn/GPL570/
Platform Accession: GPL570 ID: 100000570
2. [Mouse430_2] Affymetrix Mouse Genome 430 2.0 Array
(Submitter supplied) Affymetrix submissions are typically Array. more...
Organism: Mus musculus
517 DataSets 3529 Series 36 Related Platforms 46528 Samples
FTP download: GEO ftp://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL1nnn/GPL1261/
Platform Accession: GPL1261 ID: 100001261
import re
import sys
import itertools

stdout = open("results.txt", "w")
pattern = re.compile(r'^\d+[.]\s')
pattern2 = re.compile(r'Organism:')
pattern3 = re.compile(r'FTP download:')
pattern4 = re.compile(r'ID: ')
listOrg = []

def group_separator(line):
    return line=='ID: '

with open('Microarray/PlatformsMicroarray.txt') as f:
    for key,group in itertools.groupby(f,group_separator):
        # print(key,list(group)) # uncomment to see what itertools.groupby does.
        if not key:
            data={}
            for item in group:
                for line in f:
                    if pattern.search(line):
                        listOrg.append(line)
                    if pattern2.search(line):
                        #field,value=line.split(':')
                        listOrg.append(line)
                    if pattern3.search(line):
                        listOrg.append(line)
                    if pattern4.search(line):
                        listOrg.append(line)

for item in listOrg:
    stdout.write("%s" % item)
stdout.close()
How can I join this information together so that I can write the file out as a .csv?
Answer 0 (score: 2)
The csv module is your weapon of choice.
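import csv

# Python 3: open in text mode with newline="" (Python 2 used "wb" instead).
with open("path/to/out.csv", "w", newline="") as out:
    writer = csv.writer(out)
    for line in whatever_your_input_is:
        writer.writerow(line)  # each line must be a sequence of fields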
In this case it looks like listOrg is your parsed input, so you would do:
...
for line in listOrg:
    # Wrap the string in a list; a bare string is written one character per column.
    writer.writerow([line.strip()])
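Writing listOrg directly gives one CSV row per matched line. If the goal is one row per platform, the lines belonging to the same record have to be joined first. Here is a minimal sketch of that idea, assuming every record starts with a numbered line and contains the Organism, FTP download and Platform Accession lines shown in the question. The names parse_platforms, write_csv and FIELDS, and the column names, are made up for illustration and are not part of the original code.

```python
import csv
import re

# Assumption: a record starts with a line such as "1. [HG-U133_Plus_2] ...".
RECORD_START = re.compile(r'^\d+\.\s+(.*)')
# Hypothetical column names, chosen for this example only.
FIELDS = ("title", "organism", "ftp", "accession", "id")


def parse_platforms(lines):
    """Yield one dict per platform record found in `lines`."""
    record = None
    for raw in lines:
        line = raw.strip()
        match = RECORD_START.match(line)
        if match:
            # A numbered line opens a new record; emit the previous one first.
            if record is not None:
                yield record
            record = dict.fromkeys(FIELDS, "")
            record["title"] = match.group(1)
        elif record is None:
            continue  # ignore anything before the first numbered line
        elif line.startswith("Organism:"):
            record["organism"] = line.split(":", 1)[1].strip()
        elif line.startswith("FTP download:"):
            # Keep only the last token (the URL), dropping the "GEO" label.
            parts = line.split(":", 1)[1].split()
            record["ftp"] = parts[-1] if parts else ""
        elif line.startswith("Platform Accession:"):
            acc = re.search(r'Platform Accession:\s*(\S+)', line)
            num = re.search(r'\bID:\s*(\S+)', line)
            record["accession"] = acc.group(1) if acc else ""
            record["id"] = num.group(1) if num else ""
    if record is not None:
        yield record  # emit the final record


def write_csv(in_path, out_path):
    """Parse the text file at in_path and write one CSV row per platform."""
    with open(in_path) as src, open(out_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(parse_platforms(src))
```

With the file from the question this could be called as write_csv('Microarray/PlatformsMicroarray.txt', 'platforms.csv'), where platforms.csv is just an example output name. csv.DictWriter is used here so that the header row and the column order both come from FIELDS.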