I have a file (usearch.txt) with entries like this:
0 AM158981
0 AM158980
0 AM158982
etc.
I want to replace the accession numbers in this file (AM158981 etc.) with the corresponding bacterial names, which are stored in a second file (acs.txt):
AM158981 Brucella pinnipedialis Brucellaceae
AM158980 Brucella suis Brucellaceae
AM158982 Brucella ceti Brucellaceae
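etc.
My plan is to build a dictionary from the second file (accession number as key, name as value), then open the first file, use the dictionary to replace each accession number, and save the result to a new file (done.txt):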
#! /usr/bin/env python
import re
# Creates a dictionary for accession numbers
fname = r"acs.txt"
namer = {}
for line in open(fname):
    acs, name = line.split(" ",1)
    namer[acs] = str(name)
infilename = "usearch.txt"
outfilename = "done.txt"
regex = re.compile(r'\d+\s(\w+)')
with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        x = regex.sub(r'\1', namer(name), line)
        outfile.write(x)
When I run this script I get this error:
Traceback (most recent call last):
  File "nameit.py", line 21, in <module>
    x = regex.sub(r'\1', namer(name), line)
TypeError: 'dict' object is not callable
Ideally, my "done.txt" file would look like this:
0 Brucella pinnipedialis Brucellaceae
0 Brucella suis Brucellaceae
0 Brucella ceti Brucellaceae
Answer 0 (score: 1)
You are calling namer as if it were a function:
x = regex.sub(r'\1', namer(name), line)
You want square brackets instead of parentheses, so that the element is looked up by the key name:
x = regex.sub(r'\1', namer[name], line)
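The difference between the two spellings can be reproduced in isolation. This is a minimal sketch; the one-entry dictionary is made up for illustration and stands in for the one built from acs.txt.

```python
# Hypothetical one-entry dictionary, standing in for the one built from acs.txt.
namer = {"AM158980": "Brucella suis Brucellaceae"}

# Square brackets look the key up in the dictionary.
looked_up = namer["AM158980"]

# Parentheses try to call the dictionary like a function, which raises TypeError.
try:
    namer("AM158980")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(looked_up)       # Brucella suis Brucellaceae
print(error_message)   # 'dict' object is not callable
```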
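Note that you also need to fetch the accession number again for every line, otherwise you will use the same key over and over. In addition, regex.sub expects its arguments as (replacement, string), and the pattern also matches the leading number, so replacing just the accession number with str.replace is simpler here:
with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        # Need to get the ID for the bacteria in question. If we don't, everything
        # will end up with the same name in our output file.
        _, name = line.split(" ", 1)
        # Strip the newline character
        name = name.strip()
        # Swap only the accession number so the leading number is kept. The
        # dictionary values still end in a newline, so strip that as well.
        x = line.replace(name, namer[name].strip())
        outfile.write(x)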
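If you would rather keep a regular expression, the whole task can be sketched as below. This is only one possible approach, not the only way to do it: the pattern LINE_RE and the helper names build_namer and replace_accessions are made up for this example, and leaving unknown accession numbers unchanged is an assumption about what you want. The replacement is a function, so nothing in a bacterial name is interpreted as a regex template.

```python
import re

# Hypothetical pattern: group 1 is the leading number plus whitespace,
# group 2 is the accession number.
LINE_RE = re.compile(r'^(\d+\s+)(\w+)')

def build_namer(acs_lines):
    """Map accession number -> name from lines like 'AM158980 Brucella suis Brucellaceae'."""
    namer = {}
    for line in acs_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        acs, name = line.split(" ", 1)
        namer[acs] = name
    return namer

def replace_accessions(lines, namer):
    """Yield each line with its accession number replaced by the mapped name."""
    def swap(match):
        prefix, acs = match.group(1), match.group(2)
        # Assumption: accession numbers missing from the dictionary are kept as-is.
        return prefix + namer.get(acs, acs)
    for line in lines:
        yield LINE_RE.sub(swap, line)

# Sample data taken from the question; real code would pass open file objects.
acs_lines = [
    "AM158981 Brucella pinnipedialis Brucellaceae\n",
    "AM158980 Brucella suis Brucellaceae\n",
    "AM158982 Brucella ceti Brucellaceae\n",
]
usearch_lines = ["0 AM158981\n", "0 AM158980\n", "0 AM158982\n"]

namer = build_namer(acs_lines)
result = list(replace_accessions(usearch_lines, namer))
print("".join(result), end="")
```

Both helpers accept any iterable of lines, so an open file can be passed directly, for example build_namer(open("acs.txt")), and each yielded line can be written to done.txt with outfile.write.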