Replace strings in a file with dictionary values

Time: 2013-05-30 17:56:24

Tags: python regex dictionary

I have a file (usearch.txt) with entries like this:

0 AM158981
0 AM158980
0 AM158982

I want to replace the accession numbers in this file (AM158981, etc.) with the corresponding bacterial names, which are in a second file (acs.txt):

AM158981 Brucella pinnipedialis Brucellaceae
AM158980 Brucella suis Brucellaceae
AM158982 Brucella ceti Brucellaceae
etc.

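My plan is to build a dictionary from the second file (accession numbers as keys, names as values), then open the first file, use the dictionary to replace the accession numbers, and save the result to a new file (done.txt):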

#! /usr/bin/env python
import re
# Creates a dictionary for accession numbers

fname = r"acs.txt"

namer = {}
for line in open(fname):
        acs, name = line.split(" ",1)
        namer[acs] = str(name)

infilename = "usearch.txt"
outfilename = "done.txt"

regex = re.compile(r'\d+\s(\w+)')

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        x = regex.sub(r'\1', namer(name), line)

        outfile.write(x) 

When I run this script I get this error:

Traceback (most recent call last):
  File "nameit.py", line 21, in <module>
    x = regex.sub(r'\1', namer(name), line)
TypeError: 'dict' object is not callable

Ideally, my "done.txt" file would look like this:

0 Brucella pinnipedialis Brucellaceae
0 Brucella suis Brucellaceae
0 Brucella ceti Brucellaceae

1 Answer:

Answer 0: (score: 1)

You are calling namer as if it were a function:

x = regex.sub(r'\1', namer(name), line)

Replace the parentheses with square brackets to access the element stored under the key name:

x = regex.sub(r'\1', namer[name], line)
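To make the difference between the two kinds of brackets concrete, here is a minimal sketch. The one-entry dictionary and the helper `call_like_function` are made up for illustration and are not part of the original script.

```python
# Hypothetical one-entry dictionary, mirroring the question's `namer`.
namer = {"AM158981": "Brucella pinnipedialis Brucellaceae"}
name = "AM158981"


def call_like_function(mapping, key):
    """Return the error message raised when a dict is called like a function."""
    try:
        mapping(key)  # parentheses attempt a call; dicts are not callable
    except TypeError as exc:
        return str(exc)
    return None


error_message = call_like_function(namer, name)
looked_up = namer[name]                       # square brackets do the key lookup
fallback = namer.get("AM000000", "unknown")   # .get() avoids KeyError on a missing key

print(error_message)
print(looked_up)
print(fallback)
```

If some accession numbers might be missing from acs.txt, `.get()` with a default is one way to avoid a KeyError halfway through the file.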

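Note that you also need to look up the name again for each line; otherwise you will keep using the same key over and over. In addition, sub() on a compiled pattern takes (repl, string[, count]), so the three-argument call above still raises a TypeError; the loop below substitutes with str.replace() instead:

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        # Get the accession number for this line. If we don't, every line
        # will end up with the same name in the output file.
        _, name = line.split(" ", 1)

        # Strip the newline character
        name = name.strip()

        # Swap the accession number for its name. The values in namer still
        # carry the newline read from acs.txt, so strip that as well.
        x = line.replace(name, namer[name].strip())
        outfile.write(x)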
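For a version that can be run end to end, here is a minimal sketch of the whole task that keeps the regex and passes a function as the replacement. It assumes each line of acs.txt is an accession number followed by whitespace and the name, and each line of usearch.txt is a count followed by an accession number. The names `build_names`, `rename_line` and `PATTERN` are hypothetical, and the sample lines are taken from the question rather than read from disk.

```python
import re


def build_names(lines):
    """Map each accession number to the rest of its line (the name)."""
    names = {}
    for line in lines:
        parts = line.split(None, 1)  # split once, on the first run of whitespace
        if len(parts) == 2:          # skip blank or malformed lines
            names[parts[0]] = parts[1].strip()
    return names


# Group 1 keeps the leading count and its whitespace; group 2 is the accession number.
PATTERN = re.compile(r'^(\d+\s+)(\w+)')


def rename_line(line, names):
    """Replace the accession number in one line; unknown numbers are left as-is."""
    def lookup(match):
        accession = match.group(2)
        return match.group(1) + names.get(accession, accession)
    return PATTERN.sub(lookup, line)


# Sample data from the question, held in memory instead of files.
acs_lines = [
    "AM158981 Brucella pinnipedialis Brucellaceae\n",
    "AM158980 Brucella suis Brucellaceae\n",
    "AM158982 Brucella ceti Brucellaceae\n",
]
usearch_lines = ["0 AM158981\n", "0 AM158980\n", "0 AM158982\n"]

names = build_names(acs_lines)
renamed = [rename_line(line, names) for line in usearch_lines]
print("".join(renamed), end="")
```

With real files, the same functions should work on open file objects, since iterating over a file yields its lines: build the dictionary from `open("acs.txt")`, then write `rename_line(line, names)` for each line of usearch.txt to done.txt. Passing a function to `sub()` also means backslashes in a name are never interpreted as group references.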