How to replace complete strings rather than just substrings in Python 3

Date: 2017-09-14 10:02:43

Tags: python

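I have many input files in which I need to replace a small number of strings. First, I used regular expressions to build a dictionary of key-value pairs. The dictionary holds the keys (the strings to replace) and the values (their replacements).

Example line from an input file: Details of first student are FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"

My dictionary would be: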

student = {
    'ABC':'Student Firstname',
    'ABC XYZ KLM':'Student Fullname',
    '123':'Student ID'
    }

I am using the string replace() method to do the replacement, like this:

for line in inputfile1:
    for src, dst in student.items():
        line = line.replace(src, dst)

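My output is: Details of first student are FullName ="Student Firstname XYZ KLM" FirstName ="Student Firstname" ID = "Student ID"

What I am looking for is: Details of first student are FullName ="Student Fullname" FirstName ="Student Firstname" ID = "Student ID"

Can you help me solve this?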

1 Answer:

Answer 0 (score: 1)

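This happens because str.replace(..) replaces the 'ABC' string first, so by the time the loop reaches 'ABC XYZ KLM', that text no longer exists in the line. You need to make sure the longest pattern is replaced first. To do that, you can use one of the following options:

Option 1:

Use an OrderedDict and enter the longest strings to replace before the shorter ones: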

In [3]: from collections import OrderedDict

In [6]: student = OrderedDict([('ABC XYZ KLM', 'Student Fullname'),  ('ABC', 'Student Firstname'),('123', 'Student ID')])

In [7]: list(student.items())
Out[7]: 
[('ABC XYZ KLM', 'Student Fullname'),
 ('ABC', 'Student Firstname'),
 ('123', 'Student ID')]

In [8]: line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"' 

In [9]: for src, dst in student.items():
   ...:        line = line.replace(src, dst)
In [10]: line 
Out[10]: 'FullName ="Student Fullname" FirstName ="Student Firstname" ID = "Student ID"'

The complete code looks like this:

from collections import OrderedDict
student = OrderedDict([('ABC XYZ KLM', 'Student Fullname'),
                       ('ABC', 'Student Firstname'),
                       ('123', 'Student ID')])
line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"' 
for src, dst in student.items():
    line = line.replace(src, dst)
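
As a side note, on Python 3.7 and later a plain dict also preserves insertion order, so the same idea should work without OrderedDict. The sketch below is only an illustration: the helper name replace_in_order and the two dictionaries are made up for this example. It also shows how the result changes when the shorter key comes first.

```python
def replace_in_order(line, mapping):
    """Apply str.replace for each (src, dst) pair in the mapping's iteration order."""
    for src, dst in mapping.items():
        line = line.replace(src, dst)
    return line


line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'

# Longest key entered first: the full name is replaced before 'ABC' can touch it.
longest_first = {
    'ABC XYZ KLM': 'Student Fullname',
    'ABC': 'Student Firstname',
    '123': 'Student ID',
}

# Shortest key entered first: 'ABC' is replaced inside the full name,
# so 'ABC XYZ KLM' is never found afterwards.
shortest_first = {
    'ABC': 'Student Firstname',
    'ABC XYZ KLM': 'Student Fullname',
    '123': 'Student ID',
}

good = replace_in_order(line, longest_first)
bad = replace_in_order(line, shortest_first)
print(good)
print(bad)
```

Relying on the order in which entries were typed is fragile when the dictionary is built automatically, which is why sorting by length is usually the safer choice.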

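Option 2:

Alternatively, as @AlexHal suggested in the comments below, you can simply use a list of tuples and sort it by pattern length, longest first, before replacing. The code would look like this: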

In [2]: student = [('ABC', 'Student Firstname'),('123', 'Student ID'), ('ABC XYZ KLM', 'Student Fullname')]

In [3]: sorted(student, key=lambda x: len(x[0]), reverse=True)
Out[3]: 
[('ABC XYZ KLM', 'Student Fullname'),
 ('ABC', 'Student Firstname'),
 ('123', 'Student ID')]

In [9]: line = ' "Details of first student are FirstName ="ABC" FullName ="ABC XYZ KLM" ID = "123"'

In [10]: for src, dst in sorted(student, key=lambda x: len(x[0]), reverse=True):
    ...:     line = line.replace(src, dst)
    ...:     

In [11]: line
Out[11]: ' "Details of first student are FirstName ="Student Firstname" FullName ="Student Fullname" ID = "Student ID"'

Complete code:

student = [('ABC', 'Student Firstname'),
           ('123', 'Student ID'), 
           ('ABC XYZ KLM', 'Student Fullname')]

line = ' "Details of first student are FirstName ="ABC" FullName ="ABC XYZ KLM" ID = "123"'    
for src, dst in sorted(student, key=lambda x: len(x[0]), reverse=True):
    line = line.replace(src, dst)
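
A possible variation, not part of the answer above, is to do all replacements in a single pass with re.sub. This is a minimal sketch under the assumption that the keys are literal strings; the function name replace_all is hypothetical. The keys are still sorted longest first, because a regex alternation takes the first alternative that matches at a given position.

```python
import re


def replace_all(line, mapping):
    """Replace every key of mapping found in line, in one pass, longest key first."""
    if not mapping:
        return line  # an empty alternation would match the empty string everywhere
    # Longest keys first so that 'ABC XYZ KLM' is tried before 'ABC'.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape keeps characters such as '.' or '(' in the keys literal.
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # Each match is looked up in the mapping to find its replacement.
    return pattern.sub(lambda m: mapping[m.group(0)], line)


student = {
    'ABC': 'Student Firstname',
    '123': 'Student ID',
    'ABC XYZ KLM': 'Student Fullname',
}

line = ' "Details of first student are FirstName ="ABC" FullName ="ABC XYZ KLM" ID = "123"'
result = replace_all(line, student)
print(result)
```

Unlike repeated str.replace calls, this version never rescans text it has already substituted, so a replacement value that happens to contain another key is left alone.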