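I have many input files in which I need to replace a small number of strings. First, I used a regular expression to build a dictionary of key-value pairs. The keys are the strings to be replaced and the values are their replacements.
Example line from an input file: Details of first student are FullName ="ABC XYZ KLM" FirstName ="ABC" ID ="123"
My dictionary would be: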
student = {
'ABC':'Student Firstname',
'ABC XYZ KLM':'Student Fullname',
'123':'Student ID'
}
I am using the string replace() method to do the replacement, as follows:
for line in inputfile1:
    for src, dst in student.items():
        line = line.replace(src, dst)
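My output is: Details of first student are FullName ="Student Firstname XYZ KLM" FirstName ="Student Firstname" ID ="Student ID"
What I am looking for is: Details of first student are FullName ="Student Fullname" FirstName ="Student Firstname" ID ="Student ID"
Can you help me solve this?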
Answer 0 (score: 1)
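This happens because, in your case, the loop calls str.replace(..) for the string ABC before it reaches ABC XYZ KLM. Once ABC has been replaced, the longer string ABC XYZ KLM no longer occurs in the line, so it is never matched. You need to make sure that the longest pattern is replaced first.
To do that, you can follow one of the options below.
Use an OrderedDict for the dictionary, and enter the longest strings to be replaced before the shorter ones: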
In [3]: from collections import OrderedDict
In [6]: student = OrderedDict([('ABC XYZ KLM', 'Student Fullname'), ('ABC', 'Student Firstname'),('123', 'Student ID')])
In [7]: student.items()
Out[7]:
[('ABC XYZ KLM', 'Student Fullname'),
('ABC', 'Student Firstname'),
('123', 'Student ID')]
In [8]: line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'
In [9]: for src, dst in student.items():
   ...:     line = line.replace(src, dst)
In [10]: line
Out[10]: 'FullName ="Student Fullname" FirstName ="Student Firstname" ID = "Student ID"'
The complete code looks like this:
from collections import OrderedDict

# Longest pattern first, so 'ABC XYZ KLM' is replaced before 'ABC'.
student = OrderedDict([('ABC XYZ KLM', 'Student Fullname'),
                       ('ABC', 'Student Firstname'),
                       ('123', 'Student ID')])
line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'
for src, dst in student.items():
    line = line.replace(src, dst)
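As a side note, on Python 3.7 and later a plain dict also preserves insertion order, so the same longest-first ordering should work without OrderedDict. A minimal sketch, assuming you can run on such a version:

```python
# Assumes Python 3.7+, where dict iteration follows insertion order.
# Keys are entered longest first, as in the OrderedDict version.
student = {
    'ABC XYZ KLM': 'Student Fullname',
    'ABC': 'Student Firstname',
    '123': 'Student ID',
}

line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'
for src, dst in student.items():
    line = line.replace(src, dst)

print(line)
```

On older interpreters the iteration order of a plain dict is not guaranteed, which is why OrderedDict is the safer choice there.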
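Also, as @AlexHal suggested in the comments below, you can simply use a list of tuples and sort it by pattern length, longest first, before replacing. The code would look like this: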
In [2]: student = [('ABC', 'Student Firstname'),('123', 'Student ID'), ('ABC XYZ KLM', 'Student Fullname')]
In [3]: sorted(student, key=lambda x: len(x[0]), reverse=True)
Out[3]:
[('ABC XYZ KLM', 'Student Fullname'),
('ABC', 'Student Firstname'),
('123', 'Student ID')]
In [9]: line = ' "Details of first student are FirstName ="ABC" FullName ="ABC XYZ KLM" ID = "123"'
In [10]: for src, dst in sorted(student, key=lambda x: len(x[0]), reverse=True):
    ...:     line = line.replace(src, dst)
    ...:
In [11]: line
Out[11]: ' "Details of first student are FirstName ="Student Firstname" FullName ="Student Fullname" ID = "Student ID"'
Complete code:
student = [('ABC', 'Student Firstname'),
           ('123', 'Student ID'),
           ('ABC XYZ KLM', 'Student Fullname')]
line = ' "Details of first student are FirstName ="ABC" FullName ="ABC XYZ KLM" ID = "123"'
# Sort by pattern length, longest first, then replace in that order.
for src, dst in sorted(student, key=lambda x: len(x[0]), reverse=True):
    line = line.replace(src, dst)
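If you would rather keep the dictionary from the question unchanged, the same sort can be applied to its keys. The following is only a sketch; the helper name replace_longest_first is made up for illustration and is not part of any library:

```python
def replace_longest_first(line, mapping):
    """Apply every replacement in `mapping`, handling the longest keys first."""
    # Sorting keys by length (descending) ensures 'ABC XYZ KLM' is
    # replaced before the shorter 'ABC' that it contains.
    for src in sorted(mapping, key=len, reverse=True):
        line = line.replace(src, mapping[src])
    return line


# The dictionary exactly as written in the question (shortest key first).
student = {
    'ABC': 'Student Firstname',
    'ABC XYZ KLM': 'Student Fullname',
    '123': 'Student ID',
}

line = 'FullName ="ABC XYZ KLM" FirstName ="ABC" ID = "123"'
result = replace_longest_first(line, student)
print(result)
```

Note that this still replaces one pattern after another, so it assumes that none of the replacement values contains a string that is itself a key.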