I have the following task: split a file based on a delimiter that occurs multiple times. I have a file that contains the following data:
Number of Employees - 95
==============================================================================
Telephone Number - 972123111111
Empl Name - David
Designation - Software Engineer
Address: **********
Doamin: Python
==============================================================================
Telephone Number - 972123111112
Empl Name - Glen
Designation - Software Engineer
Doamin: Python
==============================================================================
Telephone Number - 972123111111
Empl Name - Jhon
Designation - Software Engineer
Address: **********
Doamin: Python
==============================================================================
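In this file, I want to split out each employee's information, which sits between the lines of "=" characters, and then print every employee's details as follows: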
Details of Employee: (Employee Name)
Telephone Number: (employee telephone number)
Designation: (employee designation)
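I have already written code that reads the file's contents into a variable, and I used the regular expression below to extract the data, but it does not work: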
re.findall('[^=]=*.*?[=*$]', a)
Answer 0 (score: 2)
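Use re.split() instead of re.findall(), as follows:
re.split(r'^=+$', a, flags=re.MULTILINE)
The re.MULTILINE flag is needed so that ^ and $ match at the start and end of every line rather than only at the start and end of the whole string.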
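A minimal sketch of how the re.split() approach might be carried through to the requested output. The function name split_records, the shortened sample text, and the field-splitting pattern are assumptions made for illustration and are not part of the answer; the sketch also assumes the first block is always the "Number of Employees" header.

```python
import re

def split_records(text):
    """Return one dict per employee block found in text (hypothetical helper)."""
    # re.MULTILINE makes ^ and $ match at every line boundary, so only
    # lines made up entirely of "=" act as separators.
    blocks = re.split(r'^=+$', text, flags=re.MULTILINE)
    records = []
    for block in blocks[1:]:  # assumption: blocks[0] is the header block
        fields = {}
        for line in block.strip().splitlines():
            # Split on the first "-" or ":" only, dropping surrounding spaces.
            parts = re.split(r'\s*[-:]\s*', line.strip(), maxsplit=1)
            if len(parts) == 2:
                fields[parts[0]] = parts[1]
        if fields:
            records.append(fields)
    return records

# Shortened sample in the same layout as the question's file.
a = """Number of Employees - 2
==========
Telephone Number - 972123111111
Empl Name - David
Designation - Software Engineer
Address: **********
Doamin: Python
==========
Telephone Number - 972123111112
Empl Name - Glen
Designation - Software Engineer
Doamin: Python
==========
"""

records = split_records(a)
for record in records:
    print("Details of Employee: ({})".format(record.get('Empl Name', '')))
    print("Telephone Number: ({})".format(record.get('Telephone Number', '')))
    print("Designation: ({})".format(record.get('Designation', '')))
```

Using dict.get() with a default keeps the loop from failing on records that lack a field, such as the second record, which has no Address line.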
Answer 1 (score: 0)
Try this snippet; it stores all of the employee data as a list of dictionaries:
import re

data_separator_regexp = r"-|:"       # fields use either "-" or ":" as the separator
employee_separator_regexp = r"^=+$"  # a line made up only of "=" characters
employees = []
with open('test.txt') as f_in:
    curr_employee = {}
    for idx, line in enumerate(f_in):
        if not idx:
            continue  # skip the first line ("Number of Employees - 95")
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if re.match(employee_separator_regexp, line):
            # separator line: store the record collected so far, if any
            if curr_employee:
                employees.append(curr_employee)
                curr_employee = {}
        else:
            # split on the first separator only, so values may contain "-" or ":"
            parts = re.split(data_separator_regexp, line, maxsplit=1)
            if len(parts) == 2:
                curr_employee[parts[0].strip()] = parts[1].strip()
    if curr_employee:
        employees.append(curr_employee)  # keep the last record if no trailing separator
for employee in employees:
    print("Details of Employee: ({})".format(employee.get('Empl Name', '')))
    print("Telephone Number: ({})".format(employee.get('Telephone Number', '')))
    print("Designation: ({})".format(employee.get('Designation', '')))
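To try the loop above without a test.txt file on disk, the same logic could be wrapped in a function that accepts any iterable of lines. This is only a sketch: the name parse_employees, the skip_header parameter, and the inline sample are hypothetical and do not come from the answer.

```python
import re

def parse_employees(lines, skip_header=True):
    """Build a list of dicts, one per employee, from an iterable of lines."""
    employees = []
    current = {}
    for idx, line in enumerate(lines):
        if skip_header and idx == 0:
            continue  # assumption: the first line is the employee count
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if re.match(r'^=+$', line):
            # A separator line closes the record collected so far.
            if current:
                employees.append(current)
                current = {}
            continue
        # Split on the first "-" or ":" only.
        parts = re.split(r'-|:', line, maxsplit=1)
        if len(parts) == 2:
            current[parts[0].strip()] = parts[1].strip()
    if current:
        employees.append(current)  # record not followed by a separator
    return employees

sample = """Number of Employees - 95
==========
Telephone Number - 972123111111
Empl Name - David
Designation - Software Engineer
Address: **********
Doamin: Python
==========
Telephone Number - 972123111112
Empl Name - Glen
Designation - Software Engineer
Doamin: Python
==========
Telephone Number - 972123111111
Empl Name - Jhon
Designation - Software Engineer
Address: **********
Doamin: Python
"""

employees = parse_employees(sample.splitlines())
for employee in employees:
    print("Details of Employee: ({})".format(employee.get('Empl Name', '')))
```

Because an open file object is itself an iterable of lines, the same function could also be called as parse_employees(f_in) inside a with open(...) block.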