I am trying to extract every occurrence of the number (nn.nn) that follows the keyword (Exhibit). For example:
through April 25, 2012
through April 25, 2012
Exhibit 99.6
Exhibit 99.10
Here is my code.
import os,re
import numpy as np
os.chdir('C:\\Users\\dul\\Dropbox\\CTO\\test')
def extract_data(filename):
    with open(filename, 'r') as file1:
        text1=file1.read()
    matchexh = re.findall(r'Exhibit (\d+).(\d+)',text1)
    with open('outfile.txt', "a+") as outfile:
        outfile.write("\n"+matchexh)
files= os.listdir("C:\\Users\\dul\\Dropbox\\CTO\\test")
for file in files:
    if ".txt" in file:
        extract_data(file)
When I run this, I get the following error message:
  File "C:\Users\dul\Dropbox\CTO\test\exhibitno.py", line 13, in extract_data
    outfile.write("\n"+matchexh)
TypeError: cannot concatenate 'str' and 'list' objects
How can I get all of the matches and list them?
Answer 0 (score: 1)
Change this:
matchexh = re.search(r'Exhibit (\d+).(\d+)',text1).group().strip()
To:
matchexh = re.findall(r'Exhibit (\d+).(\d+)',text1)
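Note that re.findall returns a list, and with two capture groups in the pattern each element is a tuple such as ('99', '6'), which is why concatenating the result with "\n" raises the TypeError. One possible way to handle this is sketched below. The names EXHIBIT_RE, format_exhibits and append_matches are hypothetical helpers introduced only for illustration, and escaping the dot (\.) is an assumption about the intent, since an unescaped . matches any character.

```python
import re

# Hypothetical pattern name. The dot is escaped so it matches only a literal "."
EXHIBIT_RE = re.compile(r'Exhibit (\d+)\.(\d+)')

def format_exhibits(text):
    """Return one 'nn.nn' string per match, in order of appearance."""
    # With two groups, findall yields (before_dot, after_dot) tuples
    return ['.'.join(parts) for parts in EXHIBIT_RE.findall(text)]

def append_matches(numbers, path='outfile.txt'):
    """Append each number on its own line; write nothing if there are no matches."""
    if numbers:
        with open(path, 'a+') as outfile:
            # Join the list into a single string before writing
            outfile.write("\n" + "\n".join(numbers))

sample = "through April 25, 2012\nExhibit 99.6\nExhibit 99.10\n"
numbers = format_exhibits(sample)
print(numbers)  # ['99.6', '99.10']
```

Inside extract_data, this would replace the outfile.write("\n"+matchexh) line with a call such as append_matches(format_exhibits(text1)).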