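I am processing several hundred documents (as JPG files) at once, extracting some information from each one and storing it in variables. I then want to export the data from the for loop to Excel, and the export itself works. The problem is that the data in Excel does not match the data in Python: people end up with the wrong age, number, and so on. I want each age to line up with the correct person. My code is below: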
import glob
import re

import cv2
import pytesseract
import xlsxwriter

workbook = xlsxwriter.Workbook('Example.xlsx')
worksheet = workbook.add_worksheet("My sheet")
path = "1998/*.*"
my_list = []
for file in glob.glob(path):
    # Load the scan and preprocess it for OCR
    a = cv2.imread(file)
    my_list.append(a)
    a = cv2.resize(a, None, fx=0.5, fy=0.5)
    gray = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
    adaptive_threshold = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 85, 11)
    config = "--psm 3"
    # Run Tesseract with the Swedish language model
    test = pytesseract.image_to_string(adaptive_threshold, config=config, lang='swe')
    # print(test)
    row = 0
    ro = 0
    missing = "-"
    # Name:
    worksheet.write('A1', 'Name')
    try:
        domnr = re.compile("(Dom nr)(.*?\n)")
        dom_nr = re.search(domnr, test).group(2)
        print(dom_nr)
        for i in dom_nr, missing:
            if dom_nr in test:
                worksheet.write(row, 1, dom_nr)
                row += 1
            else:
                worksheet.write(row, 1, missing)
                row += 1
    except AttributeError:
        # re.search returned None: the label was not found in the OCR text
        print(missing)
    # Age:
    worksheet.write('B1', 'Age')
    try:
        m = re.compile("(Ålder)(.*?\n\n)")
        mnr = re.search(m, test).group(2)
        print(mnr)
        for i in mnr, missing:
            if mnr in test:
                worksheet.write(ro, 2, mnr)
                ro += 1
            else:
                worksheet.write(ro, 2, missing)
                ro += 1
    except AttributeError:
        print(missing)

# Close the workbook so the file is written to disk
workbook.close()
Can anyone tell me what is wrong with my code, or suggest another approach?
Thanks!
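One approach that might keep each person's values on the same row is to build a single record per document first and only then write it, using one row index for every column. The sketch below is an illustration under assumptions, not a confirmed fix: the helper names `extract`, `build_rows`, and `write_rows` are hypothetical, the two patterns are copied from the code above, and the captured text is stripped of surrounding whitespace (which the original code does not do). `worksheet` can be any object with a `write(row, col, value)` method, such as an xlsxwriter worksheet.

```python
import re

MISSING = "-"

# Patterns copied from the question; group(2) is the text after the label.
DOM_NR = re.compile("(Dom nr)(.*?\n)")
AGE = re.compile("(Ålder)(.*?\n\n)")


def extract(pattern, text):
    """Return the stripped text after the label, or MISSING if the label is absent."""
    match = pattern.search(text)
    return match.group(2).strip() if match else MISSING


def build_rows(texts):
    """Build one (name, age) tuple per OCR text so a document's fields stay together."""
    return [(extract(DOM_NR, text), extract(AGE, text)) for text in texts]


def write_rows(worksheet, rows):
    """Write a header row, then exactly one spreadsheet row per document."""
    worksheet.write(0, 0, "Name")
    worksheet.write(0, 1, "Age")
    for row_index, (name, age) in enumerate(rows, start=1):
        # The same row_index is used for both columns, so they cannot drift apart.
        worksheet.write(row_index, 0, name)
        worksheet.write(row_index, 1, age)
```

In the loop above, this would mean appending each `test` string to a list, then calling `write_rows(worksheet, build_rows(texts))` once after the loop and before `workbook.close()`. A missing field then becomes "-" in its own cell instead of shifting the values below it.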