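I have many tables in a docx file, and I am trying to get the text from the cells in the first column.
I have code that searches every cell of every row: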
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    result = ReqRegex.search(paragraph.text)
                    if result:
                        file.write(result.group()+"\n")
But I am trying to change it so that it checks only the first column:
    for table in doc.tables:
        for column in table.columns:
            for cell in table.column_cells(0):
                for paragraph in cell.paragraphs:
                    result = ReqRegex.search(paragraph.text)
                    if result:
                        file.write(result.group()+"\n")
Can you tell me what I should change to make this code work?
Answer 0 (score: 0)
I am not familiar with python-docx, but by ordinary Python rules this should work:
    for table in doc.tables:
        for row in table.rows:
            # row.cells is a sequence, so index it with [0] instead of calling it
            for paragraph in row.cells[0].paragraphs:
                result = ReqRegex.search(paragraph.text)
                if result:
                    file.write(result.group()+"\n")
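The same idea can be wrapped in a small generator so it is easier to test. This is only a sketch: the function name `first_cell_matches` and the `pattern` parameter are made up for illustration, and `doc` is assumed to be a document opened with python-docx (for example `docx.Document("file.docx")`).

```python
import re

def first_cell_matches(doc, pattern):
    """Yield regex matches found in the first cell of every table row.

    `doc` is assumed to expose .tables, .rows, .cells and .paragraphs the
    way a python-docx Document does; `pattern` is a compiled regex.
    """
    for table in doc.tables:
        for row in table.rows:
            # row.cells is a sequence; element 0 is the first-column cell
            first_cell = row.cells[0]
            for paragraph in first_cell.paragraphs:
                result = pattern.search(paragraph.text)
                if result:
                    yield result.group()
```

With the names from the question, the results could then be written out with `for match in first_cell_matches(doc, ReqRegex): file.write(match + "\n")`.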
Answer 1 (score: 0)
I finally solved my problem. Maybe this will be useful to someone:
    for table in doc.tables:
        rowNo = 0
        for row in table.rows:
            columnNo = 0
            for cell in row.cells:
                columnNo += 1
                for paragraph in cell.paragraphs:
                    result = ReqRegex.search(paragraph.text)
                    # only act on cells in the first column
                    if columnNo == 1:
                        print(cell.text)
                        if result:
                            file.write(result.group()+"\n")
            rowNo += 1
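A shorter variant may also work: the attempt in the question already calls `table.column_cells(0)`, and the extra `for column in table.columns` loop around it only repeats the same search once per column. The sketch below drops that outer loop. It has not been run against the original document, and the function name `first_column_matches` and the `pattern` parameter are hypothetical.

```python
import re

def first_column_matches(doc, pattern):
    """Collect regex matches from the first column of every table.

    `doc` is assumed to be a python-docx Document (or anything with the
    same .tables / .column_cells() / .paragraphs shape).
    """
    matches = []
    for table in doc.tables:
        # column_cells(0) gives the cells of column 0, so no column loop
        # and no manual column counter are needed
        for cell in table.column_cells(0):
            for paragraph in cell.paragraphs:
                result = pattern.search(paragraph.text)
                if result:
                    matches.append(result.group())
    return matches
```

The returned list can then be written to the file in one pass, for example `file.write("\n".join(first_column_matches(doc, ReqRegex)) + "\n")`.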