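I have been struggling with this code all day. On each pass through the loop, a table is read from a different MS Word file. The table is copied into a DataFrame, which is then written to a single row of an Excel file.
On each subsequent pass of the for loop, the Excel row is incremented so that the new DataFrame is written to a new row, but after the script finishes, only one row contains a DataFrame.
When I print(tfile), I get the following: ('CIV-ASCS-016_TRS.docx', 'CIV-ASCS-018_TRS.docx', 'CIV-ASCS-020_TRS.docx', 'CIV-ASCS-021_TRS.docx'), which I take as proof that the loop runs 4 times, once for each of the 4 files in the directory. I set the initial row pos to 0 outside the for loop.
Note: I have left out the lines that import the necessary libraries.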
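One thing the printed tuple may not establish: tfile is built from the whole files list, so it would list all four names even if the loop body ran fewer times. A minimal sketch of that distinction, assuming a hard-coded list as a stand-in for glob('*.docx') (the runs counter is hypothetical and only added for illustration):

```python
# Stand-in for glob('*.docx'); names copied from the printed output above.
files = [
    "CIV-ASCS-016_TRS.docx",
    "CIV-ASCS-018_TRS.docx",
    "CIV-ASCS-020_TRS.docx",
    "CIV-ASCS-021_TRS.docx",
]

runs = 0  # hypothetical counter, not part of the original code
for i, wfile in enumerate(files[:1]):  # same slice as the loop in the question
    runs += 1

tfile = tuple(files)  # built from the full list, independent of the slice

print(len(tfile))  # 4: number of names in the printed tuple
print(runs)        # 1: number of times the loop body actually ran
```

Counting passes inside the loop body, rather than inspecting the file list, is a more direct check of how many DataFrames could have been written.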
files = glob('*.docx')
pos = 1
for i, wfile in enumerate(files[:1]):
    document = Document(wfile)
    table = document.tables[0]
    data = []
    keys = {}
    for j, row in enumerate(table.rows):
        text = (cell.text for cell in row.cells)
        if j == 0:
            keys = tuple(text)
            continue
        row_data = dict(zip(keys, text))
        data.append(row_data)
    tfile = tuple(files)
    df = pd.DataFrame(data)
    df.loc[-1] = [wfile, 'Test Case ID']
    df.index = df.index + 1  # shifting index
    df = df.sort_index()  # sorting by index
    df1 = df.rename(index=str, columns={"Test Case ID": "TC Attributes"})
    df21 = df1.drop(columns=['TC Attributes'])
    df3 = df21.T
    # read the existing sheets so that openpyxl won't create a new one later
    book = load_workbook('test.xlsx')
    writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    df3.to_excel(writer, 'sheet7', header=False, index=False,
                 startrow=pos)
    pos += 1
    writer.save()
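The inner loop's header handling can be checked on its own, without a Word file. A minimal sketch, assuming each row is a plain list of cell strings standing in for row.cells; rows_to_records and the sample values are hypothetical and not taken from the original code:

```python
def rows_to_records(rows):
    """Turn a table (a list of rows, each a list of cell strings) into a
    list of dicts keyed by the first row, mirroring the inner loop above."""
    keys = ()
    records = []
    for j, cells in enumerate(rows):
        text = (cell for cell in cells)  # generator, consumed exactly once
        if j == 0:
            keys = tuple(text)  # first row supplies the dict keys
            continue
        records.append(dict(zip(keys, text)))
    return records


# Made-up stand-in for document.tables[0]; real tables will differ.
sample_rows = [
    ["Test Case ID", "TC-001"],
    ["Title", "Example title"],
    ["Priority", "High"],
]

records = rows_to_records(sample_rows)
print(records)
```

Note that zip stops at the shorter of its inputs, so a data row with more cells than the header row would lose its extra cells without raising an error.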