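I have many Excel files that need to be compiled into a single Excel file, and the compiled data then has to be copied into a specific worksheet of an existing macro-enabled Excel file (.xlsm).
I have solved the first part (compiling multiple Excel files into one). The resulting dataframe is saved in .csv format. The result file looks like this.
Up to this point there is no problem. The next step is where I am struggling.
I want to "copy and paste" the result dataframe into the "Source" sheet of the existing macro-enabled (.xlsm) Excel file, under the corresponding headers. The existing Excel file looks like this.
As you can see in the image above, I want to skip writing any data to column A, because the cells in that column are filled with formulas. I want to write the result dataframe into columns B through Q of the existing Excel file. Before writing the data, however, I want to delete all existing data in every cell (except the cells in column A).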
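In short, I want to do two things: clear the existing values in columns B through Q of the "Source" sheet, then write the result dataframe into those columns.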
Any feedback is much appreciated! Thanks!
Regards,
Arnold
Answer 0 (score: 1)
I found the following solution based on openpyxl. I learned that xlsxwriter cannot open existing Excel files, so my approach uses openpyxl.
import pandas as pd
import openpyxl  # Excel reader/writer that pandas can use as an engine

# df_write is the DataFrame to be written (built earlier).
# Note: assigning writer.book / writer.sheets relies on older pandas releases;
# pandas 2.x no longer allows these attributes to be set.
book = openpyxl.load_workbook('input.xlsm', keep_vba=True)  # Load existing .xlsm file, keeping its macros
with pd.ExcelWriter('output.xlsm', engine='openpyxl') as writer:  # Open a writer on the output file
    writer.book = book  # Hand over input workbook
    writer.sheets = {ws.title: ws for ws in book.worksheets}  # Hand over worksheets
    writer.vba_archive = book.vba_archive  # Hand over VBA information
    # Write column 'A' of the DataFrame into sheet 'Sheet1' (which may already
    # exist in input.xlsm), starting at row index 1, column index 0 (0-based)
    df_write.to_excel(writer, sheet_name='Sheet1', columns=['A'],
                      header=False, index=False,
                      startrow=1, startcol=0)
# The workbook is saved when the with block exits, so no explicit writer.save() is needed.
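On recent pandas versions, where `writer.book` can no longer be assigned, a similar result may be achievable with the writer's append mode. The following is a minimal sketch, assuming a pandas version that supports `if_sheet_exists="overlay"` and forwards `engine_kwargs` to `openpyxl.load_workbook`. The function name `write_frame_into_sheet` and its parameters are hypothetical, not part of pandas.

```python
import pandas as pd


def write_frame_into_sheet(path, df, sheet_name, startrow=1, startcol=1, keep_vba=True):
    """Write df into an existing sheet of an existing workbook (hypothetical helper).

    startrow/startcol are 0-based, so the defaults (1, 1) target cell B2.
    """
    with pd.ExcelWriter(
        path,
        engine="openpyxl",
        mode="a",                   # open the existing file instead of creating a new one
        if_sheet_exists="overlay",  # write into the existing sheet, leaving other cells alone
        engine_kwargs={"keep_vba": keep_vba},  # assumed to be forwarded to openpyxl.load_workbook
    ) as writer:
        df.to_excel(writer, sheet_name=sheet_name, header=False, index=False,
                    startrow=startrow, startcol=startcol)
```

Note that overlay mode only overwrites the cells it writes to. If the new data has fewer rows than the old data, stale values below it would remain, so clearing the target range first is still needed.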
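Answer 1 (score: 0)
Since importing a csv into a spreadsheet can be handled by an Excel VBA macro using QueryTables, consider having Python replicate the VBA through a COM interface to the Excel object library. All existing macro code stays intact, because nothing is overwritten except cell data. Note: the following assumes you are using Excel for Windows.
With the win32com library, Python can replicate almost anything VBA does. In fact, VBA is itself an add-on reference in Office applications rather than a natively built-in object, and it uses the same COM interface. See the first selected item under Tools\References in the IDE.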
import pandas as pd
import win32com.client as win32

# ...same pandas code...

macrofile = "C:\\Path\\To\\Macro\\Workbook.xlsm"
strfile = "C:\\Path\\To\\CSV\\Output.csv"

# Writes the index and header too; pass index=False / header=False to omit them
df.to_csv(strfile)

try:
    xl = win32.gencache.EnsureDispatch('Excel.Application')
    wb = xl.Workbooks.Open(macrofile)
    ws = wb.Sheets("Source")

    # DELETE PREVIOUS DATA
    ws.Range("B:Q").EntireColumn.Delete()

    # ADD QUERYTABLE (SPECIFYING DESTINATION CELL START)
    # The destination cell must be on the same sheet as the QueryTable
    qt = ws.QueryTables.Add(Connection="TEXT;" + strfile,
                            Destination=ws.Cells(2, 2))
    qt.TextFileParseType = 1  # xlDelimited
    qt.TextFileConsecutiveDelimiter = False
    qt.TextFileTabDelimiter = False
    qt.TextFileSemicolonDelimiter = False
    qt.TextFileCommaDelimiter = True
    qt.TextFileSpaceDelimiter = False
    qt.Refresh(BackgroundQuery=False)

    # REMOVE QUERYTABLE
    for qt in ws.QueryTables:
        qt.Delete()

    # CLOSES WORKBOOK AND SAVES CHANGES
    wb.Close(True)

except Exception as e:
    print(e)

finally:
    # Release the COM references
    qt = None
    ws = None
    wb = None
    xl = None
Alternatively, create a new macro in VBA (placed in a standalone module) and have Python call it, passing the csv file path as an argument:
VBA
Public Sub ImportCSV(strfile As String)
    Dim qt As QueryTable

    ' DELETE PREVIOUS DATA
    ThisWorkbook.Sheets("Source").Range("B:Q").EntireColumn.Delete

    ' ADD QUERYTABLE (DESTINATION MUST BE ON THE SAME SHEET)
    With ThisWorkbook.Sheets("Source").QueryTables.Add(Connection:="TEXT;" & strfile, _
        Destination:=ThisWorkbook.Sheets("Source").Cells(2, 2))
        .TextFileParseType = xlDelimited
        .TextFileConsecutiveDelimiter = False
        .TextFileTabDelimiter = False
        .TextFileSemicolonDelimiter = False
        .TextFileCommaDelimiter = True
        .TextFileSpaceDelimiter = False
        .Refresh BackgroundQuery:=False
    End With

    ' REMOVE QUERYTABLE
    For Each qt In ThisWorkbook.Sheets("Source").QueryTables
        qt.Delete
    Next qt

    Set qt = Nothing
End Sub
Python
import pandas as pd
import win32com.client as win32

# ...same pandas code...

macrofile = "C:\\Path\\To\\Macro\\Workbook.xlsm"
strfile = "C:\\Path\\To\\CSV\\Output.csv"

df.to_csv(strfile)

try:
    xl = win32.gencache.EnsureDispatch('Excel.Application')
    wb = xl.Workbooks.Open(macrofile)

    # Run the VBA macro, passing the csv path as its argument
    xl.Application.Run('ImportCSV', strfile)

    wb.Close(True)  # close the workbook and save changes
    xl.Quit()       # parentheses are required to actually call the method

except Exception as e:
    print(e)

finally:
    # Release the COM references
    wb = None
    xl = None
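Answer 2 (score: 0)
Sorry for being a bit late in coming back to update my question. In the end I solved my problem with the openpyxl package.
Here is my final code: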
import os
import string

import numpy as np
import openpyxl
import pandas as pd

path = ""         # folder containing the Excel file (fill in)
target_file = ""  # Excel filename (fill in)
sheetname = ""    # name of the worksheet to work on (fill in)

filename = os.path.join(path, target_file)
wb = openpyxl.load_workbook(filename, keep_vba=True)
sheet = wb[sheetname]

# To Erase All Values within Selected Columns
# Map column numbers 1..26 to letters 'A'..'Z' (enough for columns B to Q)
d = dict(zip(range(1, 27), string.ascii_uppercase))

max_row = sheet.max_row
max_col = sheet.max_column
for row in range(2, max_row + 1):      # skip row 1: it holds the headers
    for col in range(2, max_col + 1):  # skip column 1: it holds formulas
        sheet['{}{}'.format(d[col], row)] = None

# To Write Values to the Blank Worksheet
path_compiled = ""    # folder containing the csv file (fill in)
target_compiled = ""  # csv filename (fill in)
filename_compiled = os.path.join(path_compiled, target_compiled)
compiled = pd.read_csv(filename_compiled, low_memory=False, encoding="ISO-8859-1")

n_rows, n_cols = compiled.shape
for i in range(n_rows):
    row = i + 2      # Excel row: row 1 holds the headers, so data starts at row 2
    for j in range(n_cols):
        col = j + 2  # Excel column: column A holds formulas, so data starts at column B
        value = compiled.iat[i, j]
        # Convert NumPy scalars to plain Python types before writing
        if isinstance(value, np.floating):
            value = float(value)
        elif isinstance(value, np.integer):
            value = int(value)
        sheet['{}{}'.format(d[col], row)] = value

wb.save(filename)
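The index arithmetic in the write loop (dataframe position versus worksheet cell) is easy to get wrong by one. As a minimal sketch, the mapping can be separated from the workbook I/O and checked on its own. The names `iter_cell_writes`, `first_row`, and `first_col` are hypothetical and not part of openpyxl or pandas. The defaults assume headers in row 1 and formulas in column A, as in the question.

```python
def iter_cell_writes(rows, first_row=2, first_col=2):
    """Yield (excel_row, excel_col, value) for a 2-D list of values.

    Excel rows and columns are 1-based. The defaults start at cell B2,
    leaving row 1 (headers) and column A (formulas) untouched.
    """
    for i, row_values in enumerate(rows):
        for j, value in enumerate(row_values):
            yield first_row + i, first_col + j, value


# Example with two data rows and three columns.
writes = list(iter_cell_writes([[1, 2, 3], [4, 5, 6]]))
print(writes[0])   # (2, 2, 1)
print(writes[-1])  # (3, 4, 6)
```

With openpyxl, each triple could then be applied with `sheet.cell(row=r, column=c, value=v)`, which addresses cells by number and so avoids the letter lookup table.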