I am using Excel and xlwings. I have a book.xlsm whose first sheet contains a button assigned to the following VBA macro:
book.xlsm!ThisWorkbook.get_data
In VBA I added the following, which runs when the button is clicked:
Sub get_data()
    RunPython ("import my_script; my_script.get_data()")
End Sub
my_script is as follows:
import pandas as pd
from xlwings import Workbook, Range

def get_data():
    wb = Workbook.caller()
    df = pd.read_csv("data.csv")
    Range("Sheet2", "A1").value = df
The error I get is the following:
pywintypes.com_error: (-2147024882, 'Not enough storage is available to complete this operation.', None, None)
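The data.csv file has 150,000 rows and 120 columns. With less data, it runs without errors.
Update: There is no solution yet, but a workaround is provided in the comments: https://github.com/ZoomerAnalytics/xlwings/issues/77
I use the following: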
df = pd.read_csv(csv_file, na_values={"", " ", "-"})
df.fillna("-", inplace=True)
startcell = 'A1'
chunk_size = 2500
if len(df) <= (chunk_size + 1):
    Range(sheet_name, startcell, index=False).value = df
else:  # chunk df and dump each (default is 10k)
    c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I)
    cL = c.group(1)
    cN = int(c.group(2))
    n = 0
    for i in (df[rw:rw + chunk_size] for rw in xrange(0, len(df), chunk_size)):
        if n == 0:
            Range(sheet_name, cL + str(cN+n), index=False).value = i
            cN += chunk_size
        else:
            Range(sheet_name, cL + str(cN+n)).value = i.values
            cN += chunk_size
        n += 1
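The problem is that when the data is inserted into the sheet, there is an empty row at 5002, and again at 7503, 10004, and so on. I realize there is a bug in my code, but I can't find it.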
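Answer 0 (score: 2)
A workaround function was posted on the GitHub issue page. It splits the DataFrame into smaller chunks and inserts them into Excel one at a time. Unfortunately, as you noticed, that function is buggy and leaves empty rows between the chunks.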
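To see where the empty rows come from, the row arithmetic of the loop in the question can be replayed without Excel. This is only a sketch: `trace_original_loop` and `blank_rows` are hypothetical helpers, not part of xlwings, and the sketch assumes the first chunk also writes a header row (it is assigned as a DataFrame with only `index=False`), while later chunks write bare values.

```python
def trace_original_loop(n_rows, first_row=1, chunk_size=2500):
    """Replay the question's loop; return (start_row, rows_written) per chunk."""
    writes = []
    cN, n = first_row, 0
    for rw in range(0, n_rows, chunk_size):
        data_rows = min(chunk_size, n_rows - rw)
        # Assumption: only the first write emits a header row.
        rows_written = data_rows + 1 if n == 0 else data_rows
        writes.append((cN + n, rows_written))  # target row is cN + n
        cN += chunk_size
        n += 1  # the extra offset grows by one on every pass
    return writes


def blank_rows(writes):
    """Rows that fall between the end of one write and the start of the next."""
    gaps = []
    for (start, count), (next_start, _) in zip(writes, writes[1:]):
        gaps.extend(range(start + count, next_start))
    return gaps


print(blank_rows(trace_original_loop(12500)))  # [5002, 7503, 10004]
```

Under these assumptions, the counter `n` is only needed once, to skip the header row. Because it keeps growing, every chunk after the second starts one row too low.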
I modified the function, and it now works correctly.
import re
from xlwings import Range  # legacy xlwings API, as used in the question

# Dumps a large DataFrame in Excel via xlwings.
# Does not include headers.
def dump_largeDF(df, startcell='A1', chunk_size=100000):
    if len(df) <= (chunk_size + 1):
        Range(startcell, index=False, header=False).value = df
    else:  # Chunk df and dump each chunk
        c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I)  # A1
        col = c.group(1)       # column letter, e.g. A
        row = int(c.group(2))  # row number, e.g. 1
        for chunk in (df[rw:rw + chunk_size] for rw in
                      range(0, len(df), chunk_size)):
            print("Dumping chunk in %s%s" % (col, row))
            Range(col + str(row), index=False, header=False).value = chunk
            row += chunk_size  # next chunk starts right below this one
For me, a chunk size of 100k works fine, but you can change it as needed.
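When picking a chunk size, it can help to see how many separate writes it implies and how many cells the largest write carries. The sketch below uses a hypothetical helper, `writes_needed`, and the 150,000 x 120 shape from the question. It does not predict which size avoids the COM error, since that likely depends on the machine and the data.

```python
def writes_needed(n_rows, n_cols, chunk_size):
    """Return (number of range writes, cells in the largest write)."""
    n_writes = (n_rows + chunk_size - 1) // chunk_size  # ceiling division
    largest = min(n_rows, chunk_size) * n_cols
    return n_writes, largest


for size in (2500, 50000, 100000):
    print(size, writes_needed(150000, 120, size))
# 2500 (60, 300000)
# 50000 (3, 6000000)
# 100000 (2, 12000000)
```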
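Answer 1 (score: 0)
Sorry for reviving an old post.
When I run the function above as a call from another function, I get various errors, mostly about Range. Is it possible to write this function "standalone", so that it contains the imports and takes the target wb? I have: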
def dump_largeDF(wb, df, sheetName, startcell, chunk_size):
    import pandas as pd
    import xlwings as xw
    import re
    if len(df) <= (chunk_size + 1):
        wb.sheets(sheetName).Range(startcell, index=False, header=False).value = df
    else:  # Chunk df and dump each
        c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I)  # A1
        row = c.group(1)  # A
        col = int(c.group(2))  # 1
        for chunk in (df[rw:rw + chunk_size] for rw in range(0, len(df), chunk_size)):
            wb.sheets(sheetName).Range(row + str(col), index=False, header=False).value = chunk
            col += chunk_size
Answer 2 (score: 0)
For those who also want the header handled properly, without relying on Range, I modified the code a bit:
import re

def dumpLargeDf(wb, df, startcell='A1', chunk_size=50000):
    # Dumps a large DataFrame in Excel via xlwings. Takes care of header.
    if len(df) <= (chunk_size + 1):
        wb.sheets.active.range(startcell).options(index=False).value = df
    else:  # Chunk df and dump each chunk
        c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I)  # A1
        col = c.group(1)       # column letter, e.g. A
        row = int(c.group(2))  # row number, e.g. 1
        useHeader = True
        for chunk in (df[rw:rw + chunk_size] for rw in
                      range(0, len(df), chunk_size)):
            print("Dumping chunk in %s%s" % (col, row))
            wb.sheets.active.range(col + str(row)) \
                .options(index=False, header=useHeader).value = chunk
            # The header takes one extra row on the first write only;
            # without this offset the second chunk overwrites a data row.
            row += chunk_size + (1 if useHeader else 0)
            useHeader = False
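When only the first chunk carries a header, that write takes one more row than the others, so every later chunk has to start one row lower. The start cells can be checked without Excel. This is a minimal sketch; `plan_chunks` is a hypothetical helper, not part of xlwings.

```python
import re


def plan_chunks(n_rows, startcell="A1", chunk_size=50000):
    """Yield (address, start, stop, header) for each chunk to write."""
    m = re.match(r"([a-z]+)([0-9]+)$", startcell, re.I)
    if m is None:
        raise ValueError("not an A1-style address: %r" % startcell)
    col_letter, row = m.group(1), int(m.group(2))
    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        header = (start == 0)  # only the first chunk writes column names
        yield col_letter + str(row), start, stop, header
        # Advance past the data rows, plus one row for the header if written.
        row += (stop - start) + (1 if header else 0)


print(list(plan_chunks(7, "B2", 3)))
# [('B2', 0, 3, True), ('B6', 3, 6, False), ('B9', 6, 7, False)]
```

Each tuple could then drive a write in the style used above, for example `sheet.range(address).options(index=False, header=header).value = df.iloc[start:stop]`, where `sheet` and `df` are assumed to be the target sheet and the DataFrame.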