How can I insert a DataFrame into Excel with xlwings without getting a pywintypes.com_error?

Time: 2014-11-30 11:18:19

Tags: python excel pandas xlwings

I am using Excel with xlwings. I have a workbook, book.xlsm, whose first sheet has a button assigned to the following VBA macro:

book.xlsm!ThisWorkbook.get_data

In VBA I added the following, which runs when the button is clicked and the macro executes:

Sub get_data()
    RunPython ("import my_script; my_script.get_data()")
End Sub

my_script looks like this:

import pandas as pd
from xlwings import Workbook, Range

def get_data():
    wb = Workbook.caller()

    df = pd.read_csv("data.csv")
    Range("Sheet2", "A1").value = df

The problem I am running into is this error:

pywintypes.com_error: (-2147024882, 'Not enough storage is available to complete this operation.', None, None)

The data.csv file has 150000 rows and 120 columns. With less data, it runs without errors.

Update: There is no fix yet, but a workaround is given in the comments at https://github.com/ZoomerAnalytics/xlwings/issues/77

I am using the following:

df = pd.read_csv(csv_file, na_values={"", " ", "-"})
df.fillna("-", inplace=True)
startcell = 'A1'
chunk_size = 2500
if len(df) <= (chunk_size + 1):
    Range(sheet_name, startcell, index=False).value = df
else:  # chunk df and dump each (default is 10k)
    c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I)
    cL = c.group(1)
    cN = int(c.group(2))
    n = 0
    for i in (df[rw:rw + chunk_size] for rw in xrange(0, len(df), chunk_size)):
        if n == 0:
            Range(sheet_name, cL + str(cN+n), index=False).value = i
            cN += chunk_size
        else:
            Range(sheet_name, cL + str(cN+n)).value = i.values
            cN += chunk_size
        n += 1

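The problem is that when the data is inserted into the worksheet there is a blank row at 5002, and again at 7503, 10004, and so on. I realize there is a bug in my code, but I can't find it.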

3 Answers:

Answer 0: (score: 2)

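A workaround function was posted on the GitHub issue page. It splits the DataFrame into smaller chunks and inserts them into Excel one at a time. Unfortunately, as you noticed, that function is buggy and leaves blank rows between the chunks.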
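The blank rows can be reproduced without Excel by tracking which worksheet rows each write would cover. The sketch below is only a simulation: simulate_question_loop and blank_rows are names made up for this illustration, not part of xlwings, and it assumes the first chunk is written with a header row while later chunks are written as bare values, as in the question's loop.

```python
def simulate_question_loop(n_rows, chunk_size=2500, start_row=1):
    """Return the (first_row, last_row) each chunk would cover in the sheet."""
    spans = []
    cN, n = start_row, 0
    for rw in range(0, n_rows, chunk_size):
        size = min(chunk_size, n_rows - rw)
        first = cN + n  # same address arithmetic as the question
        # The first chunk is written as a DataFrame, so it has a header row.
        height = size + 1 if n == 0 else size
        spans.append((first, first + height - 1))
        cN += chunk_size
        n += 1
    return spans


def blank_rows(spans):
    """List the worksheet rows left empty between consecutive chunks."""
    gaps = []
    for (_, prev_last), (next_first, _) in zip(spans, spans[1:]):
        gaps.extend(range(prev_last + 1, next_first))
    return gaps


print(blank_rows(simulate_question_loop(12500)))
```

The gaps come from adding the growing counter n on top of cN. The header only ever shifts the data down by one row, so a constant offset of 1 after the first chunk would close them.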

I modified the function, and it now works correctly.

import re
from xlwings import Range  # pre-0.9 xlwings API, as in the question

# Dumps a large DataFrame in Excel via xlwings.
# Does not include headers.
def dump_largeDF(df, startcell='A1', chunk_size=100000):
    if len(df) <= (chunk_size + 1):
        Range(startcell, index=False, header=False).value = df
    else:  # Chunk df and dump each chunk
        c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I)  # e.g. 'A1'
        col_letter = c.group(1)    # column letters, e.g. 'A'
        row_num = int(c.group(2))  # row number, e.g. 1
        for chunk in (df[rw:rw + chunk_size] for rw in
                      range(0, len(df), chunk_size)):
            print("Dumping chunk in %s%s" % (col_letter, row_num))
            Range(col_letter + str(row_num), index=False, header=False).value = chunk
            row_num += chunk_size

A chunk size of 100k works for me, but you can change it to suit your needs.
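The regular expression in the function splits the start cell into its column letters and its row number, and only the row number advances from chunk to chunk. A minimal sketch of that parsing step on its own (split_cell is a hypothetical helper name, and rejecting anything that is not a plain address such as 'A1' is an assumption of this sketch):

```python
import re


def split_cell(address):
    """Split an address such as 'A1' into ('A', 1)."""
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", address)
    if match is None:
        raise ValueError("expected an address such as 'A1', got %r" % address)
    # Group 1 holds the column letters, group 2 the row number.
    return match.group(1).upper(), int(match.group(2))


col_letters, row_num = split_cell("A1")
print(col_letters, row_num)
```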

Answer 1: (score: 0)

Sorry for reviving an old post.

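When I call the function above from another function, I get various errors, mostly on the Range calls. Is it possible to write this function so that it is self-contained, with its own imports and an explicit target wb? This is what I have: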

def dump_largeDF(wb, df, sheetName, startcell, chunk_size):
    import pandas as pd
    import xlwings as xw
    import re

    if len(df) <= (chunk_size + 1):
        wb.sheets(sheetName).Range(startcell, index=False, header=False).value = df
    else: # Chunk df and dump each
        c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I) # A1
        row = c.group(1) # A
        col = int(c.group(2)) # 1
        for chunk in (df[rw:rw + chunk_size] for rw in range(0, len(df), chunk_size)):
            wb.sheets(sheetName).Range(row + str(col), index=False, header=False).value = chunk
            col += chunk_size
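
One possible shape for a self-contained version is to pass the sheet object in explicitly and address cells by (row, column) tuples, which avoids parsing the start cell. This is only a sketch, assuming xlwings 0.9 or later, where a sheet is obtained with wb.sheets[name] and a range exposes .options(...). The name dump_large_df and its parameters are illustrative and do not come from the original function.

```python
def dump_large_df(sheet, df, start_row=1, start_col=1, chunk_size=50000):
    """Write df to sheet in row chunks, without index or header.

    sheet is assumed to behave like an xlwings Sheet (xlwings >= 0.9).
    """
    row = start_row
    for start in range(0, len(df), chunk_size):
        chunk = df[start:start + chunk_size]    # row slice of the DataFrame
        target = sheet.range((row, start_col))  # (row, column) tuple address
        target.options(index=False, header=False).value = chunk
        row += len(chunk)


# Hypothetical usage, left commented out because it needs Excel:
# import xlwings as xw
# wb = xw.Book("book.xlsm")
# dump_large_df(wb.sheets["Sheet2"], df, chunk_size=50000)
```

Because the function only touches the sheet it is given, it does not depend on which workbook or sheet happens to be active when it is called.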

Answer 2: (score: 0)

For those who also want the header handled correctly without relying on Range, I modified the code a little:

import re

def dumpLargeDf(wb, df, startcell='A1', chunk_size=50000):
    # Dumps a large DataFrame in Excel via xlwings. Takes care of header.
    if len(df) <= (chunk_size + 1):
        wb.sheets.active.range(startcell).options(index=False).value = df
    else:                                       # Chunk df and dump each chunk
        c = re.match(r"([a-z]+)([0-9]+)", startcell, re.I)      # e.g. 'A1'
        col_letter = c.group(1)                                 # e.g. 'A'
        row_num = int(c.group(2))                               # e.g. 1
        useHeader = True                        # only the first chunk gets a header
        for chunk in (df[rw:rw + chunk_size] for rw in
                      range(0, len(df), chunk_size)):
            print("Dumping chunk in %s%s" % (col_letter, row_num))
            wb.sheets.active.range(col_letter + str(row_num)) \
                .options(index=False, header=useHeader).value = chunk
            # The header takes one extra worksheet row, so skip past it too;
            # otherwise the next chunk overwrites the last row of the first one.
            row_num += len(chunk) + (1 if useHeader else 0)
            useHeader = False
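
A header occupies one extra worksheet row, so the chunk written after it has to start one row lower than the chunk size alone would suggest. The sketch below checks that row bookkeeping without Excel; plan_chunks is a hypothetical helper invented for this illustration and is not part of xlwings.

```python
def plan_chunks(n_rows, chunk_size, start_row=1, header=True):
    """Yield (first_data_index, worksheet_row, write_header) per chunk."""
    sheet_row = start_row
    for start in range(0, n_rows, chunk_size):
        size = min(chunk_size, n_rows - start)
        write_header = header and start == 0
        yield start, sheet_row, write_header
        # Advance past the data rows, plus one row if a header was written.
        sheet_row += size + (1 if write_header else 0)


for first_index, sheet_row, write_header in plan_chunks(12, 5):
    print(first_index, sheet_row, write_header)
```

With 12 data rows and a chunk size of 5, the chunks land on worksheet rows 1, 7 and 12, so the header plus data fill rows 1 to 13 with no gap and no overlap.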