Overwriting data in an existing workbook with Python

Date: 2017-06-14 23:12:51

Tags: python excel openpyxl xlwings

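I am new to Python and am working on a project I could use some help with. I am trying to modify an existing Excel workbook in order to compare stock data. Luckily, there is a program online that retrieves all the data I need, and I have successfully pulled the data and written it to a new Excel file. However, the goal is to pull the data and put it into an existing Excel file, and I also need to overwrite the cell values in that file. I believe xlwings can do this and I think my code is on the right track, but I ran into an unexpected error. The error I get is: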

TypeError: Objects of type 'Period' can not be converted to a COM VARIANT (but obtaining the buffer() of this object could)

Does anyone know why this error occurs, and how to fix it? Can it be fixed at all, or is my code wrong? Any help or guidance is appreciated. Thanks.

import good_morning as gm
import pandas as pd
import xlwings as xw

#import income statement, balance sheet, and cash flow of AAPL
fd = gm.FinancialsDownloader()
fd_frames = fd.download('AAPL')

#Creates a DataFrame for only the balance sheet
df1 = pd.DataFrame(list(fd_frames.values())[0])

#connects to workbook I want to modify 
wb = xw.Book(r'C:\Users\vince\Project\Spreadsheet.xlsm')

#sheet I would like to modify
sht = wb.sheets[1]

#modifies & overwrites values in my spreadsheet(this is where I get the type_error)
sht.range('M6').value = df1

Data types

type(fd_frames)
>>> <class 'dict'>
list(fd_frames.values())[0].info()
>>> <class 'pandas.core.frame.DataFrame'> 
RangeIndex: 22 entries, 0 to 21 
Data columns (total 8 columns): 
parent_index 22 non-null int64 
title 22 non-null object 
2012 19 non-null float64 
2013 20 non-null float64 
2014 20 non-null float64 
2015 20 non-null float64 
2016 20 non-null float64 
2017 20 non-null float64 
dtypes: float64(6), int64(1), object(1) 
memory usage: 1.5+ KB
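
The TypeError names 'Period', which suggests (the post does not confirm it) that the DataFrame carries pandas Period objects, for example as the year column labels, and that xlwings cannot pass them to Excel over COM. Below is a minimal sketch of one possible workaround, assuming that is the cause: turn every label that is not a plain Python type into text before writing. `to_plain`, `PLAIN_TYPES` and the stand-in class `FakePeriod` are hypothetical names used only for illustration.

```python
import datetime

# Types assumed in this sketch to be safe to pass on unchanged.
PLAIN_TYPES = (str, int, float, bool, type(None), datetime.date, datetime.datetime)


def to_plain(value):
    """Return value unchanged if it is a plain Python type, else its str() form."""
    if isinstance(value, PLAIN_TYPES):
        return value
    return str(value)


class FakePeriod(object):
    """Hypothetical stand-in for a pandas Period, used only to show the conversion."""

    def __init__(self, year):
        self.year = year

    def __str__(self):
        return str(self.year)


# Column labels similar to those in the info() output, with two non-plain labels.
labels = ['parent_index', 'title', FakePeriod(2016), FakePeriod(2017)]
plain_labels = [to_plain(label) for label in labels]
print(plain_labels)
```

With a real DataFrame the same idea would be `df1.columns = [to_plain(c) for c in df1.columns]` before `sht.range('M6').value = df1`. Whether this removes the error depends on where the Period objects actually sit.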

2 answers:

Answer 0 (score: 0)

  

Comment: You have a dict of pandas.DataFrame

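Selecting from the dict with list(fd_frames.values())[0] leads to unpredictable results, because dict ordering is not guaranteed in Python versions before 3.7. Print the dict's keys and use them to select the frame you are interested in, for example: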

 print(fd_frames.keys())
 >>> dict_keys(['key_1', 'key_2', 'key_n'])
 df_2 = fd_frames['key_2']
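
Selecting by key can be wrapped so that a wrong key reports which keys exist. This is a minimal sketch: `select_frame` is a hypothetical helper, and the keys and placeholder string values stand in for the real contents of fd_frames, which the post does not show.

```python
def select_frame(frames, key):
    """Return frames[key], listing the available keys if the key is missing."""
    try:
        return frames[key]
    except KeyError:
        raise KeyError('{!r} not found; available keys: {}'.format(key, sorted(frames)))


# Hypothetical keys and placeholder values standing in for fd_frames.
frames = {'key_1': 'frame 1', 'key_2': 'frame 2', 'key_n': 'frame n'}
df_2 = select_frame(frames, 'key_2')
print(df_2)
```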

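Besides this, neither dimension of your pandas.DataFrame matches M6:M30, which is 25 cells. There are only 8 columns with 20 rows. You therefore have to align the worksheet range to 20 rows. To write column 2017 to the worksheet, for example: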

wb['M6:M25'] = df_2['2017'].values
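
Rather than counting rows by hand, the range can be derived from the number of values to be written. Note that the info() output in the question reports 22 entries (20 of them non-null), so the length of the array is worth checking before choosing a range. A minimal sketch, where `column_range` is a hypothetical helper and not part of openpyxl or xlwings:

```python
import re


def column_range(start_cell, n_values):
    """Return an A1-style range covering n_values cells downward from start_cell."""
    match = re.fullmatch(r'([A-Za-z]+)([1-9][0-9]*)', start_cell)
    if match is None:
        raise ValueError('Not an A1-style cell reference: {!r}'.format(start_cell))
    if n_values < 1:
        raise ValueError('n_values must be at least 1')
    column = match.group(1).upper()
    first_row = int(match.group(2))
    last_row = first_row + n_values - 1
    return '{0}{1}:{0}{2}'.format(column, first_row, last_row)


print(column_range('M6', 20))
print(column_range('M6', 22))
```

The result could then be used as `wb[column_range('M6', len(values))] = values`, so the range always matches the number of values.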
  

Note: I have updated the code below to accept numpy.ndarray

  

Question: ... the goal is to pull the data and put it into an existing Excel file

Update a workbook's worksheet range with list values, using OpenPyXL: a Python library to read/write Excel 2010 xlsx/xlsm files

  

Note: Observe how the list values have to be arranged!
   param values: List: [row 1(col1, ... ,coln), ..., row n(col1, ... ,coln)]
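
The arrangement in the note means a flat list in row-major order: all cells of row 1, then all cells of row 2, and so on. A minimal sketch of turning a list of rows into that shape; `flatten_rows` is a hypothetical helper and the numbers are made up for illustration.

```python
from itertools import chain


def flatten_rows(rows):
    """Flatten [[r1c1, r1c2], [r2c1, r2c2]] into [r1c1, r1c2, r2c1, r2c2]."""
    rows = [list(row) for row in rows]
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError('Rows have different lengths: {}'.format(sorted(widths)))
    return list(chain.from_iterable(rows))


# Three rows of two columns each, e.g. for a 3 x 2 range such as 'M6:N8'.
table = [[2016, 2017], [10.0, 11.0], [20.0, 21.0]]
flat = flatten_rows(table)
print(flat)
```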

import numpy
from openpyxl import load_workbook
from openpyxl.utils import range_boundaries

class UpdateWorkbook(object):
    def __init__(self, fname, worksheet=0):
        self.fname = fname
        self.wb = load_workbook(fname)
        self.ws = self.wb.worksheets[worksheet]

    def save(self):
        self.wb.save(self.fname)

    def __setitem__(self, _range, values):
        """
        Assign values to a worksheet range.
        :param _range: String, e.g. 'M6:M30'
        :param values: List: [row 1(col1, ... ,coln), ..., row n(col1, ... ,coln)]
        :return: None
        """
        # Validate the input before any cell is touched
        if not isinstance(values, (list, numpy.ndarray)):
            raise ValueError('Values Type Error: Values have to be "list" or "numpy.ndarray": values={}'.
                             format(type(values)))
        if isinstance(values, numpy.ndarray) and values.ndim > 1:
            raise ValueError('Values Type Error: Values of Type numpy.ndarray must have ndim=1; values.ndim={}'.
                             format(values.ndim))

        # The number of cells in the range must equal the number of values
        min_col, min_row, max_col, max_row = range_boundaries(_range)
        cols = (max_col - min_col) + 1
        rows = (max_row - min_row) + 1
        if cols * rows != len(values):
            raise ValueError('Number of List Values:{} does not match Range({}):{}'.
                             format(len(values), _range, cols * rows))

        # Fill the range row by row, consuming one value per cell
        value = iter(values)
        for row_cells in self.ws.iter_rows(min_col=min_col, min_row=min_row,
                                           max_col=max_col, max_row=max_row):
            for cell in row_cells:
                cell.value = next(value)
  

Usage

wb = UpdateWorkbook(r'C:\Users\vince\Project\Spreadsheet.xlsx', worksheet=1)
df_2 = fd_frames['key_2']
wb['M6:M25'] = df_2['2017'].values
wb.save()

Tested with Python: 3.4.2 - openpyxl: 2.4.1 - LibreOffice: 4.3.3.2

Answer 1 (score: 0)

Here is how I do a similar procedure for other Stack Explorers:

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# ... create your pandas DataFrame df ...

# Writing from pandas back to an existing EXCEL workbook
# Load workbook
wb = load_workbook(filename=target, read_only=False, keep_vba=True)
ws = wb['Sheet1']

# Overwrite Existing data in sheet with a dataframe.
rows = dataframe_to_rows(df, index=False, header=True)

for r_idx, row in enumerate(rows, 1):
    for c_idx, value in enumerate(row, 1):
        ws.cell(row=r_idx, column=c_idx, value=value)

# Save file
wb.save('outfile.xlsm')
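
The loop above starts at row 1, column 1, so it overwrites the sheet from A1. To start at another cell such as M6 (row 6, column 13), as in the question, the start values passed to enumerate can be changed. A minimal sketch of that idea: `iter_cells` is a hypothetical helper, and the plain lists with made-up values stand in for the output of dataframe_to_rows.

```python
def iter_cells(rows, start_row=1, start_col=1):
    """Yield (row, column, value) triples for rows placed at the given top-left cell."""
    for r_idx, row in enumerate(rows, start_row):
        for c_idx, value in enumerate(row, start_col):
            yield r_idx, c_idx, value


# Made-up rows standing in for dataframe_to_rows(df, index=False, header=True).
rows = [['title', '2017'], ['Cash', 1.0]]

# M6 is row 6, column 13.
cells = list(iter_cells(rows, start_row=6, start_col=13))
print(cells)
```

In the answer's loop this would become `for r, c, v in iter_cells(rows, 6, 13): ws.cell(row=r, column=c, value=v)`.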