Sqlite and Excel pivot tables - ODBC

Time: 2014-11-08 00:56:45

Tags: python excel sqlite excel-vba odbc vba

I'm having some trouble creating a pivot table from a sqlite3 database. I set up the sqlite connection using the ODBC driver from here: http://www.ch-werner.de/sqliteodbc/

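Once Excel is connected to my sqlite database file, I can build pivot tables without any problem, so long as the database does not contain too many records. It works with 1,000,000 rows, but when I try to load a 2,000,000-row database, Excel gives me errors.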

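Below is some sample Python code that generates a sqlite3 database with 1,000,000 records (which works fine). But when I create a 2,000,000-row database (changing Number_of_Loops = 10 to Number_of_Loops = 20), Excel's status bar says "Waiting for the query to be executed" and then throws these errors: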

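[Screenshot of the first Excel error dialog. Caption: "No, Microsoft, this was not helpful whatsoever"]

[Screenshot of the second Excel error dialog. Caption: "Not that helpful either..."]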

import sqlite3
import pandas as pd
import numpy as np

# Open an sqlite connection
db = sqlite3.connect("MyData.sqlite")
cursor = db.cursor()

# Create the table where we're going to put all the data
cursor.execute('''CREATE TABLE MyData('Date/Time' TIMESTAMP,
                                        'col1' REAL,
                                        'col2' REAL,
                                        'col3' REAL,
                                        'col4' REAL,
                                        'col5' REAL,
                                        'col6' REAL,
                                        'col7' REAL,
                                        'col8' REAL,
                                        'col9' REAL,
                                        'col10' REAL)
                ''')
# Commit the table creation
db.commit()


# Generate data to put into the table
Number_of_Loops = 10 # Number of loops to go through (total number of records = Number_of_Loops*Records_in_Each_Loop)
Records_in_Each_Loop = 100000 # Number of records to make per loop
for i in range(Number_of_Loops):
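    print(i)  # Progress indicator (print is a function in Python 3)
    # Build one batch of Records_in_Each_Loop rows; each batch is sent to the db in a single call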
    Dateindex = pd.date_range('2000-1-1', periods=Records_in_Each_Loop, freq='H')
    Data = {'col1': np.random.randn(Records_in_Each_Loop),
            'col2': np.random.randn(Records_in_Each_Loop),
            'col3': np.random.randn(Records_in_Each_Loop),
            'col4': np.random.randn(Records_in_Each_Loop),
            'col5': np.random.randn(Records_in_Each_Loop),
            'col6': np.random.randn(Records_in_Each_Loop),
            'col7': np.random.randn(Records_in_Each_Loop),
            'col8': np.random.randn(Records_in_Each_Loop),
            'col9': np.random.randn(Records_in_Each_Loop),
            'col10': np.random.randn(Records_in_Each_Loop)}
    # Create a dataframe
    df = pd.DataFrame(Data, index = Dateindex)

    # Have to convert the time stamps into strings for the database
    df['Date/Time'] = df.index
    df['Date/Time'] = df['Date/Time'].apply(str)


    # Send the data to the sqlite database
    df.to_sql(name='MyData', con=db, if_exists='append',index=False)


# Close the database connection
db.close()

I'm at a bit of a loss here... does anyone know what might be causing this?

1 Answer:

Answer 0: (score: 0)

Excel enforces limits on the size of pivot tables (look about halfway down the list).

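One of them is that a field cannot have more than 1,048,576 unique items (this equals the maximum number of rows in an Excel worksheet: probably not a coincidence!).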
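To check whether this limit is what the question's database runs into, one option is to count the distinct values in each column before pointing Excel at it. The following is only a minimal sketch: the helper names distinct_counts and columns_over_limit are made up for illustration, and the constant assumes the per-field cap of 1,048,576 (2**20) unique items from Excel's published specifications.

```python
import sqlite3

# Assumed cap on unique items per pivot-table field (2**20), taken from
# Excel's published specifications; adjust if your Excel version differs.
EXCEL_UNIQUE_ITEM_LIMIT = 1_048_576


def distinct_counts(db_path, table):
    """Return {column_name: number_of_distinct_non_NULL_values} for one table.

    Hypothetical helper, not part of the original question or answer.
    """
    con = sqlite3.connect(db_path)
    try:
        quoted_table = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in con.execute(f"PRAGMA table_info({quoted_table})")]
        counts = {}
        for col in columns:
            # Double-quote the identifier so names such as Date/Time are accepted.
            quoted_col = '"' + col.replace('"', '""') + '"'
            # COUNT(DISTINCT ...) ignores NULLs.
            (n,) = con.execute(
                f"SELECT COUNT(DISTINCT {quoted_col}) FROM {quoted_table}"
            ).fetchone()
            counts[col] = n
        return counts
    finally:
        con.close()


def columns_over_limit(counts, limit=EXCEL_UNIQUE_ITEM_LIMIT):
    """Return the sorted names of columns whose distinct count exceeds the limit."""
    return sorted(col for col, n in counts.items() if n > limit)
```

Calling columns_over_limit(distinct_counts("MyData.sqlite", "MyData")) on the 2,000,000-row database would be expected to list col1 through col10, since random floats are almost certainly all distinct. Date/Time should stay under the limit, because every loop iteration restarts the date range at 2000-1-1 and so produces the same 100,000 timestamps.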