Question

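I have a database in SQLite with c. 300 tables. At the moment I am iterating through a list of table names and appending the data from each one.

Is there a faster or more Pythonic way of doing this?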

df = []
for i in Ave.columns:
    try:
        df2 = get_mcap(i)
        df.append(df2)
        #print (i)
    except Exception:
        # skip tickers whose table is missing or cannot be read
        pass
df = pd.concat(df, axis=0)

Ave is a dataframe whose columns I want to iterate over; each column name is the name of a table.

import sqlite3
import pandas as pd

def get_mcap(Ticker):
    cnx = sqlite3.connect('Market_Cap.db')
    df = pd.read_sql_query("SELECT * FROM '%s'"%(Ticker), cnx)
    df.columns = ['Date', 'Mcap-Ave', 'Mcap-High', 'Mcap-Low']
    df = df.set_index('Date')
    df.index = pd.to_datetime(df.index)
    cnx.close()  # close must be called; a bare cnx.close does nothing
    return df

Answer 1

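Before I post my solution, I should include a quick warning: you should never use string manipulation to generate SQL queries unless it is absolutely unavoidable. If it is unavoidable, you need to be certain that you control the data used to format the string and that it contains nothing that would make the query do something unintended.

That said, this seems to be one of those situations where you do need string formatting, because a table name cannot be passed as a query parameter. Just make sure users have no way of altering what is in the list of tables.

Now for the solution. It looks like you can get your list of tables using: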
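One way to act on that warning is to accept only names that already exist as tables in the database, and to quote each name before formatting it into SQL. The following is a minimal sketch, not part of the original answer: the helper names `existing_tables` and `quote_identifier`, the in-memory database, and the sample table names are all hypothetical.

```python
import sqlite3

def existing_tables(cnx, candidates):
    """Keep only the candidate names that are real tables in this database."""
    rows = cnx.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    known = {name for (name,) in rows}
    return [name for name in candidates if name in known]

def quote_identifier(name):
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'

# Hypothetical in-memory database standing in for Market_Cap.db
cnx = sqlite3.connect(":memory:")
cnx.execute('CREATE TABLE "AAPL" (Date TEXT, Mcap REAL)')
cnx.execute('CREATE TABLE "MSFT" (Date TEXT, Mcap REAL)')

requested = ["AAPL", "MSFT", "nope; DROP TABLE AAPL"]
safe_tables = existing_tables(cnx, requested)   # the bogus name is dropped
quoted = [quote_identifier(t) for t in safe_tables]
cnx.close()
```

Filtering against `sqlite_master` also removes the need for a try/except around missing tables, since names with no matching table never reach the query.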

tables = Ave.columns.tolist()

For my simple example, I'm going to use:

tables = ['table1', 'table2', 'table3']

Then use the following code to generate a single query:

query_template = 'select * from {}'
query_parts = []
for table in tables:
    query = query_template.format(table)
    query_parts.append(query)
full_query = ' union all '.join(query_parts)

which gives:

'select * from table1 union all select * from table2 union all select * from table3'
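A plain `union all` of `select *` no longer records which table each row came from. If that matters, one possible variation is to select the table name as a literal column in each part. This is a sketch under assumptions: the function name `build_union_query` and the column alias `Ticker` are hypothetical and not from the original answer.

```python
def build_union_query(tables):
    """Build one UNION ALL query that tags every row with its source table."""
    parts = []
    for table in tables:
        # Double-quote the identifier and single-quote the string literal,
        # doubling any embedded quote characters in each case.
        ident = '"' + table.replace('"', '""') + '"'
        label = "'" + table.replace("'", "''") + "'"
        parts.append("select {} as Ticker, * from {}".format(label, ident))
    return " union all ".join(parts)

full_query = build_union_query(['table1', 'table2'])
```

SQLite limits the number of terms in a compound SELECT (500 by default, as far as I know), so roughly 300 tables should fit in one query, but a much longer list may need to be split into batches.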

Then you can simply execute this one query to get your result:

cnx = sqlite3.connect('Market_Cap.db')
df = pd.read_sql_query(full_query, cnx)

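From here you should be able to set the index, convert to datetime, and so on, but now you only need to do those operations once rather than 300 times. I'd expect the overall runtime to be faster now, though I haven't timed it.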
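The one-time post-processing could look roughly like this. It is a sketch that reuses the column names from the question; the in-memory database and its two sample tables are hypothetical stand-ins for Market_Cap.db.

```python
import sqlite3
import pandas as pd

# Hypothetical stand-in for Market_Cap.db with two tiny tables
cnx = sqlite3.connect(":memory:")
for name in ("table1", "table2"):
    cnx.execute(
        'CREATE TABLE "{}" (Date TEXT, Ave REAL, High REAL, Low REAL)'.format(name)
    )
    cnx.execute(
        'INSERT INTO "{}" VALUES (?, ?, ?, ?)'.format(name),
        ("2020-01-02", 1.0, 2.0, 0.5),
    )

full_query = 'select * from "table1" union all select * from "table2"'
df = pd.read_sql_query(full_query, cnx)
cnx.close()

# These steps now run once on the combined frame instead of once per table
df.columns = ['Date', 'Mcap-Ave', 'Mcap-High', 'Mcap-Low']
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
```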

SQLite: selecting from multiple tables
