Question

I created a database using pandas:

import numpy as np
import sqlite3
import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

df = pd.DataFrame(np.random.normal(0, 1, (10, 2)), columns=['A', 'B'])

path = 'sqlite:////home/username/Desktop/example.db'

engine = create_engine(path, echo=False)

df.to_sql('flows', engine, if_exists='append', index=False)

# This is only to show I am able to read the database
df_l = pd.read_sql("SELECT * FROM flows WHERE A>0 AND B<0", engine)

Now I would like to add one or more indexes to the database. In this case, I would like to index only column A first, and then both columns.

How can I do this?

If possible, I would like a solution that uses only SQLAlchemy, so that it is independent of the choice of database.

Answer 1

You should use reflection to get hold of the table that pandas created for you.

With reference to:

SQLAlchemy Reflecting Database Objects

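A Table object can be instructed to load information about itself from the corresponding database schema object already existing within the database. This process is called reflection. In the most simple case you need only specify the table name, a MetaData object, and the autoload=True flag. If the MetaData is not persistently bound, also add the autoload_with argument: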
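As a rough sketch of what the quoted passage describes, the snippet below reflects a single table by name. It assumes an in-memory SQLite database and a hand-created `flows` table standing in for the one pandas wrote. It passes only `autoload_with`, which implies reflection on its own; the `autoload=True` flag from the quote is deprecated since SQLAlchemy 1.4.

```python
import sqlalchemy
from sqlalchemy import create_engine

# In-memory SQLite database, used purely for illustration (an assumption;
# the question uses a file-based database instead).
engine = create_engine("sqlite://")

# Create a small table to reflect, standing in for the one pandas wrote.
with engine.begin() as conn:
    conn.execute(sqlalchemy.text('CREATE TABLE flows ("A" FLOAT, "B" FLOAT)'))

meta = sqlalchemy.MetaData()
# autoload_with tells SQLAlchemy which engine to read the schema from.
flows = sqlalchemy.Table("flows", meta, autoload_with=engine)

# The reflected Table now knows its columns without any manual declaration.
column_names = [col.name for col in flows.columns]
print(column_names)
```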

You could try this:

# reflect all existing tables from the database into a MetaData object
meta = sqlalchemy.MetaData()
meta.reflect(bind=engine)
flows = meta.tables['flows']
# alternative to retrieving the table from meta
# (autoload=True is deprecated since SQLAlchemy 1.4; autoload_with is enough):
#flows = sqlalchemy.Table('flows', meta, autoload_with=engine)

# define an index on column A and emit CREATE INDEX
my_index = sqlalchemy.Index('flows_idx', flows.columns.get('A'))
my_index.create(bind=engine)

# let's confirm it is there
inspector = sqlalchemy.inspect(engine)
print(inspector.get_indexes('flows'))
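The question also asks for an index covering both columns. A minimal sketch of how that might look follows, assuming an in-memory SQLite database and a hand-created `flows` table in place of the pandas one. The index names `flows_a_idx` and `flows_a_b_idx` are made up for illustration.

```python
import sqlalchemy
from sqlalchemy import create_engine

# In-memory SQLite database for illustration only (an assumption).
engine = create_engine("sqlite://")

# Stand-in for the table that df.to_sql('flows', ...) would create.
with engine.begin() as conn:
    conn.execute(sqlalchemy.text('CREATE TABLE flows ("A" FLOAT, "B" FLOAT)'))

# Reflect the existing table so its columns can be referenced.
meta = sqlalchemy.MetaData()
meta.reflect(bind=engine)
flows = meta.tables["flows"]

# Single-column index on A, then a composite index on (A, B).
# Both index names are hypothetical.
idx_a = sqlalchemy.Index("flows_a_idx", flows.c["A"])
idx_ab = sqlalchemy.Index("flows_a_b_idx", flows.c["A"], flows.c["B"])
idx_a.create(bind=engine)
idx_ab.create(bind=engine)

# Map each index name to the columns it covers, as reported by the database.
inspector = sqlalchemy.inspect(engine)
indexes = {ix["name"]: ix["column_names"] for ix in inspector.get_indexes("flows")}
print(indexes)
```

Passing several columns to `Index` yields one composite index rather than several separate ones, so the column order matters for which queries can use it.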

sqlalchemy add index to existing sqlite3 database
