Dynamic Datasets and SQLAlchemy

Time: 2015-07-09 18:49:45

Tags: python sqlalchemy metaprogramming

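I am refactoring some old SQLite3 SQL statements in Python into SQLAlchemy. In our framework, we have the following SQL statements, which take a dict with certain known keys and potentially any number of unexpected keys and values (depending on what information was provided).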

import sqlite3
import sys

def dict_factory(cursor, row):
    d = {}
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d


def Create_DB(db):
    # Delete the database
    from os import remove
    remove(db)

    # Recreate it and format it as needed
    with sqlite3.connect(db) as conn:
        conn.row_factory = dict_factory
        conn.text_factory = str

        cursor = conn.cursor()

        cursor.execute("CREATE TABLE [Listings] ([ID] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE, [timestamp] REAL NOT NULL DEFAULT(( datetime ( 'now' , 'localtime' ) )), [make] VARCHAR, [model] VARCHAR, [year] INTEGER);")


def Add_Record(db, data):
    with sqlite3.connect(db) as conn:
        conn.row_factory = dict_factory
        conn.text_factory = str

        cursor = conn.cursor()

        #get column names already in table
        cursor.execute("SELECT * FROM 'Listings'")
        col_names = list(map(lambda x: x[0], cursor.description))

        #check if column doesn't exist in table, then add it
        for i in data.keys():
            if i not in col_names:
                cursor.execute("ALTER TABLE 'Listings' ADD COLUMN '{col}' {type}".format(col=i, type='INT' if type(data[i]) is int else 'VARCHAR'))

        #Insert record into table
        cursor.execute("INSERT INTO Listings({cols}) VALUES({vals});".format(cols = str(data.keys()).strip('[]'), 
                    vals=str([data[i] for i in data]).strip('[]')
                    ))

#Database filename
db = 'test.db'

Create_DB(db)

data = {'make': 'Chevy',
    'model' : 'Corvette',
    'year' : 1964,
    'price' : 50000,
    'color' : 'blue',
    'doors' : 2}
Add_Record(db, data)

data = {'make': 'Chevy',
    'model' : 'Camaro',
    'year' : 1967,
    'price' : 62500,
    'condition' : 'excellent'}
Add_Record(db, data)

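This level of dynamism is necessary because we cannot know what additional information will be provided, yet it is important that we store everything we are given. This has never been a problem in our framework, because we never expected the number of columns in our tables to grow unwieldy.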
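As a side note, the INSERT above builds its SQL by string-formatting `data.keys()` and the values, which relies on the Python 2 list repr and interpolates values directly into the statement. The add-missing-columns-then-insert logic can be sketched with bound parameters instead. This is a minimal sketch under assumptions, not the framework's code: `quote_ident` and `add_record` are hypothetical helper names, the table is trimmed to a few columns, and an in-memory database stands in for `test.db`.

```python
import sqlite3


def quote_ident(name):
    # SQLite identifiers are quoted with double quotes; embedded quotes are doubled.
    return '"' + name.replace('"', '""') + '"'


def add_record(conn, data):
    cursor = conn.cursor()

    # Column names already present in the table (row[1] is the column name)
    cursor.execute("PRAGMA table_info(Listings)")
    col_names = {row[1] for row in cursor.fetchall()}

    # Add a column for every key the table does not know yet
    for key, value in data.items():
        if key not in col_names:
            col_type = "INTEGER" if isinstance(value, int) else "VARCHAR"
            cursor.execute(
                "ALTER TABLE Listings ADD COLUMN {} {}".format(quote_ident(key), col_type)
            )

    # Column names cannot be bound, so they are quoted; values go in as parameters
    cols = ", ".join(quote_ident(k) for k in data)
    marks = ", ".join("?" for _ in data)
    cursor.execute(
        "INSERT INTO Listings ({}) VALUES ({})".format(cols, marks),
        list(data.values()),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Listings (ID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "make VARCHAR, model VARCHAR, year INTEGER)"
)
add_record(conn, {"make": "Chevy", "model": "Corvette", "year": 1964, "price": 50000})
add_record(conn, {"make": "Chevy", "model": "Camaro", "year": 1967, "condition": "excellent"})

rows = conn.execute(
    'SELECT make, model, price, "condition" FROM Listings ORDER BY ID'
).fetchall()
```

Rows inserted before a column existed simply hold NULL for it, which is the same behavior the original `ALTER TABLE` approach produces.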

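While the above code works, it is obviously not a clean implementation, so I am trying to refactor it into SQLAlchemy's cleaner, more robust ORM paradigm. I went through SQLAlchemy's official tutorials and various examples and arrived at the following code: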

from sqlalchemy import Column, String, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Listing(Base):
    __tablename__ = 'Listings'
    id = Column(Integer, primary_key=True)
    make = Column(String)
    model = Column(String)
    year = Column(Integer)

engine = create_engine('sqlite:///')

session = sessionmaker()
session.configure(bind=engine)
Base.metadata.create_all(engine)

data = {'make':'Chevy',
    'model' : 'Corvette',
    'year' : 1964}

record = Listing(**data)

s = session()
s.add(record)
s.commit()
s.close()

This works fine with the data above. Now, when I add a new key, such as

data = {'make':'Chevy',
    'model' : 'Corvette',
    'year' : 1964,
    'price' : 50000}

I get a TypeError: 'price' is an invalid keyword argument for Listing error. To try to solve this, I modified the class to be dynamic as well:

class Listing(Base):
    __tablename__ = 'Listings'
    id = Column(Integer, primary_key=True)
    make = Column(String)
    model = Column(String)
    year = Column(Integer)

    def __checker__(self, data):
        for i in data.keys():
            if i not in [a for a in dir(self) if not a.startswith('__')]:
                if type(i) is int:
                    setattr(self, i, Column(Integer))
                else:
                    setattr(self, i, Column(String))
            else:
                self[i] = data[i]

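But I quickly realized this would not work at all, for several reasons: the class has already been initialized, the data dictionary cannot be fed into the class without reinitializing it, it is more of a hack than anything, and so on. The more I think about it, the less obvious a solution using SQLAlchemy seems to me. So, my main question is: how do I implement this level of dynamism using SQLAlchemy?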

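I researched a bit to see whether anyone has had a similar problem. The closest I found was Dynamic Class Creation in SQLAlchemy, but it only talks about the constant attributes ("__tablename__" and the like). I believe the unanswered https://stackoverflow.com/questions/29105206/sqlalchemy-dynamic-attribute-change may be asking the same question. While Python is not my forte, I consider myself a highly proficient programmer in the context of scientific/engineering applications (C++ and JavaScript are my strongest languages), so I may not be using the right Python-specific keywords in my searches.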

I welcome any and all help.

1 Answer:

Answer 0 (score: 1)

class Listing(Base):
    __tablename__ = 'Listings'
    id = Column(Integer, primary_key=True)
    make = Column(String)
    model = Column(String)
    year = Column(Integer)
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            if hasattr(self, k):
                setattr(self, k, v)
            else:
                # Unknown key: add the column to the table, then map it on the class.
                # engine.execute() is the legacy (pre-2.0) SQLAlchemy API.
                engine.execute("ALTER TABLE %s ADD COLUMN %s" % (self.__tablename__, k))
                setattr(self.__class__, k, Column(k, String))
                setattr(self, k, v)

This might work... maybe... I'm not entirely sure, since I didn't test it.

A better solution would be to use a related table:

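from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship

class Attribs(Base):
    __tablename__ = 'Attribs'
    id = Column(Integer, primary_key=True)
    listing_id = Column(Integer, ForeignKey("Listings.id"))
    name = Column(String)
    val = Column(String)

class Listing(Base):
    __tablename__ = 'Listings'
    id = Column(Integer, primary_key=True)
    attributes = relationship("Attribs", backref="listing")
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            # Appending to the relationship lets the ORM fill in listing_id on flush
            self.attributes.append(Attribs(name=k, val=str(v)))
    def __str__(self):
        return "\n".join(["A LISTING"] + ["%s:%s" % (a.name, a.val) for a in self.attributes])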
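To make the related-table idea concrete, here is a minimal end-to-end sketch. It is one possible arrangement, not a tested recommendation from the answer: it assumes the known fields (`make`, `model`, `year`) stay as real columns while everything else goes into `Attribs` rows, it stores extra values as strings, and `KNOWN_COLUMNS` is a hypothetical name introduced for illustration. The imports follow the question's `sqlalchemy.ext.declarative` style.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()

# Hypothetical: the keys that have dedicated columns on Listing
KNOWN_COLUMNS = ("make", "model", "year")


class Attribs(Base):
    __tablename__ = "Attribs"
    id = Column(Integer, primary_key=True)
    listing_id = Column(Integer, ForeignKey("Listings.id"))
    name = Column(String)
    val = Column(String)


class Listing(Base):
    __tablename__ = "Listings"
    id = Column(Integer, primary_key=True)
    make = Column(String)
    model = Column(String)
    year = Column(Integer)
    attributes = relationship("Attribs", backref="listing")

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            if key in KNOWN_COLUMNS:
                setattr(self, key, value)
            else:
                # Unexpected keys become rows in the Attribs table
                self.attributes.append(Attribs(name=key, val=str(value)))


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
session.add(Listing(make="Chevy", model="Corvette", year=1964, price=50000, color="blue"))
session.commit()

loaded = session.query(Listing).one()
loaded_make = loaded.make
extras = {a.name: a.val for a in loaded.attributes}
session.close()
```

The schema never changes at runtime with this layout, at the cost of losing the original value types for the extra attributes (everything comes back as a string here).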

Another solution would be to store JSON:

import json

class Listing(Base):
    __tablename__ = 'Listings'
    id = Column(Integer, primary_key=True)
    data = Column(String)
    def __init__(self, **kwargs):
        # Store every supplied key/value as a single JSON string
        self.data = json.dumps(kwargs)
        self.data_dict = kwargs
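One caveat with the JSON variant: SQLAlchemy does not call `__init__` when it loads rows from the database, so an attribute set only in the constructor will be missing on loaded objects. A minimal sketch of one way around that, assuming a read-only `data_dict` property that decodes the stored string on access (the property is an illustrative change, not part of the answer):

```python
import json

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()


class Listing(Base):
    __tablename__ = "Listings"
    id = Column(Integer, primary_key=True)
    data = Column(String)  # JSON-encoded dict of everything that was supplied

    def __init__(self, **kwargs):
        self.data = json.dumps(kwargs)

    @property
    def data_dict(self):
        # Decode on access so that rows loaded from the database work too
        return json.loads(self.data)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
session.add(Listing(make="Chevy", model="Camaro", year=1967, condition="excellent"))
session.commit()

loaded = session.query(Listing).one()
decoded = loaded.data_dict
session.close()
```

Value types that JSON supports (such as the integer `year`) survive the round trip, but the individual keys are not ordinary columns, so filtering or sorting on them is less convenient than with real columns.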

The best solution would be to use a NoSQL key/value store (maybe even just a simple JSON file? Or perhaps shelve? Or even pickle, I guess).
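For the `shelve` option, a minimal sketch using only the standard library might look like the following. The file location, the string keys, and the `records` list are assumptions made for illustration; `shelve` pickles each value, so dicts with arbitrary keys can be stored without any schema.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf file; a temp directory keeps the sketch self-contained
path = os.path.join(tempfile.mkdtemp(), "listings")

records = [
    {"make": "Chevy", "model": "Corvette", "year": 1964, "price": 50000, "color": "blue"},
    {"make": "Chevy", "model": "Camaro", "year": 1967, "condition": "excellent"},
]

# Shelf keys must be strings; each value is pickled as-is
with shelve.open(path) as store:
    for index, record in enumerate(records):
        store[str(index)] = record

# Reopen the shelf and read a record back
with shelve.open(path) as store:
    camaro = store["1"]
    count = len(store)
```

This keeps every key that was supplied, but there is no query language: finding all listings with a given `make` means iterating over the shelf in Python.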