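SQLAlchemy with PostgreSQL and full-text search

Date: 2012-11-13 12:58:50

Tags: python postgresql sqlalchemy flask

I am using Flask, SQLAlchemy and Flask-SQLAlchemy. I want to create a full-text search index in Postgres using GIN and to_tsvector. At the moment I am trying the following. I think it is the closest to what I want to express, but it does not work.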

from sqlalchemy.ext.declarative import declared_attr
from sqlalchemy.schema import Index
from sqlalchemy.sql.expression import func

from app import db


class Post(db.Model):

    id = db.Column(db.Integer, primary_key=True)
    added = db.Column(db.DateTime, nullable=False)
    pub_date = db.Column(db.DateTime, nullable=True)
    content = db.Column(db.Text)

    @declared_attr
    def __table_args__(cls):
        return (Index('idx_content', func.to_tsvector("english", "content"), postgresql_using="gin"), )

This raises the following error:

Traceback (most recent call last):
  File "./manage.py", line 5, in <module>
    from app import app, db
  File "/vagrant/app/__init__.py", line 36, in <module>
    from pep.models import *
  File "/vagrant/pep/models.py", line 8, in <module>
    class Post(db.Model):
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/flask_sqlalchemy.py", line 477, in __init__
    DeclarativeMeta.__init__(self, name, bases, d)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/ext/declarative/api.py", line 48, in __init__
    _as_declarative(cls, classname, cls.__dict__)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/ext/declarative/base.py", line 222, in _as_declarative
    **table_kw)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 326, in __new__
    table._init(name, metadata, *args, **kw)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 393, in _init
    self._init_items(*args)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 63, in _init_items
    item._set_parent_with_dispatch(self)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/events.py", line 235, in _set_parent_with_dispatch
    self._set_parent(parent)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 2321, in _set_parent
    ColumnCollectionMixin._set_parent(self, table)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 1978, in _set_parent
    self.columns.add(col)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/sql/expression.py", line 2391, in add
    self[column.key] = column
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/sql/expression.py", line 2211, in __getattr__
    key)
AttributeError: Neither 'Function' object nor 'Comparator' object has an attribute 'key'

I have also tried:

return (Index('idx_content', "content", postgresql_using="gin"), )

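However, that does not work, because Postgres (at least 9.1, which is the version I am running) expects a call to to_tsvector. That line generates this SQL: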

CREATE INDEX content_index ON post USING gin (content)

rather than what I actually want:

CREATE INDEX content_index ON post USING gin(to_tsvector('english', content))

I opened a ticket, because I think this may be a bug or a limitation: http://www.sqlalchemy.org/trac/ticket/2605

3 answers:

Answer 0 (score: 4)

For now I have manually added the following lines, but I would much prefer the "correct" SQLAlchemy approach, if there is one.

from sqlalchemy import DDL, event

create_index = DDL("CREATE INDEX idx_content ON pep USING gin(to_tsvector('english', content));")
event.listen(Pep.__table__, 'after_create', create_index.execute_if(dialect='postgresql'))

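There was some interesting discussion on the SQLAlchemy bug tracker. It looks like this is a limitation of the current index definition. Basically, what I need is for an index to accept expressions rather than just column names, and that is not currently supported. This ticket tracks the feature request: http://www.sqlalchemy.org/trac/ticket/695. However, it is waiting for a developer to pick it up and do the work (and has been for some time).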
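To illustrate how the `execute_if(dialect='postgresql')` guard behaves, here is a minimal sketch that attaches the same kind of DDL listener to a table and then creates the schema on an in-memory SQLite database. The `post` table definition and the SQLite engine are assumptions made only so the example runs without a Postgres server. On SQLite the listener should be skipped, so no index is created.

```python
from sqlalchemy import (
    DDL, Column, Integer, MetaData, Table, Text, create_engine, event, inspect,
)

metadata = MetaData()

# Hypothetical table for illustration; it mirrors the columns used in the question.
post = Table(
    "post",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("content", Text),
)

# Raw DDL for the expression index, emitted only when the dialect is PostgreSQL.
create_index = DDL(
    "CREATE INDEX idx_content ON post USING gin(to_tsvector('english', content))"
)
event.listen(post, "after_create", create_index.execute_if(dialect="postgresql"))

# Assumption: SQLite stands in for a non-Postgres backend, so the DDL is skipped.
engine = create_engine("sqlite://")
metadata.create_all(engine)

inspector = inspect(engine)
table_names = inspector.get_table_names()
index_names = [ix["name"] for ix in inspector.get_indexes("post")]
print(table_names, index_names)
```

Against a PostgreSQL engine the same `create_all()` call would run the `CREATE INDEX` statement right after the table is created.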

Answer 1 (score: 1)

In SQLAlchemy 0.9 and above, this works:

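import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base, declared_attr

Base = declarative_base()


class Content(Base):
    __tablename__ = 'content'

    id = sa.Column(sa.Integer, primary_key=True)
    description = sa.Column(sa.UnicodeText, nullable=False, server_default='')

    @declared_attr
    def __table_args__(cls):
        # Pass the column object itself, not its name as a string.
        return (sa.Index('idx_content',
                         sa.sql.func.to_tsvector("english", cls.description),
                         postgresql_using="gin"), )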

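Note the difference from the first example: the column is referenced directly rather than given as a name in quotes. The quoted form does not work, because a plain string passed to func is treated as a literal value, not as a column.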
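The difference between the two forms can be checked without a database by compiling them against the PostgreSQL dialect. The sketch below is an illustration only: the `post` table is a stand-in defined here, and it assumes a SQLAlchemy version that accepts expressions in `Index` (0.9 or later, per the answer above).

```python
from sqlalchemy import Column, Index, Integer, MetaData, Table, Text, func
from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import CreateIndex

metadata = MetaData()

# Hypothetical table for illustration only.
post = Table(
    "post",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("content", Text),
)

dialect = postgresql.dialect()

# A quoted name is rendered as a string literal, not as a column reference.
quoted_sql = str(
    func.to_tsvector("english", "content").compile(
        dialect=dialect, compile_kwargs={"literal_binds": True}
    )
)

# Passing the Column object lets the index expression refer to the real column.
idx = Index(
    "idx_content",
    func.to_tsvector("english", post.c.content),
    postgresql_using="gin",
)
ddl = str(CreateIndex(idx).compile(dialect=dialect))

print(quoted_sql)
print(ddl)
```

Printing both strings makes it easy to confirm that the second form produces the `CREATE INDEX ... USING gin (to_tsvector(...))` statement the question asks for.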

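Answer 2 (score: 0)

I came across this old question while creating some single-column and multi-column tsvector GIN indexes. For anyone looking for a simple way to create these indexes using string representations of the column names, here is one approach using the SQLAlchemy text() construct.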

from sqlalchemy import Column, Index, Integer, String, text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql import func


Base = declarative_base()

def to_tsvector_ix(*columns):
    s = " || ' ' || ".join(columns)
    return func.to_tsvector('english', text(s))

class Example(Base):
    __tablename__ = 'examples'

    id = Column(Integer, primary_key=True)
    atext = Column(String)
    btext = Column(String)

    __table_args__ = (
        Index(
            'ix_examples_tsv',
            to_tsvector_ix('atext', 'btext'),
            postgresql_using='gin'
            ),
        )
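To see what the helper builds, its result can be compiled against the PostgreSQL dialect without connecting to a database. This is a small sketch that repeats the `to_tsvector_ix` helper from the answer so that it is self-contained. Note that `text()` inserts the column names verbatim, so they should only come from trusted code, never from user input.

```python
from sqlalchemy import func, text
from sqlalchemy.dialects import postgresql


def to_tsvector_ix(*columns):
    # Join the column names with a single space so words from adjacent
    # columns do not run together in the indexed document.
    s = " || ' ' || ".join(columns)
    return func.to_tsvector('english', text(s))


expr = to_tsvector_ix('atext', 'btext')

# Render bound values inline so the full SQL expression is visible.
sql = str(
    expr.compile(
        dialect=postgresql.dialect(),
        compile_kwargs={"literal_binds": True},
    )
)
print(sql)
```

A query has to use the same expression as the one indexed for Postgres to consider the index.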