Question

I want to build a SQL query to pass into the "query" option of the spark-redshift reader. I'm trying to use psycopg2 for this, so I did something like:

from datetime import datetime

from psycopg2 import sql

query = sql.SQL(
    "select * from {} where event_timestamp < {}"
).format(
    sql.Identifier("events"),
    sql.Literal(datetime.now())
).as_string()  # fails here: as_string() expects a connection or cursor

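But it tells me that I need to pass a context (a connection or a cursor) to as_string(). I can't, because I don't have any connection.

Should I use plain string formatting in this case, with some escaping?

Or is there a way to pass some mock context there? Why does it need a connection to build the query string? Does the SQL query change depending on the connection?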

Answer 1

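I'm not familiar with Spark, but I'd be surprised if it didn't have some kind of SQL support. Another option is a lightweight package like sqlbuilder.

If you really want to use psycopg, I'd suggest looking at how they set up connections for their unit tests: psycopg2's ConnectingTestCase.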

class ConnectingTestCase(unittest.TestCase):
    """A test case providing connections for tests.

    A connection for the test is always available as `self.conn`. Others can be
    created with `self.connect()`. All are closed on tearDown.

    Subclasses needing to customize setUp and tearDown should remember to call
    the base class implementations.
    """

How do I generate a SQL query for Postgres (Redshift) without a connection?
