Get rows based on the existence of a related row in the same table

Time: 2018-12-01 23:29:51

Tags: sql postgresql sqlalchemy exists

This is what my data looks like. If I execute the following query:

    select * from gdax_trades where order_type='limit' limit 5;

What I get in return looks like this:

     row_id  |               order_id               |  price  | funds | maker_order_id | taker_order_id | trade_id | product_id |              client_oid              | reason | remaining_size |    size    |  sequence  | side |          time           | order_type | event_type
    ---------+--------------------------------------+---------+-------+----------------+----------------+----------+------------+--------------------------------------+--------+----------------+------------+------------+------+-------------------------+------------+------------
     3697499 | 01d63a5b-a5b7-4153-b93d-bd18c249d9c3 | 4113.06 |       |                |                |          | BTC-USD    | 50028bab-81da-4842-98f0-2a1206669567 |        |                |       0.01 | 7446101470 | buy  | 2018-11-29 04:15:39.047 | limit      | received
     3697501 | 9295111b-2e23-445c-9f52-52d2f26fb418 | 4131.93 |       |                |                |          | BTC-USD    | de58f4a6-4577-4680-b083-df34ade6c001 |        |                | 0.12792387 | 7446101472 | sell | 2018-11-29 04:15:39.071 | limit      | received
     3697504 | 4c09878d-8bf9-49d7-9fc7-ca81b7da9e42 | 4131.19 |       |                |                |          | BTC-USD    | a55e0315-8b65-4525-a7a7-debcf6f17bb5 |        |                | 0.10898271 | 7446101475 | sell | 2018-11-29 04:15:39.155 | limit      | received
     3697506 | 0a157570-a811-420e-81ff-0ead9cc34984 | 4132.69 |       |                |                |          | BTC-USD    | 45086077-34be-441e-947f-99fe60bd88ef |        |                | 0.12146031 | 7446101477 | sell | 2018-11-29 04:15:39.24  | limit      | received
     3697508 | e8e1d02f-e627-4eac-a2e5-61c08399d6ef | 4117.83 |       |                |                |          | BTC-USD    | 00000000-818a-0006-0001-000011037107 |        |                |      0.001 | 7446101479 | sell | 2018-11-29 04:15:39.259 | limit      | received
    (5 rows)

There are other rows in the table that correspond to each order_id but do not have order_type='limit'. For example, if I try to find all rows that correspond to the first order_id:

    select * from gdax_trades where order_id='01d63a5b-a5b7-4153-b93d-bd18c249d9c3';

I get:

     row_id  |               order_id               |  price  | funds | maker_order_id | taker_order_id | trade_id | product_id |              client_oid              |  reason  | remaining_size | size |  sequence  | side |          time           | order_type | event_type
    ---------+--------------------------------------+---------+-------+----------------+----------------+----------+------------+--------------------------------------+----------+----------------+------+------------+------+-------------------------+------------+------------
     3697499 | 01d63a5b-a5b7-4153-b93d-bd18c249d9c3 | 4113.06 |       |                |                |          | BTC-USD    | 50028bab-81da-4842-98f0-2a1206669567 |          |                | 0.01 | 7446101470 | buy  | 2018-11-29 04:15:39.047 | limit      | received
     3697500 | 01d63a5b-a5b7-4153-b93d-bd18c249d9c3 | 4113.06 |       |                |                |          | BTC-USD    |                                      |          |           0.01 |      | 7446101471 | buy  | 2018-11-29 04:15:39.047 |            | open
     3697662 | 01d63a5b-a5b7-4153-b93d-bd18c249d9c3 | 4113.06 |       |                |                |          | BTC-USD    |                                      | canceled |           0.01 |      | 7446101633 | buy  | 2018-11-29 04:15:40.522 |            | done
    (3 rows)

What I want is a SQLAlchemy query that returns the rows whose order_id corresponds to a "limit" order. I tried a self-referential join, but that did not give me the desired results. Does anyone have any suggestions?

2 answers:

Answer 0: (score: 1)

There are many ways. I suggest an EXISTS semi-join. It is probably the fastest and is clear to read:

SELECT *
FROM   gdax_trades g
WHERE  EXISTS (
   SELECT FROM gdax_trades
   WHERE  order_type = 'limit'
   AND    order_id = g.order_id
   );

The SELECT list of the EXISTS expression can be left empty. Only the existence of at least one row is relevant.

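When accessing the same table twice, we need at least one table alias (g in the example). I did not table-qualify the columns in the subquery that refer to the local table, because that table is visible first. Only the reference to the outer query is qualified, as g.order_id. That is the unambiguous minimum. You can be more explicit if you like.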
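As a rough illustration of the semi-join, the sketch below runs the same query shape against a tiny in-memory table. It makes several assumptions that are not part of the original post: SQLite (through Python's sqlite3 module) stands in for PostgreSQL, the table is cut down to four columns, and the order ids "A" and "B" are invented sample data. SQLite does not accept an empty SELECT list, so the subquery uses SELECT 1 instead.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gdax_trades ("
    "row_id INTEGER PRIMARY KEY, order_id TEXT, order_type TEXT, event_type TEXT)"
)
conn.executemany(
    "INSERT INTO gdax_trades VALUES (?, ?, ?, ?)",
    [
        (1, "A", "limit", "received"),   # order A has a 'limit' row
        (2, "A", None, "open"),
        (3, "A", None, "done"),
        (4, "B", "market", "received"),  # order B has no 'limit' row
        (5, "B", None, "done"),
    ],
)

# Unqualified columns in the subquery resolve to the inner table first;
# only the reference to the outer query needs the alias g.
semi_join_rows = conn.execute(
    """
    SELECT g.row_id, g.order_id, g.event_type
    FROM   gdax_trades g
    WHERE  EXISTS (
       SELECT 1 FROM gdax_trades
       WHERE  order_type = 'limit'
       AND    order_id = g.order_id
       )
    ORDER  BY g.row_id
    """
).fetchall()

print(semi_join_rows)
```

Every row of order A comes back, including the rows whose own order_type is empty, while order B is filtered out entirely.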

This includes the "limit" orders themselves in the result. You can easily exclude them by adding a final predicate:

...
AND    order_type IS DISTINCT FROM 'limit'

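I use IS DISTINCT FROM because order_type appears to be nullable (it is unclear whether the blanks in the sample output are '' or NULL). A plain order_type <> 'limit' would also exclude the rows where order_type IS NULL.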
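The difference between the two comparisons can be checked on three sample rows. This is only a sketch under stated assumptions: SQLite stands in for PostgreSQL, the table t and its rows are invented, and SQLite's null-safe IS NOT operator is used in place of IS DISTINCT FROM, which older SQLite versions do not support.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; table t and its rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (row_id INTEGER, order_type TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "limit"), (2, None), (3, "market")],
)

# Plain inequality: NULL <> 'limit' evaluates to NULL, so row 2 is dropped.
plain = conn.execute(
    "SELECT row_id FROM t WHERE order_type <> 'limit' ORDER BY row_id"
).fetchall()

# Null-safe inequality (IS NOT here, IS DISTINCT FROM in PostgreSQL):
# NULL counts as different from 'limit', so row 2 is kept.
null_safe = conn.execute(
    "SELECT row_id FROM t WHERE order_type IS NOT 'limit' ORDER BY row_id"
).fetchall()

print(plain)
print(null_safe)
```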

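The query returns unique rows from the outer table, even if there are multiple "limit" orders with the same order_id. Various alternative query techniques with joins or subqueries would return duplicates in that case.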
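To make the duplicate issue concrete, the sketch below compares a self-join with the EXISTS form on an order that has two "limit" rows. The assumptions are the same kind as before and are not from the original post: SQLite stands in for PostgreSQL, the table is reduced to three columns, and the data is invented.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the rows are made up so that
# order "A" has two 'limit' rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gdax_trades (row_id INTEGER PRIMARY KEY, order_id TEXT, order_type TEXT)"
)
conn.executemany(
    "INSERT INTO gdax_trades VALUES (?, ?, ?)",
    [(1, "A", "limit"), (2, "A", "limit"), (3, "A", None)],
)

# Self-join: every outer row is repeated once per matching 'limit' row.
join_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT g.row_id
        FROM   gdax_trades g
        JOIN   gdax_trades l ON l.order_id = g.order_id
                            AND l.order_type = 'limit'
        ORDER  BY g.row_id
        """
    )
]

# EXISTS semi-join: every outer row appears at most once.
exists_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT g.row_id
        FROM   gdax_trades g
        WHERE  EXISTS (
           SELECT 1 FROM gdax_trades
           WHERE  order_type = 'limit'
           AND    order_id = g.order_id
           )
        ORDER  BY g.row_id
        """
    )
]

print(join_ids)
print(exists_ids)
```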

Answer 1: (score: 0)

I found an answer using a subquery. I am curious what people think of it.

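    # Subquery: every 'limit' row inside the time window.
    sub_query = (
        sess
        .query(GDAXTrade)
        .filter(GDAXTrade.time.between(start_dt, end_dt))
        .filter(GDAXTrade.order_type == 'limit')
        .subquery()
    )

    # Join each trade to the 'limit' rows that share its order_id.
    # Note: the equality filter below discards unmatched rows, so the
    # outer join (isouter=True) effectively behaves like an inner join.
    orders = (
        sess
        .query(GDAXTrade)
        .join(sub_query, GDAXTrade.order_id == sub_query.c.order_id, isouter=True)
        .filter(GDAXTrade.order_id == sub_query.c.order_id)
        .filter(GDAXTrade.time.between(start_dt, end_dt))
        .order_by(GDAXTrade.time.asc())
        .all()
    )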
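For comparison, an EXISTS semi-join can also be written with the SQLAlchemy ORM. The following is a minimal sketch and not the poster's code. It assumes SQLAlchemy 1.4 or later, uses an in-memory SQLite engine in place of PostgreSQL, and defines a cut-down, hypothetical GDAXTrade model with invented sample rows. The time-window filter from the code above is left out to keep the example short.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, aliased, declarative_base

Base = declarative_base()


class GDAXTrade(Base):
    # Hypothetical, reduced version of the real model (assumption).
    __tablename__ = "gdax_trades"
    row_id = Column(Integer, primary_key=True)
    order_id = Column(String)
    order_type = Column(String)
    event_type = Column(String)


# Assumption: in-memory SQLite stands in for the PostgreSQL database.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as sess:
    sess.add_all(
        [
            GDAXTrade(row_id=1, order_id="A", order_type="limit", event_type="received"),
            GDAXTrade(row_id=2, order_id="A", order_type=None, event_type="open"),
            GDAXTrade(row_id=3, order_id="A", order_type=None, event_type="done"),
            GDAXTrade(row_id=4, order_id="B", order_type="market", event_type="received"),
        ]
    )
    sess.commit()

    # Second reference to the same table, playing the role of the inner query.
    limit_row = aliased(GDAXTrade)

    # Correlated EXISTS: is there a 'limit' row with the same order_id?
    has_limit_row = (
        sess.query(limit_row.row_id)
        .filter(limit_row.order_type == "limit")
        .filter(limit_row.order_id == GDAXTrade.order_id)
        .exists()
    )

    matching_ids = [
        trade.row_id
        for trade in sess.query(GDAXTrade)
        .filter(has_limit_row)
        .order_by(GDAXTrade.row_id)
    ]

print(matching_ids)
```

Because the EXISTS condition only tests for a match, each trade appears at most once, even when several "limit" rows share its order_id.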