What is a more concise way to write this self-join SQL?

Asked: 2014-02-22 06:32:35

Tags: mysql sql self-join

I want to show only the last 5 transactions for each account type in the "bank" table. The structure of the bank table is:

CREATE TABLE bank(
bnk_id INT(11) AUTO_INCREMENT PRIMARY KEY NOT NULL,
......
bnk_acc_id INT(11) NOT NULL
)

The way I currently have it working requires creating one temporary table per account, like this:

CREATE TABLE B1 AS 
SELECT bnk_id FROM bank WHERE bnk_acc_id=1 ORDER BY bnk_date DESC LIMIT 5;

CREATE TABLE B2 AS 
SELECT bnk_id FROM bank WHERE bnk_acc_id=2 ORDER BY bnk_date DESC LIMIT 5;

Then I run the following query:

SELECT * 
  FROM bank
 WHERE bnk_id IN (SELECT * FROM B1)
    OR bnk_id IN (SELECT * FROM B2)

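By the way, there are 6 different account types (represented in the table as bnk_acc_id). I think there must be a more efficient way to write this SQL statement. Any suggestions?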

3 Answers:

Answer 0: (score: 3)

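This eliminates the extra temporary tables. In MySQL, each SELECT in a UNION needs its own parentheses when it carries its own ORDER BY and LIMIT, and the whole statement ends with a single semicolon.

(SELECT * FROM bank WHERE bnk_acc_id=1 ORDER BY bnk_date DESC LIMIT 5)
UNION ALL
(SELECT * FROM bank WHERE bnk_acc_id=2 ORDER BY bnk_date DESC LIMIT 5);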
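
With 6 account types, the UNION ALL statement would need 6 branches, so it may be easier to generate. Below is a minimal sketch of that idea using Python's built-in sqlite3 as a stand-in for MySQL. The helper name `last_n_union`, the sample rows, and the three-column table are assumptions made for illustration. SQLite does not accept ORDER BY / LIMIT on the individual branches of a compound SELECT, so each branch is wrapped in an aliased derived table, a form MySQL also accepts.

```python
import sqlite3


def last_n_union(conn, account_ids, n=5):
    """Return the n most recent rows per listed account, one UNION ALL branch each."""
    if not account_ids:
        return []
    # Each branch is a derived table carrying its own ORDER BY / LIMIT.
    branch = (
        "SELECT * FROM ("
        "SELECT bnk_id, bnk_acc_id, bnk_date FROM bank "
        "WHERE bnk_acc_id = ? ORDER BY bnk_date DESC LIMIT ?"
        ") AS t{idx}"
    )
    sql = " UNION ALL ".join(
        branch.format(idx=i) for i in range(len(account_ids))
    )
    # Parameters alternate: account id, row limit, account id, row limit, ...
    params = [value for acc in account_ids for value in (acc, n)]
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank ("
    "bnk_id INTEGER PRIMARY KEY, "
    "bnk_acc_id INTEGER NOT NULL, "
    "bnk_date TEXT NOT NULL)"
)
# Hypothetical sample data: 7 transactions for each of accounts 1 and 2.
sample = [(acc, f"2014-02-{day:02d}") for acc in (1, 2) for day in range(1, 8)]
conn.executemany("INSERT INTO bank (bnk_acc_id, bnk_date) VALUES (?, ?)", sample)

union_rows = last_n_union(conn, [1, 2])
for row in union_rows:
    print(row)
```

The account IDs still have to be listed by the caller, which is the main drawback of this approach compared with a query that partitions by bnk_acc_id.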

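Answer 1: (score: 0)

You need to partition the data into groups of rows, one group per account. You can then use this grouped data to retrieve the top n rows within each group. This means you don't have to specify the account ID in a separate SELECT for each account.

Here is an example (in SQL Server syntax) that uses a temporary table to hold some sample data: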

CREATE TABLE [#bank]
(
    bnk_id INT IDENTITY(1, 1) PRIMARY KEY NOT NULL,
    bnk_acc_id INT NOT NULL,
    bnk_date datetime NOT NULL
)

INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES (1, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(1, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(1, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(1, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(1, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(1, getutcdate())

INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(2, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(2, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(2, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(2, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(2, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(2, getutcdate())


INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(3, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(3, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(3, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(3, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(3, getutcdate())
INSERT INTO [#bank] (bnk_acc_id, bnk_date) VALUES(3, getutcdate())


GO
WITH [Grouped] AS (
SELECT bnk_id
     , bnk_acc_id
     , bnk_date
     , ROW_NUMBER()
     OVER (
       PARTITION BY [bnk_acc_id]
       ORDER BY [bnk_date] DESC
     ) [RowInGroup]
FROM [#bank]
)

SELECT * FROM [Grouped]
        WHERE [RowInGroup] <= 5


DROP TABLE [#bank]

The main part is:

WITH [Grouped] AS (
SELECT bnk_id
     , bnk_acc_id
     , bnk_date
     , ROW_NUMBER()
     OVER (
       PARTITION BY [bnk_acc_id]
       ORDER BY [bnk_date] DESC
     ) [RowInGroup]
FROM [#bank]
)

SELECT * FROM [Grouped]
        WHERE [RowInGroup] <= 5

This builds the grouped data and then filters it down to the first 5 rows of each group.

More information on partitioning from MSDN:

http://technet.microsoft.com/en-us/library/ms186734.aspx
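
One detail worth checking in the sample above: the rows are all inserted with getutcdate(), so many of them can share the same timestamp, and ROW_NUMBER() is free to number tied rows in any order. A minimal sketch of one way to make the result repeatable, assuming bnk_id grows with insertion order, is to add bnk_id as a second sort key. It is written with Python's built-in sqlite3 (window functions need SQLite 3.25 or newer) rather than SQL Server, and the sample data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank ("
    "bnk_id INTEGER PRIMARY KEY, "
    "bnk_acc_id INTEGER NOT NULL, "
    "bnk_date TEXT NOT NULL)"
)
# Hypothetical data: six rows per account, all sharing one timestamp.
conn.executemany(
    "INSERT INTO bank (bnk_acc_id, bnk_date) VALUES (?, ?)",
    [(acc, "2014-02-22 06:32:35") for acc in (1, 2, 3) for _ in range(6)],
)

TOP_N_SQL = """
WITH grouped AS (
    SELECT bnk_id,
           bnk_acc_id,
           bnk_date,
           ROW_NUMBER() OVER (
               PARTITION BY bnk_acc_id
               ORDER BY bnk_date DESC, bnk_id DESC  -- bnk_id breaks date ties
           ) AS row_in_group
    FROM bank
)
SELECT bnk_id, bnk_acc_id, row_in_group
FROM grouped
WHERE row_in_group <= 5
ORDER BY bnk_acc_id, row_in_group
"""

top_rows = conn.execute(TOP_N_SQL).fetchall()
for row in top_rows:
    print(row)
```

With the tie-breaker in place, the oldest row of each account (the lowest bnk_id) is always the one left out.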

Answer 2: (score: 0)

Depending on your database, you may have access to window functions. For example, in PostgreSQL you could write this query as:

SELECT *
FROM (
    SELECT *, row_number() OVER (PARTITION BY bnk_acc_id 
                                 ORDER BY bnk_date DESC) rn
    FROM bank
) AS b
WHERE rn <= 5
ORDER BY bnk_acc_id, bnk_date DESC

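This works by assigning a row number to every row, with the numbering restarted for each account (the partition) and ordered by date, newest first, and then keeping only the rows numbered 1 to 5 within each account.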
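
The two steps can be looked at separately. Here is a small sketch, run through Python's built-in sqlite3 instead of PostgreSQL (window functions need SQLite 3.25 or newer), with hypothetical sample rows and a hypothetical helper `top_n`. It first shows the row numbers on their own, then applies the filter with n = 2 to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank ("
    "bnk_id INTEGER PRIMARY KEY, bnk_acc_id INTEGER, bnk_date TEXT)"
)
# Hypothetical data: four transactions on account 1, two on account 2.
conn.executemany(
    "INSERT INTO bank (bnk_acc_id, bnk_date) VALUES (?, ?)",
    [
        (1, "2014-02-01"), (2, "2014-02-02"), (1, "2014-02-03"),
        (2, "2014-02-04"), (1, "2014-02-05"), (1, "2014-02-06"),
    ],
)

NUMBERED = """
SELECT bnk_id, bnk_acc_id, bnk_date,
       row_number() OVER (PARTITION BY bnk_acc_id
                          ORDER BY bnk_date DESC) AS rn
FROM bank
"""

# Step 1: every row gets a position within its own account, newest first.
numbered = conn.execute(NUMBERED + " ORDER BY bnk_acc_id, rn").fetchall()
for row in numbered:
    print(row)


def top_n(conn, n):
    """Step 2: keep only the rows whose position is between 1 and n."""
    sql = (
        f"SELECT bnk_id, bnk_acc_id, bnk_date FROM ({NUMBERED}) AS b "
        "WHERE rn <= ? ORDER BY bnk_acc_id, bnk_date DESC"
    )
    return conn.execute(sql, (n,)).fetchall()


for row in top_n(conn, 2):
    print(row)
```

The filter has to sit in an outer query because a window function's result cannot be referenced in the WHERE clause of the same SELECT that computes it.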

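From your SQL example, it looks like you are using MySQL, which unfortunately has no window functions in versions before 8.0. In that case, Gideon Wise's answer may be your best option. MySQL does also have user-defined variables, which can be used to emulate window functions, as described in the blog post "Analytic functions: FIRST_VALUE, LAST_VALUE, LEAD, LAG". However, that may be too inefficient for your needs, since you could end up doing a full table scan.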
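
For completeness, another way to get the top 5 per account without window functions is a correlated subquery that counts, for each row, how many rows of the same account are newer. This is a different technique from the user-defined-variable approach in the blog post, and it is only a sketch: the sample data is hypothetical, it is run through Python's built-in sqlite3 rather than MySQL, and on a large table it may be slow unless something like an index on (bnk_acc_id, bnk_date) exists, which is an assumption about your schema.

```python
import sqlite3

# A row is kept when fewer than n rows of the same account are newer.
# bnk_id is used as a tie-breaker for rows that share the same date.
TOP_N_NO_WINDOW = """
SELECT b.bnk_id, b.bnk_acc_id, b.bnk_date
FROM bank AS b
WHERE (
    SELECT COUNT(*)
    FROM bank AS newer
    WHERE newer.bnk_acc_id = b.bnk_acc_id
      AND (newer.bnk_date > b.bnk_date
           OR (newer.bnk_date = b.bnk_date AND newer.bnk_id > b.bnk_id))
) < ?
ORDER BY b.bnk_acc_id, b.bnk_date DESC, b.bnk_id DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank ("
    "bnk_id INTEGER PRIMARY KEY, "
    "bnk_acc_id INTEGER NOT NULL, "
    "bnk_date TEXT NOT NULL)"
)
# Hypothetical data: 6 account types with 7 transactions each.
conn.executemany(
    "INSERT INTO bank (bnk_acc_id, bnk_date) VALUES (?, ?)",
    [(acc, f"2014-02-{day:02d}") for acc in range(1, 7) for day in range(1, 8)],
)

latest = conn.execute(TOP_N_NO_WINDOW, (5,)).fetchall()
for row in latest:
    print(row)
```

Like the window-function version, this covers all account types in one statement without listing the account IDs.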