How do I join the most recent row in a table?

Time: 2008-09-30 18:11:36

Tags: sql aggregate pivot

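I frequently run into problems of this form and have not found a good solution yet:

Assume we have two database tables representing an e-commerce system.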

userData (userId, name, ...)
orderData (orderId, userId, orderType, createDate, ...)

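For every user in the system, select their user information, their most recent order of type '1', and their most recent order of type '2'. I want to do this in a single query. Here is an example result: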

(userId, name, ..., orderId1, orderType1, createDate1, ..., orderId2, orderType2, createDate2, ...)
(101, 'Bob', ..., 472, '1', '4/25/2008', ..., 382, '2', '3/2/2008', ...)

11 answers:

Answer 0: (score: 4)

This should work; you will have to adjust the table and column names:

select ud.name,
       order1.order_id,
       order1.order_type,
       order1.create_date,
       order2.order_id,
       order2.order_type,
       order2.create_date
  from user_data ud,
       order_data order1,
       order_data order2
 where ud.user_id = order1.user_id
   and ud.user_id = order2.user_id
   and order1.order_id = (select max(order_id)
                            from order_data od1
                           where od1.user_id = ud.user_id
                             and od1.order_type = 'Type1')
   and order2.order_id = (select max(order_id)
                             from order_data od2
                            where od2.user_id = ud.user_id
                              and od2.order_type = 'Type2')

Denormalizing your data might also be a good idea, because this kind of query is fairly expensive. For example, you could add a last_order_date column to userData.
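As a rough illustration of the denormalization idea, here is a minimal sketch that keeps a last_order_date column up to date with a trigger. It assumes SQLite driven from Python's sqlite3 module, which is not the DBMS from the question; the trigger name trg_last_order and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userData (
    userId INTEGER PRIMARY KEY,
    name TEXT,
    last_order_date TEXT            -- denormalized column (assumed name)
);
CREATE TABLE orderData (
    orderId INTEGER PRIMARY KEY,
    userId INTEGER,
    orderType TEXT,
    createDate TEXT                 -- ISO dates so string comparison sorts correctly
);
-- Hypothetical trigger: refresh last_order_date whenever a newer order arrives.
CREATE TRIGGER trg_last_order AFTER INSERT ON orderData
BEGIN
    UPDATE userData
       SET last_order_date = NEW.createDate
     WHERE userId = NEW.userId
       AND (last_order_date IS NULL OR last_order_date < NEW.createDate);
END;
INSERT INTO userData (userId, name) VALUES (101, 'Bob');
INSERT INTO orderData VALUES (382, 101, '2', '2008-03-02');
INSERT INTO orderData VALUES (472, 101, '1', '2008-04-25');
INSERT INTO orderData VALUES (473, 101, '1', '2008-01-15');  -- older, must not win
""")

# Reading the latest order date no longer needs a join or subquery.
denormalized = conn.execute(
    "SELECT userId, name, last_order_date FROM userData"
).fetchall()
print(denormalized)
```

The trade-off is extra work on every insert in exchange for a cheap read; updates and deletes of orders would need similar handling, which this sketch leaves out.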

Answer 1: (score: 3)

I have provided three different approaches for solving this problem:

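  1. Using PIVOT
  2. Using case statements
  3. Using inline queries in the where clause

All solutions assume we determine the "most recent" order from the orderId column. Using the createDate column would add complexity because of timestamp collisions, and would seriously hurt performance because createDate is probably not part of an indexed key. I have only tested these queries on MS SQL Server 2005, so I do not know whether they will run on your server.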

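Solutions (1) and (2) perform almost identically. In fact, they both result in the same number of reads from the database.

Solution (3) is not the preferred approach when working with large data sets. It consistently makes hundreds more logical reads than (1) and (2). When filtering for one specific user, approach (3) is comparable to the other methods. In the single-user case, a drop in CPU time helps offset the significantly higher number of reads; however, as the disk drive gets busier and cache misses occur, this slight advantage disappears.

Conclusion

For the scenario presented, use the pivot approach if your DBMS supports it. It requires less code than the case statement and makes it simpler to add order types in the future.

Please note that in some cases PIVOT is not flexible enough, and characteristic value functions using case statements are the way to go.

Code

Approach (1) using PIVOT: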

    select 
        ud.userId, ud.fullname, 
        od1.orderId as orderId1, od1.createDate as createDate1, od1.orderType as orderType1,
        od2.orderId as orderId2, od2.createDate as createDate2, od2.orderType as orderType2
    
    from userData ud
        inner join (
                select userId, [1] as typeOne, [2] as typeTwo
                from (select
                    userId, orderType, orderId
                from orderData) as orders
                PIVOT
                (
                    max(orderId)
                    FOR orderType in ([1], [2])
                ) as LatestOrders) as LatestOrders on
            LatestOrders.userId = ud.userId 
        inner join orderData od1 on
            od1.orderId = LatestOrders.typeOne
        inner join orderData od2 on
            od2.orderId = LatestOrders.typeTwo
    

Approach (2) using case statements:

    select 
        ud.userId, ud.fullname, 
        od1.orderId as orderId1, od1.createDate as createDate1, od1.orderType as orderType1,
        od2.orderId as orderId2, od2.createDate as createDate2, od2.orderType as orderType2
    
    from userData ud 
        -- assuming not all users will have orders use outer join
        inner join (
                select 
                    od.userId,
                    -- can be null if no orders for type
                    max (case when orderType = 1 
                            then ORDERID
                            else null
                            end) as maxTypeOneOrderId,
    
                    -- can be null if no orders for type
                    max (case when orderType = 2
                            then ORDERID 
                            else null
                            end) as maxTypeTwoOrderId
                from orderData od
                group by userId) as maxOrderKeys on
            maxOrderKeys.userId = ud.userId
        inner join orderData od1 on
            od1.ORDERID = maxOrderKeys.maxTypeOneOrderId
        inner join orderData od2 on
            od2.ORDERID = maxOrderKeys.maxTypeTwoOrderId
    

Approach (3) using inline queries in the where clause (based on Steve K.'s response):

    select  ud.userId,ud.fullname, 
            order1.orderId, order1.orderType, order1.createDate, 
            order2.orderId, order2.orderType, order2.createDate
      from userData ud,
           orderData order1,
           orderData order2
     where ud.userId = order1.userId
       and ud.userId = order2.userId
       and order1.orderId = (select max(orderId)
                                from orderData od1
                               where od1.userId = ud.userId
                                 and od1.orderType = 1)
       and order2.orderId = (select max(orderId)
                                 from orderData od2
                                where od2.userId = ud.userId
                                  and od2.orderType = 2)
    

Script for generating the tables and 1000 users with 100 orders each:

    CREATE TABLE [dbo].[orderData](
        [orderId] [int] IDENTITY(1,1) NOT NULL,
        [createDate] [datetime] NOT NULL,
        [orderType] [tinyint] NOT NULL, 
        [userId] [int] NOT NULL
    ) 
    
    CREATE TABLE [dbo].[userData](
        [userId] [int] IDENTITY(1,1) NOT NULL,
        [fullname] [nvarchar](50) NOT NULL
    ) 
    
    -- Create 1000 users with 100 orders each
    declare @userId int
    declare @usersAdded int
    set @usersAdded = 0
    
    while @usersAdded < 1000
    begin
        insert into userData (fullname) values ('Mario' + ltrim(str(@usersAdded)))
        set @userId = @@identity
    
        declare @orderSetsAdded int
        set @orderSetsAdded = 0
        while @orderSetsAdded < 10
        begin
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-06-08', 1)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-02-08', 1)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-08-08', 1)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-09-08', 1)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-01-08', 1)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-06-06', 2)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-02-02', 2)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-08-09', 2)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-09-01', 2)
            insert into orderData (userId, createDate, orderType) 
                values ( @userId, '01-01-04', 2)
    
            set @orderSetsAdded = @orderSetsAdded + 1
        end
        set @usersAdded = @usersAdded + 1
    end
    

A small snippet for testing query performance on MS SQL Server, in addition to SQL Profiler:

    -- Uncomment these to clear some caches
    --DBCC DROPCLEANBUFFERS
    --DBCC FREEPROCCACHE
    
    set statistics io on
    set statistics time on
    
    -- INSERT TEST QUERY HERE
    
    set statistics time off
    set statistics io off
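For readers without SQL Server at hand, the case-statement approach (2) can be tried on a tiny data set. The sketch below assumes SQLite through Python's sqlite3 module rather than SQL Server 2005, uses a handful of made-up rows instead of the full generator script, and switches to left joins so that a user with no order of a given type still appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userData (userId INTEGER PRIMARY KEY, fullname TEXT);
CREATE TABLE orderData (orderId INTEGER PRIMARY KEY, userId INTEGER,
                        orderType INTEGER, createDate TEXT);
INSERT INTO userData VALUES (1, 'Mario0'), (2, 'Mario1');
INSERT INTO orderData VALUES
    (10, 1, 1, '2008-01-06'), (11, 1, 1, '2008-01-09'),
    (12, 1, 2, '2006-01-06'), (13, 1, 2, '2009-01-08'),
    (14, 2, 1, '2008-01-02');   -- user 2 has no type-2 order
""")

# One pass over orderData computes the highest orderId per type for each user;
# the outer joins then fetch the full order rows for those keys.
case_rows = conn.execute("""
SELECT ud.userId, ud.fullname, od1.orderId, od2.orderId
FROM userData ud
LEFT JOIN (SELECT userId,
                  MAX(CASE WHEN orderType = 1 THEN orderId END) AS maxTypeOneOrderId,
                  MAX(CASE WHEN orderType = 2 THEN orderId END) AS maxTypeTwoOrderId
           FROM orderData
           GROUP BY userId) AS k ON k.userId = ud.userId
LEFT JOIN orderData od1 ON od1.orderId = k.maxTypeOneOrderId
LEFT JOIN orderData od2 ON od2.orderId = k.maxTypeTwoOrderId
ORDER BY ud.userId
""").fetchall()
print(case_rows)
```

Like the answer, this treats the highest orderId as the most recent order; it says nothing about how the approaches compare in logical reads, which depends on the DBMS and indexes.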
    

Answer 2: (score: 1)

Sorry, I don't have Oracle in front of me, but this is the basic structure of what I would do in Oracle:

SELECT b.userid, b.orderid, b.orderType, b.createDate, <etc>,
       a.name
FROM orderData b, userData a
WHERE a.userid = b.userid
AND (b.userid, b.orderType, b.createDate) IN (
  SELECT userid, orderType, max(createDate) 
  FROM orderData 
  WHERE orderType IN (1,2)
  GROUP BY userid, orderType) 

Answer 3: (score: 1)

A T-SQL sample solution (MS SQL):

SELECT
    u.*
    , o1.*
    , o2.* 
FROM
(
    SELECT
        userData.*
        , (SELECT TOP 1 orderId FROM orderData WHERE orderData.userId=userData.userId AND orderType=1 ORDER BY createDate DESC)
            AS order1Id
        , (SELECT TOP 1 orderId FROM orderData WHERE orderData.userId=userData.userId AND orderType=2 ORDER BY createDate DESC)
            AS order2Id
    FROM userData
) AS u
LEFT JOIN orderData o1 ON (u.order1Id=o1.orderId)
LEFT JOIN orderData o2 ON (u.order2Id=o2.orderId)

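In SQL Server 2005 you can also use the RANK() OVER function. (AFAIK that is an MSSQL-specific feature, but check your own DBMS: window functions are part of the SQL:2003 standard, and Oracle, for example, supports them too.)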
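To show what the window-function route might look like, here is a minimal sketch. It assumes SQLite 3.25 or later (the first version with window functions) driven from Python's sqlite3 module, not SQL Server, and it swaps in ROW_NUMBER() for RANK() so that two orders with the same createDate cannot both be ranked first. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userData (userId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orderData (orderId INTEGER PRIMARY KEY, userId INTEGER,
                        orderType INTEGER, createDate TEXT);
INSERT INTO userData VALUES (101, 'Bob');
INSERT INTO orderData VALUES
    (400, 101, 1, '2008-02-01'),
    (472, 101, 1, '2008-04-25'),
    (382, 101, 2, '2008-03-02');
""")

# Number each user's orders within a type, newest first; rn = 1 is the latest.
# orderId is a tie-breaker for orders that share a createDate.
ranked_rows = conn.execute("""
WITH ranked AS (
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY userId, orderType
                              ORDER BY createDate DESC, orderId DESC) AS rn
    FROM orderData o
)
SELECT u.userId, u.name,
       r1.orderId, r1.createDate,
       r2.orderId, r2.createDate
FROM userData u
LEFT JOIN ranked r1 ON r1.userId = u.userId AND r1.orderType = 1 AND r1.rn = 1
LEFT JOIN ranked r2 ON r2.userId = u.userId AND r2.orderType = 2 AND r2.rn = 1
ORDER BY u.userId
""").fetchall()
print(ranked_rows)
```

With RANK() instead of ROW_NUMBER(), tied orders would share rank 1 and the joins would return more than one row per user.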

Answer 4: (score: 0)

By "latest", do you mean everything new on the current day? You can always check createDate and fetch all user and order data where createDate >= the current date.

SELECT * FROM
"orderData", "userData"
WHERE
"userData"."userId"  ="orderData"."userId"
AND "orderData".createDate >= current_date;

Updated

Here is what you want, following your comment:

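SELECT * FROM
"orderData", "userData"
WHERE
"userData"."userId"  ="orderData"."userId"
AND "orderData"."orderType" = '1'
AND "orderData"."orderId" = (
SELECT od."orderId" FROM "orderData" AS od
WHERE 
od."orderType" = '1'
AND od."userId" = "userData"."userId" -- correlate so each user gets their own latest order
ORDER BY od."orderId" DESC
LIMIT 1
)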

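Answer 5: (score: 0)

You might be able to do a union query for this. The exact syntax needs some work, especially the group by section, but a union should be able to do it.

For example: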

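SELECT o.userId, o.orderId, o.orderType, o.createDate
FROM orderData o
WHERE o.orderType = 1
  -- an aggregate cannot appear directly in WHERE, so compare against a subquery
  AND o.createDate = (SELECT MAX(createDate) FROM orderData
                      WHERE userId = o.userId AND orderType = 1)

UNION

SELECT o.userId, o.orderId, o.orderType, o.createDate
FROM orderData o
WHERE o.orderType = 2
  AND o.createDate = (SELECT MAX(createDate) FROM orderData
                      WHERE userId = o.userId AND orderType = 2)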

Answer 6: (score: 0)

I use something like this in MySQL:

SELECT
   u.*,
   SUBSTRING_INDEX( MAX( CONCAT( o1.createDate, '##', o1.otherfield)), '##', -1) as o1_orderfield,
   SUBSTRING_INDEX( MAX( CONCAT( o2.createDate, '##', o2.otherfield)), '##', -1) as o2_orderfield
FROM
   userData as u
   LEFT JOIN orderData AS o1 ON (o1.userId=u.userId AND o1.orderType=1)
   LEFT JOIN orderData AS o2 ON (o2.userId=u.userId AND o2.orderType=2)
GROUP BY u.userId

In short, use MAX() to get the newest row by prepending the criteria field (createDate) to the field you are interested in (otherfield). SUBSTRING_INDEX() then strips off the date.
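The packing trick can be checked on a small example. This sketch assumes SQLite via Python's sqlite3 module instead of MySQL, so CONCAT becomes the || operator and the SUBSTRING_INDEX step is done on the client by a hypothetical unpack helper; orderId stands in for otherfield, and the rows are made up. It only works if createDate is stored in a format that sorts correctly as text, such as ISO dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userData (userId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orderData (orderId INTEGER PRIMARY KEY, userId INTEGER,
                        orderType INTEGER, createDate TEXT);
INSERT INTO userData VALUES (101, 'Bob');
INSERT INTO orderData VALUES
    (400, 101, 1, '2008-02-01'),
    (472, 101, 1, '2008-04-25'),
    (382, 101, 2, '2008-03-02');
""")

# MAX() over 'date##value' strings picks the string with the latest date,
# carrying the wanted value along with it.
packed_rows = conn.execute("""
SELECT u.userId,
       MAX(o1.createDate || '##' || o1.orderId) AS packed1,
       MAX(o2.createDate || '##' || o2.orderId) AS packed2
FROM userData AS u
LEFT JOIN orderData AS o1 ON (o1.userId = u.userId AND o1.orderType = 1)
LEFT JOIN orderData AS o2 ON (o2.userId = u.userId AND o2.orderType = 2)
GROUP BY u.userId
""").fetchall()


def unpack(packed):
    """Drop the date prefix, as SUBSTRING_INDEX(..., '##', -1) would."""
    return None if packed is None else packed.rsplit("##", 1)[-1]


latest_by_concat = [(uid, unpack(p1), unpack(p2)) for uid, p1, p2 in packed_rows]
print(latest_by_concat)
```

Note that the unpacked values come back as strings, and that the separator must never occur inside the packed field.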

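OTOH, if you need an arbitrary number of order types (if orderType can be any number rather than a limited ENUM), it is better to handle it with a separate query, something like this:

select * from orderData o where o.userId=XXX
  and o.createDate = (select max(createDate) from orderData where userId=o.userId and orderType=o.orderType)

for each user.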

Answer 7: (score: 0)

Assuming orderId increases monotonically over time:

SELECT *
FROM userData u
INNER JOIN orderData o
  ON o.userId = u.userId
INNER JOIN ( -- This subquery gives the last order of each type for each customer
  SELECT MAX(o2.orderId) AS orderId
    --, o2.userId -- optional - include if joining for a particular customer
    --, o2.orderType -- optional - include if joining for a particular type
  FROM orderData o2
  GROUP BY o2.userId
    ,o2.orderType
) AS LastOrders
  ON LastOrders.orderId = o.orderId -- expand join to include customer or type if desired

Then pivot on the client or, if you are using SQL Server, use its PIVOT functionality.
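Pivoting on the client could look roughly like this. The sketch assumes Python with the stdlib sqlite3 module as the client and database, which the answer does not specify; the output key names such as orderId1 are made up to mirror the column layout asked for in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userData (userId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orderData (orderId INTEGER PRIMARY KEY, userId INTEGER,
                        orderType INTEGER, createDate TEXT);
INSERT INTO userData VALUES (101, 'Bob');
INSERT INTO orderData VALUES
    (400, 101, 1, '2008-02-01'),
    (472, 101, 1, '2008-04-25'),
    (382, 101, 2, '2008-03-02');
""")

# One row per (user, order type): the order with the highest orderId.
last_orders = conn.execute("""
SELECT u.userId, u.name, o.orderId, o.orderType, o.createDate
FROM userData u
INNER JOIN orderData o ON o.userId = u.userId
INNER JOIN (SELECT MAX(orderId) AS orderId
            FROM orderData
            GROUP BY userId, orderType) AS LastOrders
  ON LastOrders.orderId = o.orderId
ORDER BY u.userId, o.orderType
""").fetchall()

# Client-side pivot: fold the per-type rows into one record per user.
pivoted = {}
for user_id, name, order_id, order_type, create_date in last_orders:
    record = pivoted.setdefault(user_id, {"name": name})
    record[f"orderId{order_type}"] = order_id
    record[f"createDate{order_type}"] = create_date

print(pivoted)
```

Because the pivot happens in application code, it copes with any number of order types without changing the query.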

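Answer 8: (score: 0)

Here is one way to move the type 1 and type 2 data onto the same row:
(by placing the type 1 and type 2 information into their own selects, which are then used in the from clause.)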

SELECT
  a.name, ud1.*, ud2.*
FROM
    userData a,
    (SELECT userid, orderid, orderType, createDate, <etc>
    FROM orderData
    WHERE (userid, orderType, createDate) IN (
      SELECT userid, orderType, max(createDate) 
      FROM orderData 
      WHERE orderType = 1
      GROUP BY userid, orderType)) ud1,
    (SELECT userid, orderid, orderType, createDate, <etc>
    FROM orderData 
    WHERE (userid, orderType, createDate) IN (
      SELECT userid, orderType, max(createDate) 
      FROM orderData 
      WHERE orderType = 2
      GROUP BY userid, orderType)) ud2
WHERE a.userid = ud1.userid
  AND a.userid = ud2.userid

Answer 9: (score: 0)

Here is how I would do it. This is standard SQL and should work in any brand of database.

SELECT u.userId, u.name, o1.orderId, o1.orderType, o1.createDate,
  o2.orderId, o2.orderType, o2.createDate
FROM userData AS u
  LEFT OUTER JOIN (
    SELECT o1a.orderId, o1a.userId, o1a.orderType, o1a.createDate
    FROM orderData AS o1a 
      LEFT OUTER JOIN orderData AS o1b ON (o1a.userId = o1b.userId 
        AND o1a.orderType = o1b.orderType AND o1a.createDate < o1b.createDate)
    WHERE o1a.orderType = 1 AND o1b.orderId IS NULL) AS o1 ON (u.userId = o1.userId)
  LEFT OUTER JOIN (
    SELECT o2a.orderId, o2a.userId, o2a.orderType, o2a.createDate
    FROM orderData AS o2a 
      LEFT OUTER JOIN orderData AS o2b ON (o2a.userId = o2b.userId 
        AND o2a.orderType = o2b.orderType AND o2a.createDate < o2b.createDate)
    WHERE o2a.orderType = 2 AND o2b.orderId IS NULL) o2 ON (u.userId = o2.userId);

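Note that if you have more than one order of either type whose date equals the latest date, you will get multiple rows in the result set. If you have multiple such orders of both types, you will get N x M rows in the result set. So I would recommend fetching the rows of each type in separate queries.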
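The N x M effect is easy to reproduce. In this sketch, which assumes SQLite through Python's sqlite3 module and invented rows, user 101 has two type-1 orders and two type-2 orders that share their latest createDate, so the query above (trimmed to the id columns) returns 2 x 2 = 4 rows for that one user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userData (userId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orderData (orderId INTEGER PRIMARY KEY, userId INTEGER,
                        orderType INTEGER, createDate TEXT);
INSERT INTO userData VALUES (101, 'Bob');
INSERT INTO orderData VALUES
    (1, 101, 1, '2008-04-25'), (2, 101, 1, '2008-04-25'),  -- tie on latest type 1
    (3, 101, 1, '2008-01-01'),
    (4, 101, 2, '2008-03-02'), (5, 101, 2, '2008-03-02');  -- tie on latest type 2
""")

# Anti-join: keep an order only if no later order of the same user and type exists.
tie_rows = conn.execute("""
SELECT u.userId, o1.orderId, o2.orderId
FROM userData AS u
  LEFT OUTER JOIN (
    SELECT o1a.orderId, o1a.userId
    FROM orderData AS o1a
      LEFT OUTER JOIN orderData AS o1b ON (o1a.userId = o1b.userId
        AND o1a.orderType = o1b.orderType AND o1a.createDate < o1b.createDate)
    WHERE o1a.orderType = 1 AND o1b.orderId IS NULL) AS o1 ON (u.userId = o1.userId)
  LEFT OUTER JOIN (
    SELECT o2a.orderId, o2a.userId
    FROM orderData AS o2a
      LEFT OUTER JOIN orderData AS o2b ON (o2a.userId = o2b.userId
        AND o2a.orderType = o2b.orderType AND o2a.createDate < o2b.createDate)
    WHERE o2a.orderType = 2 AND o2b.orderId IS NULL) AS o2 ON (u.userId = o2.userId)
ORDER BY o1.orderId, o2.orderId
""").fetchall()
print(len(tie_rows), tie_rows)
```

One possible way around this, if orderId is unique, is to extend the anti-join condition so that orderId breaks ties between equal dates.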

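Answer 10: (score: 0)

Steve K is absolutely right, thanks! I did rewrite his answer a little to account for the fact that there might be no order of a particular type (which I failed to mention, so I can't fault Steve K.)

Here is what I wound up using: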

select ud.name,
       order1.orderId,
       order1.orderType,
       order1.createDate,
       order2.orderId,
       order2.orderType,
       order2.createDate
  from userData ud
  left join orderData order1
   on order1.orderId = (select max(orderId)
                            from orderData od1
                           where od1.userId = ud.userId
                             and od1.orderType = '1')
  left join orderData order2
   on order2.orderId = (select max(orderId)
                            from orderData od2
                           where od2.userId = ud.userId
                             and od2.orderType = '2')
 where ...[some limiting factors on the selection of users]...;
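To see the effect of the left joins, here is a small sketch that runs the same query shape, without the elided where clause, against two users, one of whom has no type '2' order. It assumes SQLite via Python's sqlite3 module; the second user, Alice, and all order rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userData (userId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orderData (orderId INTEGER PRIMARY KEY, userId INTEGER,
                        orderType TEXT, createDate TEXT);
INSERT INTO userData VALUES (101, 'Bob'), (102, 'Alice');
INSERT INTO orderData VALUES
    (382, 101, '2', '2008-03-02'),
    (472, 101, '1', '2008-04-25'),
    (500, 102, '1', '2008-05-01');   -- Alice has no type '2' order
""")

# The correlated subquery in each ON clause picks the user's highest orderId
# of that type; when there is none, the left join yields NULLs for that side.
left_join_rows = conn.execute("""
select ud.name, order1.orderId, order2.orderId
  from userData ud
  left join orderData order1
   on order1.orderId = (select max(orderId)
                          from orderData od1
                         where od1.userId = ud.userId
                           and od1.orderType = '1')
  left join orderData order2
   on order2.orderId = (select max(orderId)
                          from orderData od2
                         where od2.userId = ud.userId
                           and od2.orderType = '2')
 order by ud.userId
""").fetchall()
print(left_join_rows)
```

With the inner-join form from Steve K.'s answer, Alice would be dropped from the result entirely.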