SQL - How to query the latest value of a column within a larger result set

Time: 2019-06-27 18:41:57

Tags: mysql sql

Given this schema, which holds the source data:

CREATE TABLE `sourceData` (
  `eventDate` datetime NOT NULL,
  `eventType` varchar(255) DEFAULT NULL,
  `eventId` varchar(255) NOT NULL,
  `eventDescription` varchar(255) NOT NULL,
  `visitorId` varchar(255) NOT NULL,
  `accountId` varchar(255) NOT NULL,
  `eventCount` int(11) DEFAULT NULL,
  `minutesOnPage` int(11) DEFAULT NULL,
  `urlParameter` varchar(255) NOT NULL,
  `visitorIpAddress` varchar(255) NOT NULL,
  `eventDomain` varchar(255) NOT NULL,
  `vistorUserAgent` text,
  PRIMARY KEY (`eventDate`,`eventId`,`visitorId`,`accountId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

From this data I want to create another table that holds the most recent eventDescription for each eventId. I wrote the following query to get that result:

SELECT max(eventDate) as maxEventDate, eventDescription, eventId 
FROM sourceData 
GROUP BY eventId, eventDescription 
ORDER by eventId;

I expected exactly one record per eventId. Instead, I get multiple records for each eventId. Any suggestions on how to adjust the query to get the desired result?

+--------------+----------------------------------------------------------------------+-----------------------------+
| maxEventDate | eventDescription                                                     | eventId                     |
+--------------+----------------------------------------------------------------------+-----------------------------+
| 2019-06-25   | Settings - User Settings - New Password                              | _2SkneHp0dkIlHef52uPRGKaf34 |
| 2019-06-10   | Settings - User Settings_New Password                                | _2SkneHp0dkIlHef52uPRGKaf34 |
| 2019-06-21   | User Settings - New Password                                         | _2SkneHp0dkIlHef52uPRGKaf34 |
| 2019-06-04   | Offer Tab_Makegood Missed Spots - Preempt(s)_Show All Buyer Demos    | _3YDY4OVlw-L2OVSZruGcwATEcI |
| 2019-06-27   | Campaign Performance Details - Spot Detail - Back to Top             | _4_61DOJgg2J6y0wGleGeu30J4w |
| 2019-06-21   | Spot Detail - Back to Top                                            | _4_61DOJgg2J6y0wGleGeu30J4w |
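
The extra rows follow from the GROUP BY itself: grouping on both eventId and eventDescription produces one group per distinct pair, so an eventId that has carried three different descriptions yields three rows. Below is a minimal sketch of that effect. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, a cut-down three-column version of the table, and made-up eventId values (`e1`, `e2`).

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sourceData (eventDate TEXT, eventId TEXT, eventDescription TEXT)"
)
# Hypothetical sample rows: e1 has three different descriptions over time.
conn.executemany(
    "INSERT INTO sourceData VALUES (?, ?, ?)",
    [
        ("2019-06-10", "e1", "Settings - User Settings_New Password"),
        ("2019-06-21", "e1", "User Settings - New Password"),
        ("2019-06-25", "e1", "Settings - User Settings - New Password"),
        ("2019-06-04", "e2", "Offer Tab"),
    ],
)

# Same shape as the query in the question: one group per (eventId, eventDescription).
grouped = conn.execute(
    "SELECT MAX(eventDate) AS maxEventDate, eventDescription, eventId "
    "FROM sourceData "
    "GROUP BY eventId, eventDescription "
    "ORDER BY eventId"
).fetchall()

for row in grouped:
    print(row)
```

Dropping eventDescription from the GROUP BY gives one row per eventId, but then the description has to be looked up separately for the row that holds the maximum date.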

3 answers:

Answer 0: (score: 0)

If your DBMS supports row_number(), you can try the following:

select *
from (
  SELECT sd.*,
         row_number() over (partition by eventId order by eventDate desc) as rn
  FROM sourceData sd
) a
where a.rn = 1
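
The utf8mb4_0900_ai_ci collation in the question's schema was introduced in MySQL 8.0, which is also the version that added window functions, so row_number() should be available there. Below is a rough, runnable sketch of the same ranking idea. It assumes Python's built-in sqlite3 module (linked against SQLite 3.25 or later for window-function support) as a stand-in for MySQL, with a cut-down table and made-up eventId values.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sourceData (eventDate TEXT, eventId TEXT, eventDescription TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO sourceData VALUES (?, ?, ?)",
    [
        ("2019-06-10", "e1", "Settings - User Settings_New Password"),
        ("2019-06-21", "e1", "User Settings - New Password"),
        ("2019-06-25", "e1", "Settings - User Settings - New Password"),
        ("2019-06-04", "e2", "Offer Tab"),
    ],
)

# Number the rows of each eventId from newest to oldest, then keep rank 1.
latest = conn.execute(
    """
    SELECT eventId, eventDescription, eventDate
    FROM (
        SELECT sd.*,
               ROW_NUMBER() OVER (
                   PARTITION BY eventId ORDER BY eventDate DESC
               ) AS rn
        FROM sourceData sd
    ) AS a
    WHERE a.rn = 1
    ORDER BY eventId
    """
).fetchall()

for row in latest:
    print(row)
```

Because row_number() assigns distinct ranks even when two rows share the same eventDate, this form returns exactly one row per eventId; which of the tied rows wins is not defined unless the ORDER BY gets a tie-breaker.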

Answer 1: (score: 0)

You can join against a subquery that finds the maximum event date for each eventId:

select * 
from sourceData s 
inner join  (
  SELECT max(eventDate) as maxEventDate,eventId 
  FROM sourceData 
  GROUP BY eventId 
  ) t on t.maxEventDate = s.eventDate and t.eventId = s.eventId

If you need distinct results, add a DISTINCT clause.
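
The DISTINCT matters when several rows share the latest eventDate for one eventId (for example, two visitors triggering the same event at the same moment): the join then returns each of them. Here is a small sketch of that case, assuming Python's built-in sqlite3 module as a stand-in for MySQL, a cut-down table, and made-up eventId and visitorId values.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sourceData "
    "(eventDate TEXT, eventId TEXT, eventDescription TEXT, visitorId TEXT)"
)
# Hypothetical rows: two visitors share the latest date for e1.
conn.executemany(
    "INSERT INTO sourceData VALUES (?, ?, ?, ?)",
    [
        ("2019-06-21", "e1", "User Settings - New Password", "v1"),
        ("2019-06-25", "e1", "Settings - User Settings - New Password", "v1"),
        ("2019-06-25", "e1", "Settings - User Settings - New Password", "v2"),
        ("2019-06-04", "e2", "Offer Tab", "v3"),
    ],
)

# The join from the answer; only the select list differs between the two runs.
join_sql = """
    SELECT {cols}
    FROM sourceData s
    INNER JOIN (
        SELECT MAX(eventDate) AS maxEventDate, eventId
        FROM sourceData
        GROUP BY eventId
    ) t ON t.maxEventDate = s.eventDate AND t.eventId = s.eventId
    ORDER BY s.eventId
"""

all_rows = conn.execute(
    join_sql.format(cols="s.eventId, s.eventDescription")
).fetchall()
distinct_rows = conn.execute(
    join_sql.format(cols="DISTINCT s.eventId, s.eventDescription")
).fetchall()

print(len(all_rows), len(distinct_rows))
```

Note that DISTINCT only collapses the ties when the selected columns are identical, so it works with a narrow select list such as eventId and eventDescription but not with `select *`, where visitorId differs between the tied rows.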

Answer 2: (score: 0)

A simple and efficient approach uses a correlated subquery:

select sd.*
from sourceData sd
where sd.eventDate = (select sd2.eventDate
                    from sourceData sd2
                    where sd2.eventId = sd.eventId
                    order by sd2.eventDate desc
                    limit 1
                   );

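This particular formulation assumes that the event date is unique for each event ID. For performance, you want an index on (eventId, eventDate desc), ideally with the desc included in the index definition.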
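
A minimal sketch of the correlated subquery together with such an index, assuming Python's built-in sqlite3 module as a stand-in for MySQL, a cut-down table, made-up eventId values, and a hypothetical index name (`idx_event_latest`). MySQL 8.0 also accepts the `desc` keyword in an index definition.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sourceData (eventDate TEXT, eventId TEXT, eventDescription TEXT)"
)
# Hypothetical sample rows; eventDate is unique within each eventId.
conn.executemany(
    "INSERT INTO sourceData VALUES (?, ?, ?)",
    [
        ("2019-06-10", "e1", "Settings - User Settings_New Password"),
        ("2019-06-21", "e1", "User Settings - New Password"),
        ("2019-06-25", "e1", "Settings - User Settings - New Password"),
        ("2019-06-04", "e2", "Offer Tab"),
    ],
)

# Index matching the subquery's filter (eventId) and sort (eventDate DESC).
# The index name is hypothetical.
conn.execute(
    "CREATE INDEX idx_event_latest ON sourceData (eventId, eventDate DESC)"
)

# Keep a row only if its date equals the newest date for the same eventId.
latest = conn.execute(
    """
    SELECT sd.eventId, sd.eventDescription, sd.eventDate
    FROM sourceData sd
    WHERE sd.eventDate = (
        SELECT sd2.eventDate
        FROM sourceData sd2
        WHERE sd2.eventId = sd.eventId
        ORDER BY sd2.eventDate DESC
        LIMIT 1
    )
    ORDER BY sd.eventId
    """
).fetchall()

for row in latest:
    print(row)
```

If two rows of the same eventId share the latest eventDate, both pass the filter, which is why the uniqueness assumption matters here.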