Select data from 3 tables, grouping by latest date value and other values

Time: 2018-01-10 23:31:53

Tags: sql sql-server

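OK, so I have been racking my brain over this for a while, and I think it is time to ask the collective!

I am using SQL Server and I have 3 tables, defined as follows: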

VolumeData

__________________________
| dataid | currentReading|
--------------------------
|   1    |      22       |
|   7    |      33       |
|   9    |      25       |
|   12   |      12       |
--------------------------

LatestData

________________________________________________________________
| dataid | unitNumber | unitLocation |      dateTimeStamp      |
----------------------------------------------------------------
|   1    |  2344454   |      2       | 2017-07-10 13:16:29.000 |
|   7    |  2344451   |     44       | 2017-07-10 13:22:29.000 |
|   9    |  2344456   |     92       | 2017-07-10 12:16:29.000 |
|   12   |  2344456   |     12       | 2017-07-10 12:13:23.000 |
----------------------------------------------------------------

unitData

____________________________________________________________________________________
| unitNumber | unitLocation | buildingNumber | officeNumber | officeName | country |
------------------------------------------------------------------------------------
|   2344454  |      2       |       44       |       1      |  Telford   |    UK   |
|   2344451  |     44       |       22       |       1      |  Telford   |    UK   |
|   2344456  |     92       |       12       |       2      |  Hamburg   |    GER  |
|   2344456  |     12       |       33       |       2      |  Hamburg   |    GER  |
------------------------------------------------------------------------------------

I need to retrieve the latest currentReading (based on the dateTimeStamp field in LatestData) along with the following fields, grouped by unitNumber:

currentReading, unitNumber, officeName, country, buildingNumber

Note also that records can arrive in any order.

Below is one example of what I have tried. I tried many others, but unfortunately I did not keep them:

SELECT 
      a.currentReading
      ,MAX(b.dateTimeStamp)
      ,c.unitNumber
      ,c.country
      ,c.officeName
  FROM [VolumeData] a INNER JOIN LatestData b ON a.dataid = b.dataid INNER JOIN
    unitData c ON c.[unitNumber] = b.[unitNumber] AND c.[unitLocation] = b.[unitLocation];

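This results in: Column 'VolumeData.currentReading' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Any suggestions would be greatly appreciated! Everything I have tried either retrieves too many rows or results in a logical SQL error. I should also add that these tables contain millions of rows and grow every day, so I am looking for a very efficient way to achieve this.

Thanks!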

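3 answers:

Answer 0: (score: 2)

You can use ROW_NUMBER() to number the rows for each unitNumber by date, newest first. Then you keep only the first row of each partition, which corresponds to the latest date.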

SELECT *
FROM (
    SELECT a.currentReading
        , b.dateTimeStamp
        , c.unitNumber
        , c.country
        , c.officeName
        , ROW_NUMBER() OVER (PARTITION BY c.unitNumber ORDER BY b.dateTimeStamp DESC) AS rowNum
    FROM [VolumeData] a 
    INNER JOIN LatestData b ON a.dataid = b.dataid 
    INNER JOIN unitData c ON c.[unitNumber] = b.[unitNumber] AND c.[unitLocation] = b.[unitLocation]
) a
WHERE rowNum = 1

Answer 1: (score: 1)

Not complete code, just a suggestion: this can be done with the ROW_NUMBER function inside a CTE.

Similar examples:

https://social.msdn.microsoft.com/Forums/sqlserver/en-US/597b876e-eb00-4013-a613-97c377408668/rownumber-and-cte?forum=transactsql

http://datachix.com/2010/02/10/use-a-common-table-expression-and-the-row_number-function-to-eliminate-duplicate-rows-3/

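Just google CTE + ROW_NUMBER for more examples.

So, in the CTE, you join all the required tables and apply ROW_NUMBER over the partition, ordered by dateTimestamp (DESC). Then, in the query that uses the CTE, you filter with WHERE CTE_name.Rank = 1.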
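The approach described above (join inside a CTE, rank with ROW_NUMBER, keep rank 1) could be sketched as follows. Table and column names are taken from the question; the CTE name Ranked and the column alias rn are illustrative choices, not something the answer specifies.

```sql
-- Sketch only: assumes the three tables from the question exist as shown.
-- "Ranked" and "rn" are illustrative names.
WITH Ranked AS
(
    SELECT v.currentReading
         , l.dateTimeStamp
         , u.unitNumber
         , u.officeName
         , u.country
         , u.buildingNumber
           -- Number the rows of each unitNumber, newest timestamp first
         , ROW_NUMBER() OVER (PARTITION BY u.unitNumber
                              ORDER BY l.dateTimeStamp DESC) AS rn
    FROM VolumeData v
    INNER JOIN LatestData l ON l.dataid = v.dataid
    INNER JOIN unitData u ON u.unitNumber = l.unitNumber
                         AND u.unitLocation = l.unitLocation
)
SELECT currentReading, unitNumber, officeName, country, buildingNumber
FROM Ranked
WHERE rn = 1;   -- keep only the latest row per unitNumber
```

With the sample rows from the question this should return one row per unitNumber. For unit 2344456 that is the 12:16:29 reading (currentReading 25, buildingNumber 12). If this is not the first statement in a batch, the statement before WITH has to be terminated with a semicolon.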

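Answer 2: (score: 1)

Same logic as Eric's answer, but using a CTE may be cleaner and joins fewer records: the ranking runs over LatestData alone, so only the latest row per unitNumber needs to be joined to the other two tables.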

DECLARE @VolumeData TABLE
( 
    dataid          int, 
    currentReading  int
); 

INSERT INTO @VolumeData VALUES(1, 22);
INSERT INTO @VolumeData VALUES(7, 33);
INSERT INTO @VolumeData VALUES(9, 25);
INSERT INTO @VolumeData VALUES(12,12);

DECLARE @LatestData TABLE
( 
    dataid          int, 
    unitNumber      int,
    unitLocation    int,
    dateTimeStamp   datetime
); 

INSERT INTO @LatestData VALUES(1,  2344454, 2,  '2017-07-10 13:16:29.000');
INSERT INTO @LatestData VALUES(7,  2344451, 44, '2017-07-10 13:22:29.000');
INSERT INTO @LatestData VALUES(9,  2344456, 92, '2017-07-10 12:16:29.000');
INSERT INTO @LatestData VALUES(12, 2344456, 12, '2017-07-10 12:13:23.000');

DECLARE @UnitData TABLE
( 
    unitNumber      int,
    unitLocation    int,
    buildingNumber  int,
    officeNumber    int,
    officeName      varchar(50),
    country         varchar(50)
); 

INSERT INTO @UnitData VALUES(2344454, 2,  44, 1, 'Telford', 'UK');
INSERT INTO @UnitData VALUES(2344451, 44, 22, 1, 'Telford', 'UK');
INSERT INTO @UnitData VALUES(2344456, 92, 12, 2, 'Hamburg', 'GER');
INSERT INTO @UnitData VALUES(2344456, 12, 33, 2, 'Hamburg', 'GER');

WITH LatestData_CTE (dataid, unitNumber, unitLocation, dateTimeStamp, rowNum)  
AS  
(  
    SELECT  dataid
          , unitNumber
          , unitLocation
          , dateTimeStamp
          , ROW_NUMBER() OVER (PARTITION BY unitNumber ORDER BY dateTimeStamp DESC) AS rowNum
    FROM @LatestData
)  
SELECT currentReading, l.unitNumber, officeName, country, buildingNumber
 FROM LatestData_CTE l 
    INNER JOIN @VolumeData v ON v.dataid = l.dataid 
    INNER JOIN @UnitData u ON u.[unitNumber] = l.[unitNumber] AND u.[unitLocation] = l.[unitLocation]
WHERE l.rowNum = 1
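
Since the question mentions millions of rows, an index whose key order matches the window definition may let SQL Server avoid sorting LatestData for the ROW_NUMBER step. The following is a sketch under that assumption, written against the real LatestData table rather than the table variable used for the demo. The index name is made up, and whether it actually helps depends on the existing indexes and the plan the optimizer picks.

```sql
-- Hypothetical supporting index; the name IX_LatestData_unit_ts is made up.
-- Key order mirrors PARTITION BY unitNumber ORDER BY dateTimeStamp DESC,
-- and INCLUDE covers the other LatestData columns the query reads.
CREATE NONCLUSTERED INDEX IX_LatestData_unit_ts
    ON LatestData (unitNumber, dateTimeStamp DESC)
    INCLUDE (dataid, unitLocation);
```

Comparing the actual execution plan before and after creating it is the only reliable way to tell whether the sort disappears and the query gets cheaper.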