DATENAME causes 'Distinct' to be ignored

Time: 2017-04-07 23:06:59

Tags: sql-server tsql

Adding the DATENAME() function to a query results in duplicate rows, despite "distinct".

TREE - TreeId, CityId, DatePlanted
WATER - WaterId, TreeId(fk), DateWatered

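TREE has a one-to-many relationship with WATER.

Each row in the TREE table represents the planting of one tree. Each row in the WATER table is one instance of watering that tree. A tree is watered many times a year. You get the idea.

I need to return a report showing the number of trees planted and the number of times they were watered, by month.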

SELECT t.CityId
        , COUNT(distinct t.TreeId) as 'Trees Planted'
        , COUNT(w.TreeId) as 'Trees Watered'        
FROM TREE t
JOIN WATER w ON t.TreeId = w.TreeId
WHERE w.DateWatered between @Start AND @End
GROUP BY t.CityId

This works well. However, when I try to group by month, t.TreeId is no longer distinct, so the tree count is too high.

SELECT t.CityId
    , DATENAME(month, w.DateWatered)
        , COUNT(distinct t.TreeId) as 'Trees Planted'
        , COUNT(w.TreeId) as 'Trees Watered'        
FROM TREE t
JOIN WATER w ON t.TreeId = w.TreeId
WHERE w.DateWatered between @Start AND @End
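GROUP BY t.CityId, DATENAME(month, w.DateWatered)

EDIT: I found out why I get duplicates, but I don't know how to fix it. If a tree is watered in April 2016 and then again in May 2016, I get 2 trees planted and 2 waterings, where it should be 1 tree planted and 2 waterings. If I run the first query without returning the date, I get the correct numbers. So once the date is added, even if I group by year and then by month, two waterings of the same tree make the tree show up as planted twice. I'm currently looking into using CTEs to keep each part of the query separate.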
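The behavior described in the edit can be reproduced with a small sketch. The table variables @Tree and @Water, the CityId value, and the three dates are made up for illustration and only mirror the schema above. DISTINCT is applied inside each group, so a tree watered in two months is counted once in each month's row.

```sql
-- Hypothetical sample data: one tree (TreeId 1), watered once in April
-- and once in May 2016.
DECLARE @Tree TABLE (TreeId int, CityId int, DatePlanted date);
DECLARE @Water TABLE (WaterId int, TreeId int, DateWatered date);
INSERT INTO @Tree VALUES (1, 10, '20160315');
INSERT INTO @Water VALUES (1, 1, '20160410'), (2, 1, '20160512');

-- Grouped by city only: a single row with TreesPlanted = 1, TimesWatered = 2.
SELECT t.CityId
     , COUNT(DISTINCT t.TreeId) AS TreesPlanted
     , COUNT(w.TreeId) AS TimesWatered
FROM @Tree AS t
JOIN @Water AS w ON w.TreeId = t.TreeId
GROUP BY t.CityId;

-- Grouped by city and month: one row per month, each with TreesPlanted = 1.
-- DISTINCT is still honored, but only within each (city, month) group,
-- so adding up the monthly rows counts the same tree twice.
SELECT t.CityId
     , DATENAME(month, w.DateWatered) AS MonthWatered
     , COUNT(DISTINCT t.TreeId) AS TreesPlanted
     , COUNT(w.TreeId) AS TimesWatered
FROM @Tree AS t
JOIN @Water AS w ON w.TreeId = t.TreeId
GROUP BY t.CityId, DATENAME(month, w.DateWatered);
```

In both queries the planting is attributed to the month of a watering, not to the month of DatePlanted, which is why counting plantings and waterings in separate steps tends to be easier to reason about.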

2 answers:

Answer 0: (score: 1)

   SELECT t.CityId
       , ISNULL(DATENAME(month, w.DateWatered), DATENAME(month, t.DatePlanted))
       , (SELECT COUNT(tDistinct.TreeId) FROM TREE tDistinct 
        WHERE tDistinct.TreeId = t.TreeId AND DATENAME(month, tDistinct.DatePlanted) = DATENAME(month, w.DateWatered) AND t.DatePlanted between @Start AND @End) as 'Trees Planted'
      , COUNT(w.TreeId) as 'Trees Watered'        
     FROM TREE t
     JOIN WATER w ON t.TreeId = w.TreeId
    WHERE w.DateWatered between @Start AND @End
    GROUP BY t.CityId, DATENAME(month, w.DateWatered), DATENAME(month, t.DatePlanted)

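The only downside here is the scenario where a month has no matching rows and the date would be null, so I added a check for that (ISNULL falls back to the month of DatePlanted). I'm not sure what your data looks like, so it may make sense to drop the ISNULL check in favor of the original grouping.

EDITED: Based on your requirements, I don't think a CTE is necessary. Given the additional information you provided, I've changed the query slightly to fit your needs: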

    SELECT DATENAME(MONTH, myConsolidatedTree.DateAction) as myDate
          ,(SELECT COUNT(*)
              FROM TREE AS t
             WHERE DATENAME(MONTH, myConsolidatedTree.DateAction) = DATENAME(MONTH, t.DatePlanted)
           ) as myNumberOfPlanted
          ,(SELECT COUNT(*)
              FROM WATER AS w
             WHERE DATENAME(MONTH, myConsolidatedTree.DateAction) = DATENAME(MONTH, w.DateWatered)
           ) as myNumberOfWatered
      FROM (
            SELECT t.DatePlanted as DateAction
                  ,t.TreeId as IdAction
                  ,'PLANTED' as TreeAction
              FROM TREE t

            UNION

            SELECT w.DateWatered as DateAction
                  ,w.TreeId as IdAction
                  ,'WATERED' as TreeAction
              FROM WATER w
           ) as myConsolidatedTree
     WHERE myConsolidatedTree.DateAction between @StartDate and @EndDate
     GROUP BY DATENAME(MONTH, myConsolidatedTree.DateAction), DATEPART(MONTH, myConsolidatedTree.DateAction)
     ORDER BY DATEPART(MONTH, myConsolidatedTree.DateAction)

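Although the unioned subquery contains more information than this question needs, I left the extra TreeId and derived TreeAction columns in place in case you need them in the future.

Answer 1: (score: 1)

This demonstrates how to break the problem down into steps using common table expressions (CTEs). Note that you can replace the final select with one of the commented-out selects to see the intermediate results. It's a handy way to test, debug, or understand what is going on.

One problem you have is trying to summarize the data based only on the watering dates. If a tree is planted in a month with no waterings, it won't be counted. The code below summarizes the plantings and the waterings within the date range separately, then combines them into a single result set.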

-- Sample data.
declare @Trees as Table ( TreeId Int Identity, CityId Int, DatePlanted Date );
declare @Waterings as Table ( WateringId Int Identity, TreeId Int, DateWatered Date );
insert into @Trees ( CityId, DatePlanted ) values
  ( 1, '20160115' ), ( 1, '20160118' ),
  ( 1, '20160308' ), ( 1, '20160318' ), ( 1, '20160118' ),
  ( 1, '20170105' ),
  ( 1, '20170205' ),
  ( 1, '20170401' ),
  ( 2, '20160113' ), ( 2, '20160130' ),
  ( 2, '20170226' ), ( 2, '20170227' ), ( 2, '20170228' );
insert into @Waterings ( TreeId, DateWatered ) values
  ( 1, '20160122' ), ( 1, '20160129' ), ( 1, '20160210' ), ( 1, '20160601' ),
  ( 5, '20160120' ), ( 5, '20160127' ), ( 5, '20160215' ), ( 5, '20160301' ), ( 5, '20160515' );
select * from @Trees;
select * from @Waterings;

-- Combine the data.
declare @StartDate as Date = '20100101', @EndDate as Date = '20200101';
with
  -- Each tree with the year and month it was planted.
  TreesPlanted as (
    select CityId, TreeId,
      DatePart( year, DatePlanted ) as YearPlanted,
      DatePart( month, DatePlanted ) as MonthPlanted
      from @Trees
      where @StartDate <= DatePlanted and DatePlanted <= @EndDate ),
  -- Tree plantings summarized by city, year and month.
  TreesPlantedSummary as (
    select CityId, YearPlanted, MonthPlanted, Count( TreeId ) as Trees
      from TreesPlanted
      group by CityId, YearPlanted, MonthPlanted ),
  -- Each watering and the year and month it occurred.
  TreesWatered as (
    select CityId, W.TreeId,
      DatePart( year, W.DateWatered ) as YearWatered,
      DatePart( month, W.DateWatered ) as MonthWatered
      from @Trees as T left outer join
        @Waterings as W on W.TreeId = T.TreeId
      where @StartDate <= W.DateWatered and W.DateWatered <= @EndDate ),
  -- Waterings summarized by city, year and month.
  TreesWateredSummary as (
    select CityId, YearWatered, MonthWatered,
      Count( distinct TreeId ) as Trees, Count( TreeId ) as Waterings
      from TreesWatered
      group by CityId, YearWatered, MonthWatered )
  -- Combine the plantings and waterings for the specified period.
  select Coalesce( TPS.CityId, TWS.CityId ) as CityId,
    Coalesce( TPS.YearPlanted, TWS.YearWatered ) as Year,
    Coalesce( TPS.MonthPlanted, TWS.MonthWatered ) as Month,
    Coalesce( TPS.Trees, 0 ) as TreesPlanted,
    Coalesce( TWS.Trees, 0 ) as TreesWatered,
    Coalesce( TWS.Waterings, 0 ) as Waterings
    from TreesPlantedSummary as TPS full outer join
      TreesWateredSummary as TWS on TWS.CityId = TPS.CityId and
      TWS.YearWatered = TPS.YearPlanted and TWS.MonthWatered = TPS.MonthPlanted
     order by CityId, Year, Month;
-- Alternative queries for testing/debugging/understanding:
--    select * from TreesPlantedSummary order by CityId, YearPlanted, MonthPlanted;
--    select * from TreesWateredSummary order by CityId, YearWatered, MonthWatered;

Now you'll want to include the missing months (the ones with no activity) in the results, eh?
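One possible way to approach that, sketched under assumptions rather than taken from the answer: generate one row per month in the range with a recursive CTE, then left join the data to it so that empty months still produce a row. The names Months and MonthStart, the two sample rows, and the date range are made up for illustration. DATEFROMPARTS requires SQL Server 2012 or later.

```sql
-- Hypothetical sample data: plantings in January and March 2016 only.
DECLARE @Trees TABLE (TreeId int IDENTITY, CityId int, DatePlanted date);
INSERT INTO @Trees (CityId, DatePlanted) VALUES (1, '20160115'), (1, '20160308');
DECLARE @StartDate date = '20160101', @EndDate date = '20160430';

WITH Months AS (
    -- Anchor: first day of the month containing @StartDate.
    SELECT DATEFROMPARTS(YEAR(@StartDate), MONTH(@StartDate), 1) AS MonthStart
    UNION ALL
    -- Recursive step: add one month at a time until @EndDate is passed.
    SELECT DATEADD(month, 1, MonthStart)
    FROM Months
    WHERE DATEADD(month, 1, MonthStart) <= @EndDate
)
SELECT YEAR(M.MonthStart) AS [Year]
     , MONTH(M.MonthStart) AS [Month]
     , COUNT(T.TreeId) AS TreesPlanted   -- counts only matched rows, so empty months show 0
FROM Months AS M
LEFT OUTER JOIN @Trees AS T
    ON T.DatePlanted >= M.MonthStart
   AND T.DatePlanted < DATEADD(month, 1, M.MonthStart)
   AND T.DatePlanted BETWEEN @StartDate AND @EndDate
GROUP BY M.MonthStart
ORDER BY M.MonthStart
OPTION (MAXRECURSION 0);   -- lifts the default limit of 100 recursion levels
```

With this sample data, February and April appear with a count of 0. The same month list could be cross joined to the distinct CityId values and then left joined to both TreesPlantedSummary and TreesWateredSummary in place of the full outer join.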