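Joining multiple tables in SSMS without double-counting records

Time: 2019-02-15 23:47:43

Tags: sql-server

I have several new tables in my data warehouse, and I need to find a way to join them correctly. My end goal is to see each customer's full information based on their first program registration.

Apologies in advance for the long background in this post.

I'm working in SSMS for this. There are 7 relevant tables and three program types (Activity, League, DayCamp). Dummy data is below.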

Individuals

personID  firstname  lastname
1         mark       smith
2         mike       boy

Activity

activityID   activityName   createdDate  activityType
100          skating        01-01-2019   january
200          hockey         01-10-2019   february

ActivityRegistration

activityID  activityName  personID  createdDate  paidAmount
100         skating       1         01-06-2019   10
200         hockey        1         01-12-2019   25
100         skating       2         01-13-2019   10

League

leagueID  leagueName    createdDate   leagueType
1         Adult Hockey  01-10-19      West

LeagueRegistration

leagueID  leagueName   personID  createdDate  paidAmount
1         Adult Hockey 1         01-16-19     100
1         Adult Hockey 2         01-12-19     100

There are also DayCamp and DayCampRegistration tables, set up the same way as the four tables above.

select I.personid, 
       I.firstname, 
       I.lastname,
       'Activity' as Source,
       (isnull(ActivityPay,0) + isnull(LeaguePay,0) + isnull(DCPay,0)) as 'TotalPaid',
       (isnull(TotalActivities,0) + isnull(TotalLeagues,0) + isnull(TotalDCs,0)) as 'TotalRegistrations'
from Individuals I

       left join (
            select PersonID, sum(paidamount) as 'ActivityPay', count(registrationid) as 'TotalActivities'
            from ActivityRegistration
            group by PersonID
                 ) A on I.PersonID = A.PersonID

       left join (
            select personid, sum(PaidAmount) as 'LeaguePay', count(registrationid) as 'TotalLeagues'
            from ro.vw_MaxGalaxy_LeaguePlayerRegistrations
            group by PersonID, ArenaName
                 ) L on I.PersonID = L.PersonID

where I.PersonID in
   (
   select PersonID
   from ActivityRegistration
   where CreatedDate in (
      select
         (
         select min(Event)
         from (values (firstleague), (firstactivity), (firstdaycamp)) as v (Event)
         ) as FirstRegistration
         from
             (
             select i.personid, i.FirstName, i.LastName, min(l.createddate) as 'firstleague', min(a.createddate) as 'firstactivity', min(d.createddate) as 'firstdaycamp'
             from Individuals I
             left join ActivityRegistration A on I.PersonID = A.PersonID
             left join LeaguePlayerRegistration L on I.PersonID = L.PersonID
             left join DayCampRegistration D on I.PersonID = D.PersonID
             group by i.PersonID, i.firstname, i.lastname 
             ) as derived
         )
    )

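This is basically what I came up with. It wrongly assumes createdDate can serve as a unique identifier, and it only looks at one program type at a time (note that it pulls only from ActivityRegistration; in my SSMS environment I UNION it with the equivalent queries for the other two programs). That gets me each person with their total programs and total spend, but it doesn't let me see the first program.

I've tried pulling it other ways, but I keep getting stuck pulling min(createdDate) together with the ActivityID. If I group by ActivityID and PersonID, every ActivityID gets its own min(createdDate).

The end goal is one table that ties all of this information together at the customer level (including a simple 'Activity' as Source line).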

Target table

personID firstName lastName firstProgramSource firstProgramID firstProgramName firstProgramType totalPrograms  totalSpend
1        mark      smith    Activity           100            skating          january          3              135 
2        mike      boy      League             1              Adult Hockey     West             3              110  

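If I haven't rambled too much, is there a way to achieve what I'm attempting?

1 answer:

Answer 0 (score: 1)

You're very close. It looks like you got tangled up in the WHERE clause. A simpler strategy is to collect two different kinds of aggregates: keep SUM/COUNT separate from MIN/MAX. Your query would look something like this: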

select I.personid, 
        I.firstname, 
        I.lastname,
        --'Activity' as Source,
        -- A missing date defaults to 9999-12-31 so a NULL never counts as the earliest; ties go to Activity.
        CASE WHEN A1.FirstDate IS NULL AND L1.FirstDate IS NULL THEN 'Neither'
            WHEN IsNull(A1.FirstDate,'99991231') <= IsNull(L1.FirstDate,'99991231') THEN 'Activity' 
            ELSE 'League'
        END AS FirstProgramSource,
        CASE WHEN A1.FirstDate IS NULL AND L1.FirstDate IS NULL THEN 'Neither'
            WHEN IsNull(A1.FirstDate,'99991231') <= IsNull(L1.FirstDate,'99991231') THEN A1.ActivityName 
            ELSE L1.LeagueName
        END AS FirstProgramName,
        CASE WHEN A1.FirstDate IS NULL AND L1.FirstDate IS NULL THEN 'Neither'
            WHEN IsNull(A1.FirstDate,'99991231') <= IsNull(L1.FirstDate,'99991231') THEN A1.ActivityType 
            ELSE L1.LeagueType
        END AS FirstProgramType,       
        -- DCPay / TotalDCs need a DayCamp aggregate join (same pattern as A and L); add them back once that join exists.
        (isnull(ActivityPay,0) + isnull(LeaguePay,0)) as TotalPaid,
        (isnull(TotalActivities,0) + isnull(TotalLeagues,0)) as TotalRegistrations
from Individuals I

        left join (
            select PersonID, sum(paidamount) as ActivityPay, count(registrationid) as TotalActivities
            from ActivityRegistration
            group by PersonID
                    ) A on I.PersonID = A.PersonID

        left join (
            select PersonID, sum(PaidAmount) as LeaguePay, count(registrationid) as TotalLeagues
            from LeagueRegistration
            group by PersonID--, ArenaName
                    ) L on I.PersonID = L.PersonID

--   Get the "First Activity" separately from your other aggregates (sum, count, etc.).
        left join ( -- returns more than one row per person if two registrations share the same FirstDate
            select AFirst.PersonID, A.ActivityID, A.ActivityName, A.ActivityType, AFirst.FirstDate 
            from (  -- group by PersonID only; adding ActivityID would give one MIN per activity
                SELECT PersonID, Min(CreatedDate) FirstDate 
                FROM ActivityRegistration 
                GROUP BY PersonID
                ) AFirst
                INNER JOIN ActivityRegistration AR ON AFirst.PersonID = AR.PersonID 
                    AND AFirst.FirstDate = AR.CreatedDate
                INNER JOIN Activity A ON AR.ActivityID = A.ActivityID
            ) A1 on I.PersonID = A1.PersonID

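        left join ( -- same pattern for League: first date per person, then back to the matching registration
            select LFirst.PersonID, L.LeagueID, L.LeagueName, L.LeagueType, LFirst.FirstDate 
            from (SELECT PersonID, Min(CreatedDate) FirstDate 
                    FROM LeagueRegistration 
                    GROUP BY PersonID
                    ) LFirst 
                    INNER JOIN LeagueRegistration LR ON LFirst.PersonID = LR.PersonID
                        AND LFirst.FirstDate = LR.CreatedDate
                    INNER JOIN League L ON LR.LeagueID = L.LeagueID
            ) L1 on I.PersonID = L1.PersonID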

Since I don't have your database, I can't be 100% sure this will run on the first try, but it should be close, and you can see the concept.
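Whether one person can have several registrations on their earliest date is something you can check directly. The following is a minimal sketch, not part of the query above: the @ActivityRegistration table variable is a hypothetical stand-in for the real table, loaded with the question's dummy rows plus one assumed extra row that creates a tie.

```sql
-- Hypothetical stand-in for ActivityRegistration (column types are assumptions).
DECLARE @ActivityRegistration TABLE (
    ActivityID  int,
    PersonID    int,
    CreatedDate date,
    PaidAmount  money
);

INSERT INTO @ActivityRegistration (ActivityID, PersonID, CreatedDate, PaidAmount)
VALUES (100, 1, '20190106', 10),
       (200, 1, '20190112', 25),
       (100, 2, '20190113', 10),
       (200, 2, '20190113', 25);  -- assumed extra row, not in the question's data

-- People whose earliest CreatedDate is shared by more than one registration.
SELECT AR.PersonID,
       AR.CreatedDate AS FirstDate,
       COUNT(*)       AS RowsOnFirstDate
FROM @ActivityRegistration AR
INNER JOIN (
    SELECT PersonID, MIN(CreatedDate) AS FirstDate
    FROM @ActivityRegistration
    GROUP BY PersonID
) AFirst
    ON AFirst.PersonID  = AR.PersonID
   AND AFirst.FirstDate = AR.CreatedDate
GROUP BY AR.PersonID, AR.CreatedDate
HAVING COUNT(*) > 1;
```

With these rows only person 2 comes back, with two rows on 2019-01-13. If the same check against the real table returns nothing, joining back on the first date yields one row per person.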

One last thing: if ActivityRegistration has rows where the same person has more than one registration on their FirstDate, you will have to add another GROUP BY:

select I.personid, 
        I.firstname, 
        I.lastname,
        --'Activity' as Source,
        -- A missing date defaults to 9999-12-31 so a NULL never counts as the earliest; ties go to Activity.
        CASE WHEN A1.FirstDate IS NULL AND L1.FirstDate IS NULL THEN 'Neither'
            WHEN IsNull(A1.FirstDate,'99991231') <= IsNull(L1.FirstDate,'99991231') THEN 'Activity' 
            ELSE 'League'
        END AS FirstProgramSource,
        CASE WHEN A1.FirstDate IS NULL AND L1.FirstDate IS NULL THEN 'Neither'
            WHEN IsNull(A1.FirstDate,'99991231') <= IsNull(L1.FirstDate,'99991231') THEN A1.ActivityName 
            ELSE L1.LeagueName
        END AS FirstProgramName,
        CASE WHEN A1.FirstDate IS NULL AND L1.FirstDate IS NULL THEN 'Neither'
            WHEN IsNull(A1.FirstDate,'99991231') <= IsNull(L1.FirstDate,'99991231') THEN A1.ActivityType 
            ELSE L1.LeagueType
        END AS FirstProgramType,       
        -- DCPay / TotalDCs need a DayCamp aggregate join (same pattern as A and L); add them back once that join exists.
        (isnull(ActivityPay,0) + isnull(LeaguePay,0)) as TotalPaid,
        (isnull(TotalActivities,0) + isnull(TotalLeagues,0)) as TotalRegistrations
from Individuals I

        left join (
            select PersonID, sum(paidamount) as ActivityPay, count(registrationid) as TotalActivities
            from ActivityRegistration
            group by PersonID
            ) A on I.PersonID = A.PersonID

        left join (
            select PersonID, sum(PaidAmount) as LeaguePay, count(registrationid) as TotalLeagues
            from LeagueRegistration
            group by PersonID--, ArenaName
            ) L on I.PersonID = L.PersonID

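--   Get the "First Activity" separately from your other aggregates (sum, count, etc.).
        left join ( 
            select AFirst.PersonID, A.ActivityID, A.ActivityName, A.ActivityType, AFirst.FirstDate 
            from (  -- one more GROUP BY: keep the MIN ActivityID among rows that share the FirstDate
                SELECT AFD.PersonID, Min(AR1.ActivityID) ActivityID, AFD.FirstDate
                FROM (
                    -- get the first date
                    SELECT PersonID, Min(CreatedDate) FirstDate 
                    FROM ActivityRegistration 
                    GROUP BY PersonID 
                    ) AFD INNER JOIN ActivityRegistration AR1 
                      ON AFD.PersonID=AR1.PersonID AND AFD.FirstDate=AR1.CreatedDate
                GROUP BY AFD.PersonID, AFD.FirstDate
                ) AFirst                    
                INNER JOIN Activity A ON AFirst.ActivityID = A.ActivityID
            ) A1 on I.PersonID = A1.PersonID

-- same pattern for League, written out so that L1 is defined
        left join ( 
            select LFirst.PersonID, L.LeagueID, L.LeagueName, L.LeagueType, LFirst.FirstDate 
            from (
                SELECT LFD.PersonID, Min(LR1.LeagueID) LeagueID, LFD.FirstDate
                FROM (
                    SELECT PersonID, Min(CreatedDate) FirstDate 
                    FROM LeagueRegistration 
                    GROUP BY PersonID 
                    ) LFD INNER JOIN LeagueRegistration LR1 
                      ON LFD.PersonID=LR1.PersonID AND LFD.FirstDate=LR1.CreatedDate
                GROUP BY LFD.PersonID, LFD.FirstDate
                ) LFirst                    
                INNER JOIN League L ON LFirst.LeagueID = L.LeagueID
            ) L1 on I.PersonID = L1.PersonID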
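
As an alternative to comparing the program types pairwise in CASE expressions, you could stack all registrations with UNION ALL and rank them per person with ROW_NUMBER(). This is a different technique from the joins above, shown only as a sketch under assumptions: the table variables are hypothetical stand-ins loaded with the question's dummy data, the column types are guesses, and the ProgramSource, ProgramID, ProgramName and ProgramType names are introduced here for illustration.

```sql
-- Hypothetical stand-ins for the real tables (column types are assumptions).
DECLARE @Individuals TABLE (PersonID int, FirstName varchar(50), LastName varchar(50));
DECLARE @Activity TABLE (ActivityID int, ActivityName varchar(50), ActivityType varchar(50));
DECLARE @ActivityRegistration TABLE (ActivityID int, PersonID int, CreatedDate date, PaidAmount money);
DECLARE @League TABLE (LeagueID int, LeagueName varchar(50), LeagueType varchar(50));
DECLARE @LeagueRegistration TABLE (LeagueID int, PersonID int, CreatedDate date, PaidAmount money);

INSERT INTO @Individuals VALUES (1, 'mark', 'smith'), (2, 'mike', 'boy');
INSERT INTO @Activity VALUES (100, 'skating', 'january'), (200, 'hockey', 'february');
INSERT INTO @ActivityRegistration VALUES
    (100, 1, '20190106', 10), (200, 1, '20190112', 25), (100, 2, '20190113', 10);
INSERT INTO @League VALUES (1, 'Adult Hockey', 'West');
INSERT INTO @LeagueRegistration VALUES (1, 1, '20190116', 100), (1, 2, '20190112', 100);

WITH AllRegs AS (
    -- One row per registration, in a common shape across program types.
    SELECT AR.PersonID, 'Activity' AS ProgramSource, A.ActivityID AS ProgramID,
           A.ActivityName AS ProgramName, A.ActivityType AS ProgramType,
           AR.CreatedDate, AR.PaidAmount
    FROM @ActivityRegistration AR
    INNER JOIN @Activity A ON A.ActivityID = AR.ActivityID
    UNION ALL
    SELECT LR.PersonID, 'League', L.LeagueID, L.LeagueName, L.LeagueType,
           LR.CreatedDate, LR.PaidAmount
    FROM @LeagueRegistration LR
    INNER JOIN @League L ON L.LeagueID = LR.LeagueID
    -- a DayCamp branch would be a third UNION ALL of the same shape
),
Ranked AS (
    SELECT *,
           -- rn = 1 marks the earliest registration; the extra sort keys break date ties
           ROW_NUMBER() OVER (PARTITION BY PersonID
                              ORDER BY CreatedDate, ProgramSource, ProgramID) AS rn,
           COUNT(*)        OVER (PARTITION BY PersonID) AS TotalPrograms,
           SUM(PaidAmount) OVER (PARTITION BY PersonID) AS TotalSpend
    FROM AllRegs
)
SELECT I.PersonID, I.FirstName, I.LastName,
       R.ProgramSource AS FirstProgramSource,
       R.ProgramID     AS FirstProgramID,
       R.ProgramName   AS FirstProgramName,
       R.ProgramType   AS FirstProgramType,
       ISNULL(R.TotalPrograms, 0) AS TotalPrograms,
       ISNULL(R.TotalSpend, 0)    AS TotalSpend
FROM @Individuals I
LEFT JOIN Ranked R ON R.PersonID = I.PersonID AND R.rn = 1;
```

With the dummy rows, person 1's first program is the skating activity and person 2's is the Adult Hockey league. Because the totals are window aggregates over the same stacked rows, each registration is counted once, and a tie on CreatedDate is settled by the ORDER BY rather than by an extra GROUP BY.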