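I have several new tables in my data warehouse and need to work out the right way to join them. My end goal is to see all of a customer's information based on their first program registration.
Apologies in advance, because this post needs a lot of background.
I'm working in SSMS. There are 7 relevant tables and three program types (activities, leagues, day camps). Dummy data is below.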
Individuals
personID  firstname  lastname
1         mark       smith
2         mike       boy
Activities
activityID  activityName  createdDate  activityType
100         skating       01-01-2019   january
200         hockey        01-10-2019   february
Activity Registrations
activityID  activityName  personID  createdDate  paidAmount
100         skating       1         01-06-2019   10
200         hockey        1         01-12-2019   25
100         skating       2         01-13-2019   10
Leagues
leagueID  leagueName    createdDate  leagueType
1         Adult Hockey  01-10-19     West
League Registrations
leagueID  leagueName    personID  createdDate  paidAmount
1         Adult Hockey  1         01-16-19     100
1         Adult Hockey  2         01-12-19     100
There are also day camp and day camp registration tables, set up the same way as the four program tables above.
select I.personid,
I.firstname,
I.lastname,
'Activity' as Source,
(isnull(ActivityPay,0) + isnull(LeaguePay,0) + isnull(DCPay,0)) as 'TotalPaid',
(isnull(TotalActivities,0) + isnull(TotalLeagues,0) + isnull(TotalDCs,0)) as 'TotalRegistrations'
from Individuals I
left join (
select PersonID, sum(paidamount) as 'ActivityPay', count(registrationid) as 'TotalActivities'
from ActivityRegistration
group by PersonID
) A on I.PersonID = A.PersonID
left join (
select personid, sum(PaidAmount) as 'LeaguePay', count(registrationid) as 'TotalLeagues'
from ro.vw_MaxGalaxy_LeaguePlayerRegistrations
group by PersonID, ArenaName
) L on I.PersonID = L.PersonID
where I.PersonID in
(
select PersonID
from ActivityRegistration
where CreatedDate in (
select
(
select min(Event)
from (values (firstleague), (firstactivity), (firstdaycamp)) as v (Event)
) as FirstRegistration
from
(
select i.personid, i.FirstName, i.LastName, min(l.createddate) as 'firstleague', min(a.createddate) as 'firstactivity', min(d.createddate) as 'firstdaycamp'
from Individuals I
left join ActivityRegistration A on I.PersonID = A.PersonID
left join LeaguePlayerRegistration L on I.PersonID = L.PersonID
left join DayCampRegistration D on I.PersonID = D.PersonID
group by i.PersonID, i.firstname, i.lastname
) as derived
)
)
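This is basically what I've come up with. It wrongly assumes that createdDate can be used as a unique identifier, and it only looks at one program type at a time (note that it only pulls from ActivityRegistration; in my SSMS environment I UNION it with the equivalent queries for the other two program types). It gives me each person with their total programs and total spend, but it doesn't let me see their first program.
I've tried pulling it other ways, but I keep getting stuck on pulling min(createdDate) together with ActivityID. If I group by ActivityID and PersonID, I get a min(createdDate) for every ActivityID rather than one per person.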
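A minimal illustration of that problem, using an inline copy of the dummy ActivityRegistration rows (the CTE only stands in for the real table, and the date type is an assumption):

```sql
-- Inline copy of the dummy ActivityRegistration rows; stands in for the real table.
WITH ActivityRegistration AS (
    SELECT * FROM (VALUES
        (100, 'skating', 1, CAST('20190106' AS date), 10),
        (200, 'hockey',  1, CAST('20190112' AS date), 25),
        (100, 'skating', 2, CAST('20190113' AS date), 10))
        AS v (activityID, activityName, personID, createdDate, paidAmount)
)
-- Grouping by activityID as well produces one "first date" per person per activity.
SELECT personID, activityID, MIN(createdDate) AS firstDate
FROM ActivityRegistration
GROUP BY personID, activityID;
```

Person 1 comes back twice here, once per activity, so I still can't tell which registration was actually first.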
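The end goal is a single table that ties all of this information together at the customer level (and includes the source as a simple column, like the 'Activity' as Source line).
Target table
personID  firstName  lastName  firstProgramSource  firstProgramID  firstProgramName  firstProgramType  totalPrograms  totalSpend
1         mark       smith     Activity            100             skating           january           3              135
2         mike       boy       League              1               Adult Hockey      West              3              110
If I haven't rambled too much, is there a way to accomplish what I'm attempting?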
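Answer 0 (score: 1)
You're very close. It looks like you got stuck in the WHERE clause. A simpler strategy is to collect two different kinds of aggregate separately: keep the SUM/COUNT apart from the MIN/MAX. Your query would look something like this: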
select I.personid,
       I.firstname,
       I.lastname,
       --'Activity' as Source,
       -- A missing date defaults to a far-future sentinel (assumes a date/datetime column),
       -- so a program type the person never registered for cannot be picked as "first".
       -- Equal dates, or no registrations at all, fall through to 'Neither'.
       CASE WHEN IsNull(A1.FirstDate,'99991231') < IsNull(L1.FirstDate,'99991231') THEN 'Activity'
            WHEN IsNull(A1.FirstDate,'99991231') > IsNull(L1.FirstDate,'99991231') THEN 'League'
            ELSE 'Neither'
       END AS FirstProgramSource,
       CASE WHEN IsNull(A1.FirstDate,'99991231') < IsNull(L1.FirstDate,'99991231') THEN A1.ActivityName
            WHEN IsNull(A1.FirstDate,'99991231') > IsNull(L1.FirstDate,'99991231') THEN L1.LeagueName
            ELSE 'Neither'
       END AS FirstProgramName,
       CASE WHEN IsNull(A1.FirstDate,'99991231') < IsNull(L1.FirstDate,'99991231') THEN A1.ActivityType
            WHEN IsNull(A1.FirstDate,'99991231') > IsNull(L1.FirstDate,'99991231') THEN L1.LeagueType
            ELSE 'Neither'
       END AS FirstProgramType,
       (isnull(ActivityPay,0) + isnull(LeaguePay,0) + isnull(DCPay,0)) as TotalPaid,
       (isnull(TotalActivities,0) + isnull(TotalLeagues,0) + isnull(TotalDCs,0)) as TotalRegistrations
from Individuals I
left join (
    select PersonID, sum(paidamount) as ActivityPay, count(registrationid) as TotalActivities
    from ActivityRegistration
    group by PersonID
) A on I.PersonID = A.PersonID
left join (
    select PersonID, sum(PaidAmount) as LeaguePay, count(registrationid) as TotalLeagues
    from LeagueRegistration
    group by PersonID --, ArenaName
) L on I.PersonID = L.PersonID
-- Day camp totals; assumes DayCampRegistration mirrors the other registration tables.
left join (
    select PersonID, sum(PaidAmount) as DCPay, count(registrationid) as TotalDCs
    from DayCampRegistration
    group by PersonID
) D on I.PersonID = D.PersonID
-- Get the "First Activity" separately from your other aggregates (sum, count, etc.).
-- A tie on FirstDate returns one row per tied registration.
left join (
    select AFirst.PersonID, A.ActivityID, A.ActivityName, A.ActivityType, AFirst.FirstDate
    from (
        -- SELECT PersonID, ActivityID, Min(CreatedDate) FirstDate
        SELECT PersonID, Min(CreatedDate) FirstDate
        FROM ActivityRegistration
        GROUP BY PersonID --, ActivityID
    ) AFirst
    INNER JOIN ActivityRegistration AR ON AFirst.PersonID = AR.PersonID
                                      AND AFirst.FirstDate = AR.CreatedDate
    INNER JOIN Activity A ON AR.ActivityID = A.ActivityID
) A1 on I.PersonID = A1.PersonID
-- Same pattern for the first league: first date per person, then join back for the details.
left join (
    select LFirst.PersonID, L.LeagueID, L.LeagueName, L.LeagueType, LFirst.FirstDate
    from (
        SELECT PersonID, Min(CreatedDate) FirstDate
        FROM LeagueRegistration
        GROUP BY PersonID
    ) LFirst
    INNER JOIN LeagueRegistration LR ON LFirst.PersonID = LR.PersonID
                                    AND LFirst.FirstDate = LR.CreatedDate
    INNER JOIN League L ON LR.LeagueID = L.LeagueID
) L1 on I.PersonID = L1.PersonID
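Since I don't have your database, I can't be 100% sure this will run on the first try, but it should be close, and you can see the concept.
One last thing: if you have ActivityRegistration rows where a person has more than one row on their FirstDate, you'll have to add another GROUP BY level: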
select I.personid,
       I.firstname,
       I.lastname,
       --'Activity' as Source,
       -- A missing date defaults to a far-future sentinel (assumes a date/datetime column),
       -- so a program type the person never registered for cannot be picked as "first".
       CASE WHEN IsNull(A1.FirstDate,'99991231') < IsNull(L1.FirstDate,'99991231') THEN 'Activity'
            WHEN IsNull(A1.FirstDate,'99991231') > IsNull(L1.FirstDate,'99991231') THEN 'League'
            ELSE 'Neither'
       END AS FirstProgramSource,
       CASE WHEN IsNull(A1.FirstDate,'99991231') < IsNull(L1.FirstDate,'99991231') THEN A1.ActivityName
            WHEN IsNull(A1.FirstDate,'99991231') > IsNull(L1.FirstDate,'99991231') THEN L1.LeagueName
            ELSE 'Neither'
       END AS FirstProgramName,
       CASE WHEN IsNull(A1.FirstDate,'99991231') < IsNull(L1.FirstDate,'99991231') THEN A1.ActivityType
            WHEN IsNull(A1.FirstDate,'99991231') > IsNull(L1.FirstDate,'99991231') THEN L1.LeagueType
            ELSE 'Neither'
       END AS FirstProgramType,
       (isnull(ActivityPay,0) + isnull(LeaguePay,0) + isnull(DCPay,0)) as TotalPaid,
       (isnull(TotalActivities,0) + isnull(TotalLeagues,0) + isnull(TotalDCs,0)) as TotalRegistrations
from Individuals I
left join (
    select PersonID, sum(paidamount) as ActivityPay, count(registrationid) as TotalActivities
    from ActivityRegistration
    group by PersonID
) A on I.PersonID = A.PersonID
left join (
    select PersonID, sum(PaidAmount) as LeaguePay, count(registrationid) as TotalLeagues
    from LeagueRegistration
    group by PersonID --, ArenaName
) L on I.PersonID = L.PersonID
-- Day camp totals; assumes DayCampRegistration mirrors the other registration tables.
left join (
    select PersonID, sum(PaidAmount) as DCPay, count(registrationid) as TotalDCs
    from DayCampRegistration
    group by PersonID
) D on I.PersonID = D.PersonID
-- Get the "First Activity" separately from your other aggregates (sum, count, etc.).
left join (
    select AFirst.PersonID, A.ActivityID, A.ActivityName, A.ActivityType, AFirst.FirstDate
    from ( -- one more GROUP BY to keep only the MIN ActivityID for any FirstDate
        SELECT AFD.PersonID, Min(AR1.ActivityID) ActivityID, AFD.FirstDate
        FROM (
            -- get the first date
            SELECT PersonID, Min(CreatedDate) FirstDate
            FROM ActivityRegistration
            GROUP BY PersonID
        ) AFD
        INNER JOIN ActivityRegistration AR1
            ON AFD.PersonID = AR1.PersonID AND AFD.FirstDate = AR1.CreatedDate
        GROUP BY AFD.PersonID, AFD.FirstDate
    ) AFirst
    INNER JOIN Activity A ON AFirst.ActivityID = A.ActivityID
) A1 on I.PersonID = A1.PersonID
-- similar pattern for League
left join (
    select LFirst.PersonID, L.LeagueID, L.LeagueName, L.LeagueType, LFirst.FirstDate
    from (
        SELECT LFD.PersonID, Min(LR1.LeagueID) LeagueID, LFD.FirstDate
        FROM (
            SELECT PersonID, Min(CreatedDate) FirstDate
            FROM LeagueRegistration
            GROUP BY PersonID
        ) LFD
        INNER JOIN LeagueRegistration LR1
            ON LFD.PersonID = LR1.PersonID AND LFD.FirstDate = LR1.CreatedDate
        GROUP BY LFD.PersonID, LFD.FirstDate
    ) LFirst
    INNER JOIN League L ON LFirst.LeagueID = L.LeagueID
) L1 on I.PersonID = L1.PersonID
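If the nested MIN-and-join-back gets hard to follow, another way to sketch the same idea is to stack every registration with UNION ALL and number the rows per person with ROW_NUMBER(). This is only a sketch under assumptions: the first five CTEs are inline copies of the question's dummy rows (so the snippet runs on its own), the column types and the tie-break order (programSource, then programID) are my choices, and day camps are left out because no rows were shown for them.

```sql
WITH Individuals AS (
    SELECT * FROM (VALUES (1, 'mark', 'smith'), (2, 'mike', 'boy'))
        AS v (personID, firstName, lastName)
),
Activity AS (
    SELECT * FROM (VALUES (100, 'skating', 'january'), (200, 'hockey', 'february'))
        AS v (activityID, activityName, activityType)
),
ActivityRegistration AS (
    SELECT * FROM (VALUES
        (100, 1, CAST('20190106' AS date), 10),
        (200, 1, CAST('20190112' AS date), 25),
        (100, 2, CAST('20190113' AS date), 10))
        AS v (activityID, personID, createdDate, paidAmount)
),
League AS (
    SELECT * FROM (VALUES (1, 'Adult Hockey', 'West'))
        AS v (leagueID, leagueName, leagueType)
),
LeagueRegistration AS (
    SELECT * FROM (VALUES
        (1, 1, CAST('20190116' AS date), 100),
        (1, 2, CAST('20190112' AS date), 100))
        AS v (leagueID, personID, createdDate, paidAmount)
),
-- Stack every registration into one shape, tagging each row with its source.
AllRegistrations AS (
    SELECT ar.personID, 'Activity' AS programSource, a.activityID AS programID,
           a.activityName AS programName, a.activityType AS programType,
           ar.createdDate, ar.paidAmount
    FROM ActivityRegistration ar
    INNER JOIN Activity a ON a.activityID = ar.activityID
    UNION ALL
    SELECT lr.personID, 'League', l.leagueID, l.leagueName, l.leagueType,
           lr.createdDate, lr.paidAmount
    FROM LeagueRegistration lr
    INNER JOIN League l ON l.leagueID = lr.leagueID
    -- A third SELECT for the day camp tables would be added here with UNION ALL.
),
-- Number each person's registrations by date and compute per-person totals.
Ranked AS (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY r.personID
                              ORDER BY r.createdDate, r.programSource, r.programID) AS rn,
           COUNT(*) OVER (PARTITION BY r.personID) AS totalPrograms,
           SUM(r.paidAmount) OVER (PARTITION BY r.personID) AS totalSpend
    FROM AllRegistrations r
)
SELECT i.personID, i.firstName, i.lastName,
       r.programSource AS firstProgramSource,
       r.programID     AS firstProgramID,
       r.programName   AS firstProgramName,
       r.programType   AS firstProgramType,
       ISNULL(r.totalPrograms, 0) AS totalPrograms,
       ISNULL(r.totalSpend, 0)    AS totalSpend
FROM Individuals i
LEFT JOIN Ranked r ON r.personID = i.personID AND r.rn = 1
ORDER BY i.personID;
```

Against the dummy rows this picks Activity / skating for person 1 and League / Adult Hockey for person 2. To run it on the real data, drop the sample-data CTEs and point AllRegistrations at your own table names. Because ROW_NUMBER() always assigns exactly one row number 1 per person, ties on createdDate never duplicate a customer.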