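I have a view, CommunityMembers, in which each member has a primary key, ID. Some members also have an old ID from another system, and some have a spouse ID. All IDs are unique.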
e.g:
ID | Name | OldID | SpouseID | SpouseName
1 | John.Smith | o71 | s99 | Jenna.Smith
2 | Jane.Doe | o72 | |
3 | Jessie.Jones | | |
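I also have a view, ActivityDates, in which each community member can have multiple activity dates. There are also activity dates recorded against old IDs and spouse IDs. (Unfortunately, I cannot clean up the data by converting the old IDs to new IDs.)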
e.g:
ID | ActivityDate | ActiviyType | ActivityGroup
1 | 2017-12-31 | 1 | 1
1 | 2017-12-31 | 3 | 2
1 | 2017-12-31 | 7 | 1
2 | 2017-12-31 | 1 | 1
3 | 2017-12-31 | 1 | 1
o72 | 2010-12-31 | 1 | 2
o72 | 2010-12-31 | 3 | 1
s99 | 2017-12-31 | 1 | 1
s99 | 2017-12-31 | 2 | 1
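I can select the data the way I need with the query below, where a CASE expression repeats the subqueries three times to check the three possible IDs. It is very slow, though, because the subqueries are run again for every record: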
SELECT
    C.ID,
    C.Name,
    C.OldID,
    C.SpouseID,
    C.SpouseName,
    CASE
        -- each branch: the ID has a type 1 / group 1 activity after 2016 and no "other" activity
        WHEN C.ID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActivityGroup = 1)
            AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ID = C.ID AND ActivityDate > '2016-12-31' AND ActiviyType > 1 AND ActivityGroup > 1)
        OR C.OldID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActivityGroup = 1)
            AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ID = C.OldID AND ActivityDate > '2016-12-31' AND ActiviyType > 1 AND ActivityGroup > 1)
        OR C.SpouseID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActivityGroup = 1)
            AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ID = C.SpouseID AND ActivityDate > '2016-12-31' AND ActiviyType > 1 AND ActivityGroup > 1)
        THEN 'Yes'
        ELSE ''
    END AS Result -- i.e. HasTheCommunityMemberOrTheirSpouseOnlyEverAttendedActivityTypeAndGroup1After2016?
FROM CommunityMembers C
So I expect the following results, which I do get, but slowly:
ID | Name | OldID | SpouseID | SpouseName | Result
1 | John.Smith | o71 | s99 | Jenna.Smith |
2 | Jane.Doe | o72 | | | Yes
3 | Jessie.Jones | | | | Yes
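I appreciate there are better ways to do this and I am happy to hear suggestions, but I have limited flexibility to change this system, so what I am really asking is: how can I make this faster? Ideally I would like to join to the table and apply conditions, but I cannot work out how, e.g.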
SELECT
C.ID, C.Name,
C.OldID, C.SpouseID, C.SpouseName,
R.Result
FROM
CommunityMembers C
JOIN
CASE WHEN Date ... Type ... Group ... ELSE ... IN ... Not Exist ... THEN ... ActivityDates R
or
SELECT
C.ID, C.Name,
C.OldID, C.SpouseID, C.SpouseName,
CASE
WHEN R.Date ... R.Type ... R.Group ... ELSE ... THEN 'Yes' END AS Result
FROM
CommunityMembers C
JOIN
ActivityDates R
I suspect I need multiple joins, but I don't know how to write them.
Thanks
Answer 0 (score: 0)
Answer 1 (score: 0)
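Here is another pattern using "optional joins", which may or may not perform better. It does not produce exactly the same output as yours; I am not sure what you are after there.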
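SELECT A.*,
    COALESCE(C1.Name, C2.Name, C3.Name) AS Name
FROM ActivityDates A
LEFT OUTER JOIN CommunityMembers AS C1
    ON C1.ID = A.ID
LEFT OUTER JOIN CommunityMembers AS C2
    ON C2.OldID = CAST(A.ID AS VARCHAR(12))
LEFT OUTER JOIN CommunityMembers AS C3
    ON C3.SpouseID = CAST(A.ID AS VARCHAR(12))
This will "double count" in some cases, but if you are sure the whole set of IDs is unique then you should be fine. If you only want to know whether a given activity record exists, you could use an existence check such as EXISTS to speed things up, but I don't follow your logic.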
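To see what the optional-join pattern returns on the question's sample data, here is a small sketch using Python's built-in sqlite3 module. SQLite, the TEXT ID columns and the table definitions are my assumptions, not the original system, so the CAST is dropped here.

```python
import sqlite3

# Assumption: an in-memory SQLite database with TEXT IDs stands in for the real views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CommunityMembers (ID TEXT, Name TEXT, OldID TEXT, SpouseID TEXT, SpouseName TEXT);
CREATE TABLE ActivityDates (ID TEXT, ActivityDate TEXT, ActiviyType INTEGER, ActivityGroup INTEGER);
INSERT INTO CommunityMembers VALUES
  ('1', 'John.Smith', 'o71', 's99', 'Jenna.Smith'),
  ('2', 'Jane.Doe', 'o72', NULL, NULL),
  ('3', 'Jessie.Jones', NULL, NULL, NULL);
INSERT INTO ActivityDates VALUES
  ('1', '2017-12-31', 1, 1), ('1', '2017-12-31', 3, 2), ('1', '2017-12-31', 7, 1),
  ('2', '2017-12-31', 1, 1), ('3', '2017-12-31', 1, 1),
  ('o72', '2010-12-31', 1, 2), ('o72', '2010-12-31', 3, 1),
  ('s99', '2017-12-31', 1, 1), ('s99', '2017-12-31', 2, 1);
""")

# Each activity row is matched against the member's ID, OldID and SpouseID in turn.
rows = conn.execute("""
SELECT A.ID, A.ActivityDate, COALESCE(C1.Name, C2.Name, C3.Name) AS Name
FROM ActivityDates A
LEFT JOIN CommunityMembers C1 ON C1.ID = A.ID
LEFT JOIN CommunityMembers C2 ON C2.OldID = A.ID
LEFT JOIN CommunityMembers C3 ON C3.SpouseID = A.ID
ORDER BY A.ID, A.ActiviyType
""").fetchall()

# Map every activity ID to the member it was resolved to.
owner = {activity_id: name for activity_id, _, name in rows}
for activity_id, name in sorted(owner.items()):
    print(activity_id, "->", name)
```

Note that the spouse's rows (s99) resolve to the member's Name, not to SpouseName, because the join goes through CommunityMembers.SpouseID.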
Answer 2 (score: 0)
You need information from the table ActivityDates for each ID. So group by ID and filter for the desired IDs in HAVING:
SELECT ID
FROM ActivityDates
WHERE ActivityDate > '2016-12-31'
GROUP BY ID
-- at least one type 1 / group 1 row, and no row with both type > 1 and group > 1
HAVING COUNT(CASE WHEN ActiviyType = 1 AND ActivityGroup = 1 THEN 1 END) > 0
AND COUNT(CASE WHEN ActiviyType > 1 AND ActivityGroup > 1 THEN 1 END) = 0
You can use this with an EXISTS clause:
select
    c.*,
    case when exists
    (
        SELECT a.ID
        FROM ActivityDates a
        WHERE a.ActivityDate > '2016-12-31'
        AND a.ID in (c.id, c.oldid, c.spouseid)
        GROUP BY a.ID
        HAVING COUNT(CASE WHEN ActiviyType = 1 AND ActivityGroup = 1 THEN 1 END) > 0
        AND COUNT(CASE WHEN ActiviyType > 1 AND ActivityGroup > 1 THEN 1 END) = 0
    ) then 'Yes' else '' end as result
from CommunityMembers c;
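Appropriate indexes to speed this up might be
create index idx1 on ActivityDates (ID, ActivityDate, ActiviyType, ActivityGroup);
create index idx2 on ActivityDates (ActivityDate, ID, ActiviyType, ActivityGroup);
Check which of the two actually gets used and drop the other (or drop both if neither is used). Since ActivityDates is a view, the indexes would presumably have to go on its underlying table.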
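How you check index usage depends on the database. As a rough sketch only, this is what the check looks like in SQLite (my assumption, via Python's sqlite3 module) with EXPLAIN QUERY PLAN; other systems have their own execution-plan tools. The table here is an empty stand-in, so the planner's choice says nothing about the real data.

```python
import sqlite3

# Assumption: a plain SQLite table stands in for whatever sits behind the ActivityDates view.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ActivityDates (ID TEXT, ActivityDate TEXT, ActiviyType INTEGER, ActivityGroup INTEGER);
CREATE INDEX idx1 ON ActivityDates (ID, ActivityDate, ActiviyType, ActivityGroup);
CREATE INDEX idx2 ON ActivityDates (ActivityDate, ID, ActiviyType, ActivityGroup);
""")

# Ask SQLite how it would execute the grouping query, without running it.
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT ID
FROM ActivityDates
WHERE ActivityDate > '2016-12-31'
GROUP BY ID
HAVING COUNT(CASE WHEN ActiviyType = 1 AND ActivityGroup = 1 THEN 1 END) > 0
""").fetchall()

# The last column of each plan row is a text description that names any index used.
used = [name for name in ("idx1", "idx2") if any(name in row[-1] for row in plan)]
for row in plan:
    print(row[-1])
print("indexes mentioned in the plan:", used)
```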
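Using a non-correlated subquery (which means we have to access it several times) might perform better. Whether this even results in a different execution plan depends on the optimizer: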
with good_ids as
(
    select id
    from activitydates
    where activitydate > '2016-12-31'
    group by id
    having count(case when activiytype = 1 and activitygroup = 1 then 1 end) > 0
    and count(case when activiytype > 1 and activitygroup > 1 then 1 end) = 0
)
select
    c.*,
    case when id in (select id from good_ids)
    or oldid in (select id from good_ids)
    or spouseid in (select id from good_ids)
    then 'Yes' else ''
    end as result
from CommunityMembers c;
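The business rule in the question is ambiguous, so it may help to run the good_ids idea against the sample data. The sketch below uses Python's sqlite3 module (SQLite and the table definitions are my assumptions) and compares two readings of the "other activity" test: type > 1 AND group > 1, as the question's query has it, versus type > 1 OR group > 1. It also requires at least one type 1 / group 1 row (count > 0), since members 2 and 3 have exactly one such row and are expected to be 'Yes'.

```python
import sqlite3

# Assumption: in-memory SQLite tables stand in for the two views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CommunityMembers (ID TEXT, Name TEXT, OldID TEXT, SpouseID TEXT, SpouseName TEXT);
CREATE TABLE ActivityDates (ID TEXT, ActivityDate TEXT, ActiviyType INTEGER, ActivityGroup INTEGER);
INSERT INTO CommunityMembers VALUES
  ('1', 'John.Smith', 'o71', 's99', 'Jenna.Smith'),
  ('2', 'Jane.Doe', 'o72', NULL, NULL),
  ('3', 'Jessie.Jones', NULL, NULL, NULL);
INSERT INTO ActivityDates VALUES
  ('1', '2017-12-31', 1, 1), ('1', '2017-12-31', 3, 2), ('1', '2017-12-31', 7, 1),
  ('2', '2017-12-31', 1, 1), ('3', '2017-12-31', 1, 1),
  ('o72', '2010-12-31', 1, 2), ('o72', '2010-12-31', 3, 1),
  ('s99', '2017-12-31', 1, 1), ('s99', '2017-12-31', 2, 1);
""")

# {op} is filled with AND or OR to compare the two readings of the rule.
QUERY = """
WITH good_ids AS (
    SELECT ID
    FROM ActivityDates
    WHERE ActivityDate > '2016-12-31'
    GROUP BY ID
    HAVING COUNT(CASE WHEN ActiviyType = 1 AND ActivityGroup = 1 THEN 1 END) > 0
       AND COUNT(CASE WHEN ActiviyType > 1 {op} ActivityGroup > 1 THEN 1 END) = 0
)
SELECT c.ID,
       CASE WHEN c.ID IN (SELECT ID FROM good_ids)
              OR c.OldID IN (SELECT ID FROM good_ids)
              OR c.SpouseID IN (SELECT ID FROM good_ids)
            THEN 'Yes' ELSE '' END AS Result
FROM CommunityMembers c
ORDER BY c.ID
"""

def results(op):
    """Run the query with the given boolean operator in the 'other activity' test."""
    return conn.execute(QUERY.format(op=op)).fetchall()

and_rows = results("AND")
or_rows = results("OR")
print("AND:", and_rows)
print("OR: ", or_rows)
```

On this data only the OR reading reproduces the expected table (blank for member 1). With AND, spouse ID s99 still counts as good because its type 2 row is in group 1, so member 1 comes out as 'Yes'.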
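Answer 3 (score: 0)
You should try to explain the output. It is hard to work out the correct business rule from an incorrect query.
That way you will get the best query here. Just try again to explain why IDs 2 and 3 are 'Yes'; then I will rewrite my query.
The second big mistake you are about to make is creating indexes without understanding your business rule and without first writing the correct query.
Try this,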
declare @t table(ID varchar(20),Name varchar(40),OldID varchar(20), SpouseID varchar(20)
, SpouseName varchar(40))
insert into @t VALUES
('1','John.Smith','o71' ,'s99','Jenna.Smith')
,('2','Jane.Doe' ,'o72',null,null)
,('3','Jessie.Jones',null,null,null)
--select * from @t
declare @ActivityDates table(ID varchar(20), ActivityDate date
, ActiviyType int, ActivityGroup int)
insert into @ActivityDates VALUES
('1','2017-12-31',1, 1)
,('1','2017-12-31',3, 2)
,('1','2017-12-31',7, 1)
,('2','2017-12-31',1, 1)
,('3','2017-12-31',1, 1)
,('o72','2010-12-31',1, 2)
,('o72','2010-12-31',3, 1)
,('s99','2017-12-31',1, 1)
,('s99','2017-12-31',2, 1)
SELECT t.*
,case when tbl.id is not null then 'Yes' else null end Remarks
from @t t
left JOIN
(select * from @ActivityDates AD
WHERE(( ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActivityGroup = 1
AND NOT EXISTS (SELECT ID FROM @ActivityDates ad1 WHERE (ad.id=ad1.id) AND
ActivityDate > '2016-12-31' AND (ActiviyType > 1 or ActivityGroup > 1))
)
))tbl
on t.ID=tbl.ID
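The query above joins on t.ID only. If old and spouse IDs should count too, which is my reading of the question and not something it confirms, one possible sketch is to aggregate ActivityDates once per ID and left join that result three times. It is written for SQLite via Python's sqlite3 module so it can be run as is; the flags CTE and its column names has_target and has_other are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite tables stand in for the two views.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CommunityMembers (ID TEXT, Name TEXT, OldID TEXT, SpouseID TEXT, SpouseName TEXT);
CREATE TABLE ActivityDates (ID TEXT, ActivityDate TEXT, ActiviyType INTEGER, ActivityGroup INTEGER);
INSERT INTO CommunityMembers VALUES
  ('1', 'John.Smith', 'o71', 's99', 'Jenna.Smith'),
  ('2', 'Jane.Doe', 'o72', NULL, NULL),
  ('3', 'Jessie.Jones', NULL, NULL, NULL);
INSERT INTO ActivityDates VALUES
  ('1', '2017-12-31', 1, 1), ('1', '2017-12-31', 3, 2), ('1', '2017-12-31', 7, 1),
  ('2', '2017-12-31', 1, 1), ('3', '2017-12-31', 1, 1),
  ('o72', '2010-12-31', 1, 2), ('o72', '2010-12-31', 3, 1),
  ('s99', '2017-12-31', 1, 1), ('s99', '2017-12-31', 2, 1);
""")

SQL = """
WITH flags AS (
    -- one row per activity ID: did it attend type 1 / group 1, and anything else?
    SELECT ID,
           MAX(CASE WHEN ActiviyType = 1 AND ActivityGroup = 1 THEN 1 ELSE 0 END) AS has_target,
           MAX(CASE WHEN ActiviyType > 1 OR ActivityGroup > 1 THEN 1 ELSE 0 END) AS has_other
    FROM ActivityDates
    WHERE ActivityDate > '2016-12-31'
    GROUP BY ID
)
SELECT c.ID, c.Name,
       CASE WHEN (f1.has_target = 1 AND f1.has_other = 0)
              OR (f2.has_target = 1 AND f2.has_other = 0)
              OR (f3.has_target = 1 AND f3.has_other = 0)
            THEN 'Yes' ELSE '' END AS Result
FROM CommunityMembers c
LEFT JOIN flags f1 ON f1.ID = c.ID        -- the member's own ID
LEFT JOIN flags f2 ON f2.ID = c.OldID     -- the old ID, if any
LEFT JOIN flags f3 ON f3.ID = c.SpouseID  -- the spouse ID, if any
ORDER BY c.ID
"""

joined = conn.execute(SQL).fetchall()
for row in joined:
    print(row)
```

Because flags has at most one row per ID, the three left joins cannot multiply the member rows, and ActivityDates is scanned once rather than once per subquery.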