Faster SQL query using CASE in a JOIN rather than CASE in the SELECT statement of the query?

Date: 2017-04-11 05:17:16

Tags: sql sql-server

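I have a view of CommunityMembers, each of whom has ID as the primary key. Some members also have an old ID from another system, and some have a spouse ID. All IDs are unique.

e.g.: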

ID | Name         | OldID   | SpouseID  | SpouseName
1  | John.Smith   | o71     | s99       | Jenna.Smith
2  | Jane.Doe     | o72     |           | 
3  | Jessie.Jones |         |           | 

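I also have a view of ActivityDates, in which each community member can have multiple activity dates. Old IDs and spouse IDs have activity dates as well. (Unfortunately, I can't clean up the data by converting the old IDs to new IDs.)

e.g.: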

ID  | ActivityDate | ActivityType | ActivityGroup
1   | 2017-12-31   | 1            | 1
1   | 2017-12-31   | 3            | 2
1   | 2017-12-31   | 7            | 1
2   | 2017-12-31   | 1            | 1
3   | 2017-12-31   | 1            | 1
o72 | 2010-12-31   | 1            | 2
o72 | 2010-12-31   | 3            | 1
s99 | 2017-12-31   | 1            | 1
s99 | 2017-12-31   | 2            | 1

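I can select the data the way I need with the query below, where the CASE subselects are repeated three times to check the three possible IDs. It is very slow, though, because it runs several subqueries for every record: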

SELECT 
    C.ID, 
    C.Name,
    C.OldID,
    C.SpouseID,
    C.SpouseName,
    -- Result, i.e. HasTheCommunityMemberOrTheirSpouseOnlyEverAttendedActivityTypeAndGroup1After2016?
    CASE 
       WHEN C.ID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActivityType = 1 AND ActivityGroup = 1)
            AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActivityType > 1 AND ActivityGroup > 1)
            OR C.OldID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActivityType = 1 AND ActivityGroup = 1)
            AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActivityType > 1 AND ActivityGroup > 1)
            OR C.SpouseID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActivityType = 1 AND ActivityGroup = 1)
            AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActivityType > 1 AND ActivityGroup > 1)
          THEN 'Yes' 
          ELSE '' 
       END AS Result
FROM
    CommunityMembers C

So I expect the following results, which I do get, just slowly:

ID | Name         | OldID   | SpouseID  | SpouseName   | Result
1  | John.Smith   | o71     | s99       | Jenna.Smith  | 
2  | Jane.Doe     | o72     |           |              | Yes
3  | Jessie.Jones |         |           |              | Yes

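I appreciate that there are better ways to do this and I'm happy to hear suggestions, but I have limited flexibility to change this system, so all I'm really asking is: how can I make this faster? Ideally I'd like to join to the table and apply the conditions there, but I can't work out how. e.g.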

SELECT 
    C.ID, C.Name,
    C.OldID, C.SpouseID, C.SpouseName,
    R.Result
FROM 
    CommunityMembers C
JOIN 
    CASE WHEN Date ... Type ... Group ... ELSE ... IN ... Not Exist ... THEN ... ActivityDates R

SELECT 
    C.ID, C.Name,
    C.OldID, C.SpouseID, C.SpouseName,
    CASE 
       WHEN R.Date ... R.Type ... R.Group ... ELSE ... THEN 'Yes' END AS Result
FROM 
    CommunityMembers C
JOIN 
    ActivityDates R

I suspect I need multiple joins, but I don't know how to write them.

Thanks

4 Answers:

Answer 0: (score: 0)

An index is created like this:

CREATE INDEX index_name
ON table_name (column1, column2, ...);

For more details, see this link

Answer 1: (score: 0)

Here is another pattern using "optional joins" that may or may not perform better. It doesn't produce exactly the same output as yours - I'm not sure what you're after there.

SELECT A.*, COALESCE(C1.Name, C2.Name, C3.Name) AS Name
FROM ActivityDates A
LEFT OUTER JOIN CommunityMembers AS C1
    ON C1.ID = A.ID
LEFT OUTER JOIN CommunityMembers AS C2
    ON C2.OldID = CAST(A.ID AS VARCHAR(12))
LEFT OUTER JOIN CommunityMembers AS C3
    ON C3.SpouseID = CAST(A.ID AS VARCHAR(12))

In some cases this will "double count", but if you are certain that the whole set of IDs is unique, you should be fine. If you only want to know whether a certain activity record exists, you could use EXISTS to speed this up, but I don't follow your logic.
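If the goal is only to know whether a qualifying activity record exists for any of a member's three IDs, a correlated EXISTS check avoids the duplicate rows a join can produce. The following is a minimal sketch, not the answerer's code. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, the sample rows from the question, and the column spellings ActivityType and ActivityGroup; the function name members_with_activity is made up for illustration.

```python
import sqlite3

def members_with_activity():
    """Flag members for whom ANY of their IDs has a type 1 / group 1
    activity after 2016-12-31 (existence check only)."""
    con = sqlite3.connect(":memory:")
    # Sample data copied from the question; SQLite stands in for SQL Server.
    con.executescript("""
        CREATE TABLE CommunityMembers (
            ID TEXT PRIMARY KEY, Name TEXT, OldID TEXT,
            SpouseID TEXT, SpouseName TEXT);
        CREATE TABLE ActivityDates (
            ID TEXT, ActivityDate TEXT, ActivityType INT, ActivityGroup INT);
        INSERT INTO CommunityMembers VALUES
            ('1', 'John.Smith',   'o71', 's99', 'Jenna.Smith'),
            ('2', 'Jane.Doe',     'o72', NULL,  NULL),
            ('3', 'Jessie.Jones', NULL,  NULL,  NULL);
        INSERT INTO ActivityDates VALUES
            ('1',   '2017-12-31', 1, 1), ('1',   '2017-12-31', 3, 2),
            ('1',   '2017-12-31', 7, 1), ('2',   '2017-12-31', 1, 1),
            ('3',   '2017-12-31', 1, 1), ('o72', '2010-12-31', 1, 2),
            ('o72', '2010-12-31', 3, 1), ('s99', '2017-12-31', 1, 1),
            ('s99', '2017-12-31', 2, 1);
    """)
    rows = con.execute("""
        SELECT C.ID, C.Name,
               CASE WHEN EXISTS (
                        SELECT 1
                        FROM ActivityDates A
                        WHERE A.ID IN (C.ID, C.OldID, C.SpouseID)
                          AND A.ActivityDate > '2016-12-31'
                          AND A.ActivityType = 1
                          AND A.ActivityGroup = 1)
                    THEN 'Yes' ELSE '' END AS Result
        FROM CommunityMembers C
        ORDER BY C.ID
    """).fetchall()
    con.close()
    return rows

for row in members_with_activity():
    print(row)
```

With the question's sample data every member has at least one such record, so all three rows come back as 'Yes'. The "only ever attended" part of the rule would need an additional NOT EXISTS condition.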

Answer 2: (score: 0)

You need the information from the ActivityDates table per ID. So group by ID and filter for the desired IDs in HAVING:

SELECT ID 
FROM ActivityDates
WHERE ActivityDate > '2016-12-31'
GROUP BY ID
-- at least one type 1 / group 1 activity, and none with both type and group above 1
HAVING COUNT(CASE WHEN ActivityType = 1 AND ActivityGroup = 1 THEN 1 END) > 0
   AND COUNT(CASE WHEN ActivityType > 1 AND ActivityGroup > 1 THEN 1 END) = 0
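To see what this grouping returns, here is a small runnable sketch against the question's sample rows. It is only an illustration under assumptions: SQLite (via Python's sqlite3) stands in for SQL Server, the columns are spelled ActivityType and ActivityGroup, the first count is compared with "> 0" (meaning "at least one type 1 / group 1 row"), and the helper name good_ids is hypothetical.

```python
import sqlite3

def good_ids():
    """Return the IDs that pass the GROUP BY / HAVING filter."""
    con = sqlite3.connect(":memory:")
    # Activity rows copied from the question.
    con.executescript("""
        CREATE TABLE ActivityDates (
            ID TEXT, ActivityDate TEXT, ActivityType INT, ActivityGroup INT);
        INSERT INTO ActivityDates VALUES
            ('1',   '2017-12-31', 1, 1), ('1',   '2017-12-31', 3, 2),
            ('1',   '2017-12-31', 7, 1), ('2',   '2017-12-31', 1, 1),
            ('3',   '2017-12-31', 1, 1), ('o72', '2010-12-31', 1, 2),
            ('o72', '2010-12-31', 3, 1), ('s99', '2017-12-31', 1, 1),
            ('s99', '2017-12-31', 2, 1);
    """)
    rows = con.execute("""
        SELECT ID
        FROM ActivityDates
        WHERE ActivityDate > '2016-12-31'
        GROUP BY ID
        -- COUNT ignores NULLs, so each CASE counts only the matching rows
        HAVING COUNT(CASE WHEN ActivityType = 1 AND ActivityGroup = 1 THEN 1 END) > 0
           AND COUNT(CASE WHEN ActivityType > 1 AND ActivityGroup > 1 THEN 1 END) = 0
        ORDER BY ID
    """).fetchall()
    con.close()
    return [r[0] for r in rows]

print(good_ids())  # ['2', '3', 's99']
```

ID 1 drops out because of its type 3 / group 2 row, and o72 drops out because its rows are from 2010. Note that s99 passes: its type 2 row is in group 1, and the second condition requires both type and group to be greater than 1.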

You can use this with an EXISTS clause:

select
  c.*, 
  case when exists 
  (
    -- any of the member's three IDs qualifies
    SELECT a.ID 
    FROM ActivityDates a
    WHERE a.ActivityDate > '2016-12-31'
      AND a.ID in (c.id, c.oldid, c.spouseid)
    GROUP BY a.ID
    HAVING COUNT(CASE WHEN a.ActivityType = 1 AND a.ActivityGroup = 1 THEN 1 END) > 0
       AND COUNT(CASE WHEN a.ActivityType > 1 AND a.ActivityGroup > 1 THEN 1 END) = 0
  ) then 'Yes' else '' end as result
from CommunityMembers c;

Appropriate indexes to speed this up might be

create index idx1 on ActivityDates (ID, ActivityDate, ActivityType, ActivityGroup);

create index idx2 on ActivityDates (ActivityDate, ID, ActivityType, ActivityGroup);

Check which of the two actually gets used and drop the other (or keep both if both are used).
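One way to check which index a query uses is to look at the query plan. In SQL Server that means inspecting the execution plan; the sketch below shows the same idea with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, purely as a stand-in. The lookup query and the function name plan_details are assumptions made for illustration, and the planner's choice on a real SQL Server instance with real data volumes may well differ.

```python
import sqlite3

def plan_details():
    """Return the textual plan steps SQLite reports for a typical lookup."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE ActivityDates (
            ID TEXT, ActivityDate TEXT, ActivityType INT, ActivityGroup INT);
        CREATE INDEX idx1 ON ActivityDates (ID, ActivityDate, ActivityType, ActivityGroup);
        CREATE INDEX idx2 ON ActivityDates (ActivityDate, ID, ActivityType, ActivityGroup);
    """)
    plan = con.execute("""
        EXPLAIN QUERY PLAN
        SELECT ID
        FROM ActivityDates
        WHERE ID = ? AND ActivityDate > '2016-12-31'
    """, ("1",)).fetchall()
    con.close()
    # The last column of each plan row is the human-readable description.
    return [row[-1] for row in plan]

for step in plan_details():
    print(step)
```

The printed plan step names the index the planner picked; an index that never shows up for the queries you actually run is a candidate for dropping.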

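A non-correlated subquery may perform better, even though it means the subquery result has to be accessed several times. Whether this even results in a different execution plan depends on the optimizer: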

with good_ids as
(
  -- IDs with at least one type 1 / group 1 activity and none with both type and group above 1
  select id 
  from activitydates
  where activitydate > '2016-12-31'
  group by id
  having count(case when activitytype = 1 and activitygroup = 1 then 1 end) > 0
     and count(case when activitytype > 1 and activitygroup > 1 then 1 end) = 0
)
select
  c.*,
  case when c.id       in (select id from good_ids)
         or c.oldid    in (select id from good_ids)
         or c.spouseid in (select id from good_ids)
       then 'Yes' else ''
  end as result
from CommunityMembers c;

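Answer 3: (score: 0)

You should try to explain the output. It is hard to work out the correct business rules from an incorrect query.

That way you will get the best query from people here. Just try once more to explain why IDs 2 and 3 are 'Yes'. Then I will rewrite my query.

The second big mistake you are about to make is creating indexes without understanding your business rules and without writing a correct query first.

Try this,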

-- Sample data from the question
declare @t table(ID varchar(20), Name varchar(40), OldID varchar(20), SpouseID varchar(20)
, SpouseName varchar(40))
insert into @t VALUES
('1','John.Smith','o71','s99','Jenna.Smith')
,('2','Jane.Doe','o72',null,null)
,('3','Jessie.Jones',null,null,null)

--select * from @t
declare @ActivityDates table(ID varchar(20), ActivityDate date
, ActivityType int, ActivityGroup int)
insert into @ActivityDates VALUES
('1','2017-12-31',1, 1)
,('1','2017-12-31',3, 2)
,('1','2017-12-31',7, 1)
,('2','2017-12-31',1, 1)
,('3','2017-12-31',1, 1)
,('o72','2010-12-31',1, 2)
,('o72','2010-12-31',3, 1)
,('s99','2017-12-31',1, 1)
,('s99','2017-12-31',2, 1)

-- 'Yes' when the member's ID has a type 1 / group 1 activity after 2016-12-31
-- and no activity of any other type or group for that ID in the same period
SELECT t.*
,case when tbl.id is not null then 'Yes' else null end Remarks
from @t t
left JOIN
(select * from @ActivityDates AD
 WHERE AD.ActivityDate > '2016-12-31' AND AD.ActivityType = 1 AND AD.ActivityGroup = 1
 AND NOT EXISTS (SELECT ID FROM @ActivityDates ad1 WHERE ad.id=ad1.id AND
  ad1.ActivityDate > '2016-12-31' AND (ad1.ActivityType > 1 or ad1.ActivityGroup > 1))
)tbl
on t.ID=tbl.ID
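The join above matches on the member's own ID only. If the rule is meant to cover OldID and SpouseID as well, which is a guess at the business rule rather than something confirmed in the thread, the same idea could be sketched with a pair of correlated EXISTS / NOT EXISTS checks. This is a hedged illustration: Python's sqlite3 stands in for SQL Server, the columns are spelled ActivityType and ActivityGroup, and the function name only_type1_group1 is hypothetical.

```python
import sqlite3

def only_type1_group1():
    """'Yes' when, across the member's ID, OldID and SpouseID, there is a
    type 1 / group 1 activity after 2016-12-31 and no activity of any
    other type or group in that period."""
    con = sqlite3.connect(":memory:")
    # Sample data copied from the question.
    con.executescript("""
        CREATE TABLE CommunityMembers (
            ID TEXT PRIMARY KEY, Name TEXT, OldID TEXT,
            SpouseID TEXT, SpouseName TEXT);
        CREATE TABLE ActivityDates (
            ID TEXT, ActivityDate TEXT, ActivityType INT, ActivityGroup INT);
        INSERT INTO CommunityMembers VALUES
            ('1', 'John.Smith',   'o71', 's99', 'Jenna.Smith'),
            ('2', 'Jane.Doe',     'o72', NULL,  NULL),
            ('3', 'Jessie.Jones', NULL,  NULL,  NULL);
        INSERT INTO ActivityDates VALUES
            ('1',   '2017-12-31', 1, 1), ('1',   '2017-12-31', 3, 2),
            ('1',   '2017-12-31', 7, 1), ('2',   '2017-12-31', 1, 1),
            ('3',   '2017-12-31', 1, 1), ('o72', '2010-12-31', 1, 2),
            ('o72', '2010-12-31', 3, 1), ('s99', '2017-12-31', 1, 1),
            ('s99', '2017-12-31', 2, 1);
    """)
    rows = con.execute("""
        SELECT C.ID, C.Name,
               CASE WHEN EXISTS (          -- some type 1 / group 1 activity
                        SELECT 1 FROM ActivityDates A
                        WHERE A.ID IN (C.ID, C.OldID, C.SpouseID)
                          AND A.ActivityDate > '2016-12-31'
                          AND A.ActivityType = 1 AND A.ActivityGroup = 1)
                     AND NOT EXISTS (      -- and nothing of another type or group
                        SELECT 1 FROM ActivityDates B
                        WHERE B.ID IN (C.ID, C.OldID, C.SpouseID)
                          AND B.ActivityDate > '2016-12-31'
                          AND (B.ActivityType > 1 OR B.ActivityGroup > 1))
                    THEN 'Yes' ELSE '' END AS Result
        FROM CommunityMembers C
        ORDER BY C.ID
    """).fetchall()
    con.close()
    return rows

for row in only_type1_group1():
    print(row)
```

On the sample data this gives an empty Result for member 1 and 'Yes' for members 2 and 3, which matches the output the question asks for. Each member row triggers at most two index-friendly lookups instead of six subselects, though whether that is faster on the real views would have to be measured.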