I have the following tables and data:
T_USER
ID  | COUNTRY_NAME
---------------------------
101 | FRANCE
102 | GERMANY
103 | ITALY
104 | FRANCE
105 | ITALY
106 | FRANCE
107 | GERMANY
108 | ITALY
109 | FRANCE
110 | ITALY
T_LOG_ACCESS
ID  | APPLICATION | ACCESS_DATE
-------------------------------------------
101 | Portal-M    | 10/6/2017
102 | Portal-H    | 10/6/2017
103 | Portal-E    | 10/6/2017
104 | Portal-E    | 10/6/2017
101 | Portal-M    | 10/6/2017
102 | Portal-E    | 10/6/2017
103 | Portal-E    | 10/6/2017
104 | Portal-E    | 10/6/2017
105 | Portal-M    | 10/6/2017
106 | Portal-E    | 10/6/2017
107 | Portal-E    | 10/6/2017
108 | Portal-E    | 10/6/2017
104 | Portal-E    | 10/6/2017
105 | Portal-E    | 10/6/2017
106 | Portal-E    | 10/6/2017
101 | Portal-M    | 11/6/2017
102 | Portal-H    | 11/6/2017
102 | Portal-E    | 11/6/2017
104 | Portal-E    | 11/6/2017
105 | Portal-M    | 11/6/2017
105 | Portal-E    | 11/6/2017
107 | Portal-E    | 11/6/2017
107 | Portal-E    | 11/6/2017
108 | Portal-E    | 11/6/2017
T_ROLES
USER | ROLE
--------------------
101  | M_ACT
101  | E_ACT
102  | H_ACT
102  | E_ACT
103  | E_ACT
104  | E_ACT
105  | M_ACT
105  | E_ACT
106  | E_ACT
107  | E_ACT
108  | E_ACT
109  | E_ACT
110  | M_ACT
110  | E_ACT
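I am trying to count, grouped by country, the users who accessed the portals in both of two months, that is, users who accessed in October and then accessed again in November.
I am trying the query below, but because of the large volume of real data it takes up to 15 minutes to run: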
select
    COUNTRY_NAME,
    count(DISTINCT CASE WHEN SUB1.APPLICATION='Portal-M' and SUB2.role='M_ACT' THEN SUB1.id END) Manager_Count,
    count(DISTINCT CASE WHEN SUB1.APPLICATION='Portal-H' and SUB2.role='H_ACT' THEN SUB1.id END) HR_Count,
    count(DISTINCT CASE WHEN SUB1.APPLICATION='Portal-H' and SUB2.role='E_ACT' THEN SUB1.id END) Employee_COUNT
from
    T_USER MAIN
    INNER JOIN T_LOG_ACCESS SUB1
        ON MAIN.id=SUB1.id
        AND TO_DATE(to_char(SUB1.access_date,'DD-MON-YYYY'),'DD-MON-YYYY') between
            --Report 1st Time Period:
            TO_DATE('20171101','YYYYMMDD') and TO_DATE('20171130','YYYYMMDD')
    INNER JOIN T_ROLES SUB2
        ON MAIN.id=SUB2.user
        AND SUB2.user in
            (SELECT DISTINCT SUB7.id
             from T_LOG_ACCESS SUB7,
                  T_ROLES SUB8
             where SUB7.APPLICATION=SUB1.APPLICATION
             AND SUB8.role=SUB2.role
             AND TO_DATE(to_char(SUB7.access_date,'DD-MON-YYYY'),'DD-MON-YYYY') between
                 --Report 2nd Time Period:
                 TO_DATE('20171001','YYYYMMDD') and TO_DATE('20171031','YYYYMMDD') )
group by COUNTRY_NAME;
Is there any way to make this query faster? Please help.
Answer 0 (score: 0)
Offering query tuning advice without seeing the explain plan or knowing the data volumes and skew is a mug's game. But here goes.
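There are a couple of obvious problems with your code. There is no join condition between T_LOG_ACCESS SUB7 and T_ROLES SUB8, so the subquery produces a Cartesian product which you then reduce with DISTINCT. That is a lot of crunching wasted right there. The TO_DATE(to_char(...)) wrapping around access_date also looks unnecessary if that column is already a DATE, and applying a function to the column can stop the optimizer from using a plain index on it. Given some assumptions about your data volumes and skew, this might be quicker: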
with SUB1 as (
    -- November accesses; the half-open range also catches rows that carry a time of day
    select id
         , application
    from T_LOG_ACCESS
    where access_date >= date '2017-11-01'
    and access_date < date '2017-12-01'
)
, SUB7 as (
    -- October accesses (the second report period in the question)
    select id
         , application
    from T_LOG_ACCESS
    where access_date >= date '2017-10-01'
    and access_date < date '2017-11-01'
)
select
    COUNTRY_NAME,
    count(DISTINCT CASE WHEN SUB1.APPLICATION='Portal-M' and SUB2.role='M_ACT' THEN SUB1.id END) Manager_Count,
    count(DISTINCT CASE WHEN SUB1.APPLICATION='Portal-H' and SUB2.role='H_ACT' THEN SUB1.id END) HR_Count,
    -- 'Portal-H' is kept from the question; 'Portal-E' may be what was intended here
    count(DISTINCT CASE WHEN SUB1.APPLICATION='Portal-H' and SUB2.role='E_ACT' THEN SUB1.id END) Employee_COUNT
from
    T_USER MAIN
    INNER JOIN SUB1   -- the November subquery above, not the whole log table
        ON MAIN.id=SUB1.id
    INNER JOIN T_ROLES SUB2
        ON MAIN.id=SUB2.user
where SUB2.user in
    (SELECT SUB7.id
     from SUB7
     where SUB7.APPLICATION=SUB1.APPLICATION )
group by COUNTRY_NAME;
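To make the Cartesian product point concrete, here is a small sketch against the sample tables in the question. It assumes the tables hold exactly the rows shown there (24 log rows and 14 role rows), and the join condition in the second statement is my guess at the intended link between the two tables, not something the question states.

```sql
-- No join condition: each of the 24 log rows is paired with each of
-- the 14 role rows, so the sample data yields 24 * 14 = 336 rows.
select count(*)
from   T_LOG_ACCESS SUB7,
       T_ROLES SUB8;

-- With a join condition only the matching pairs remain (35 rows on
-- the sample data). Linking on the user id is an assumption about intent.
-- Column names follow the question; in Oracle a column literally named
-- USER would have to be written as "USER" because it is a reserved word.
select count(*)
from   T_LOG_ACCESS SUB7
       inner join T_ROLES SUB8
       on SUB8.user = SUB7.id;
```

On real data the gap is far larger, because the product grows with the size of both tables and DISTINCT only removes the duplicates after they have been generated.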
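Note: I have kept your table aliases to make this more transparent for you, even though I agree with @GoranStefanović that they are pretty awful and make your query harder to understand.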
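Since all of this is guesswork without the plan, it is worth comparing the explain plans of the original and the rewritten statement. A minimal sketch, assuming Oracle and the default PLAN_TABLE; the statement explained here is only a date-filtered select similar to the subqueries above, so substitute your full query:

```sql
-- Ask the optimizer for its plan without actually running the statement.
explain plan for
select id
     , application
from   T_LOG_ACCESS
where  access_date >= date '2017-11-01'
and    access_date <  date '2017-12-01';

-- Display the plan that was just written to PLAN_TABLE.
select *
from   table(dbms_xplan.display);
```

If the plan shows a full scan of T_LOG_ACCESS and the table is large, an index on access_date might be worth testing, though whether it helps depends on how selective a one-month range is in your data.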