Tuning a complex SQL query

Time: 2019-05-29 10:40:43

Tags: sql performance vertica

Hello, I have a query that takes a long time to execute and consumes a lot of resources. It looks similar to this:

    WITH first_match AS (
        SELECT t1.name, t1.lastname, t2.location
        FROM table1 t1
        INNER JOIN table2 t2
            ON t1.name = t2.name
    ),
    second_match AS (
        SELECT t1.name, t1.lastname, t2.location
        FROM table1 t1
        INNER JOIN table2 t2
            ON t1.name = SUBSTR(t2.name, 0, INSTR(t2.name, '-'))
        WHERE REGEXP_LIKE(t2.name, '[-]')
          AND (t1.name, t1.lastname) NOT IN (SELECT name, lastname FROM first_match)
    ),
    third_match AS (
        SELECT t1.name, t1.lastname, t2.location
        FROM table1 t1
        INNER JOIN table2 t2
            ON t1.name = REGEXP_REPLACE(t2.name, 'uselesssuffix', '')
        WHERE REGEXP_LIKE(t1.name, 'uselesssuffix')
          AND (t1.name, t1.lastname) NOT IN (
                SELECT name, lastname FROM first_match
                UNION
                SELECT name, lastname FROM second_match
          )
    ),
    fourth_match AS (
        SELECT t1.name, t1.lastname, t2.location
        FROM table1 t1
        INNER JOIN table2 t2
            ON t1.name = SUBSTR(t2.name, 0, 7)
        WHERE LENGTH(t2.name) > 6
          AND (t1.name, t1.lastname) NOT IN (
                SELECT name, lastname FROM first_match
                UNION
                SELECT name, lastname FROM second_match
                UNION
                SELECT name, lastname FROM third_match
          )
    ),
    final_result AS (
        SELECT * FROM first_match
        UNION
        SELECT * FROM second_match
        UNION
        SELECT * FROM third_match
        UNION
        SELECT * FROM fourth_match
    )
    SELECT * FROM final_result;

I have two tables and try to join them on four possible conditions. The first CTE holds the rows that match the first condition. The second CTE holds the entries that were not caught by the first condition but satisfy the second. The third holds the remaining entries that satisfy the third condition, and so on. Finally, I use UNION to produce the end result.

The number of CTEs is fairly large (5 in this case), there are many UNIONs, and the join conditions make this a heavy query that uses a lot of resources and takes a long time to run. Any ideas on a better way to implement the same scenario with a different approach, perhaps in a single CTE?

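2 answers:

Answer 0 (score: 1)

Am I missing something here?

In a complex join with several Boolean expressions ORed together, evaluation stops at the first expression that evaluates to TRUE.

So why not this?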

    SELECT   t1.name
           , t1.lastname
           , t2.location
    FROM table1 t1
    INNER JOIN
    table2 t2 
    ON t1.name = t2.name
    OR  (    -- REGEXP_LIKE(t2.name,'[-]')
             -- AND t1.name = SUBSTR(t2.name,1,INSTR(t2.name,'-')-1)
         t1.name=SPLIT_PART(t2.name,'-',1)
    )
    OR  (    -- REGEXP_LIKE(t1.name, 'uselesssuffix') 
             -- AND t1.name = REGEXP_REPLACE(t2.name, 'uselesssuffix', '')
         t1.name=SPLIT_PART(t2.name,'uselesssuffix',1)
    )
    OR  (    LENGTH(t2.name) > 6
         AND t1.name = SUBSTR(t2.name, 1, 7)
    )

... and while I was at it, I tried to reduce the number of function calls needed ...

Also, in Vertica, SUBSTR() starts at position 1 of the string, not position 0. :-]
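The position arithmetic behind that remark can be checked with a few literals. This is a minimal sketch using made-up strings, assuming the documented 1-based behavior of Vertica's SUBSTR, INSTR and SPLIT_PART; it was not run against the asker's data.

```sql
-- Illustrative literals only; none of these values come from the question.
SELECT SUBSTR('abcdefghij', 1, 7)                      AS first_seven,    -- 'abcdefg'
       INSTR('abc-def', '-')                           AS hyphen_pos,     -- 4 (positions count from 1)
       SUBSTR('abc-def', 1, INSTR('abc-def', '-') - 1) AS before_hyphen,  -- 'abc'
       SUBSTR('abc-def', 1, INSTR('abc-def', '-'))     AS with_hyphen,    -- 'abc-' (hyphen included)
       SPLIT_PART('abc-def', '-', 1)                   AS split_prefix;   -- 'abc'
```

Starting at 1 and subtracting 1 from the INSTR result gives the same prefix as SPLIT_PART(..., '-', 1), which is why the join above can replace the two nested calls with a single one.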

Answer 1 (score: 0)

You have several kinds of match that you want to prioritize. One thing to try is a series of LEFT JOINs:

    SELECT t1.name, t1.lastname,
           COALESCE(t2_1.location, t2_2.location, . . . )
    FROM table1 t1 LEFT JOIN
         table2 t2_1 
         ON t1.name = t2_1.name LEFT JOIN
         table2 t2_2
         -- "- 1" drops the hyphen itself from the extracted prefix
         ON t1.name = SUBSTR(t2_2.name, 1, INSTR(t2_2.name, '-') - 1) AND
            t2_2.name LIKE '%-%' AND
            t2_1.name IS NULL LEFT JOIN
         table2 t2_3
         ON t1.name = REGEXP_REPLACE(t2_3.name, 'uselesssuffix', '') AND
            t1.name LIKE '%uselesssuffix%' AND
            t2_1.name IS NULL AND
            t2_2.name IS NULL LEFT JOIN
         . . .
    -- keep only the rows for which at least one of the joins found a match
    WHERE t2_1.name IS NOT NULL OR
          t2_2.name IS NOT NULL OR
          . . . 

Note that each new condition checks that none of the previous conditions matched.
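To make the prioritization concrete, here is a small self-contained sketch of the same LEFT JOIN chain on invented sample rows. The data, the column sizes and the alias t2_4 are assumptions made for illustration, and the 'uselesssuffix' rule is left out to keep it short. It is written in Vertica-flavoured SQL but has not been run against a Vertica instance.

```sql
-- Hypothetical sample data; table and column names follow the question.
CREATE TABLE table1 (name VARCHAR(50), lastname VARCHAR(50));
CREATE TABLE table2 (name VARCHAR(50), location VARCHAR(50));

INSERT INTO table1 VALUES ('anna',    'a');  -- has an exact match
INSERT INTO table1 VALUES ('bob',     'b');  -- matches only on the prefix before '-'
INSERT INTO table1 VALUES ('charlie', 'c');  -- matches only on the first 7 characters
INSERT INTO table1 VALUES ('dave',    'd');  -- matches nothing

INSERT INTO table2 VALUES ('anna',       'Berlin');
INSERT INTO table2 VALUES ('anna-old',   'Madrid');  -- lower-priority candidate for anna
INSERT INTO table2 VALUES ('bob-x',      'Paris');
INSERT INTO table2 VALUES ('charlie123', 'Rome');
COMMIT;

SELECT t1.name,
       t1.lastname,
       COALESCE(t2_1.location, t2_2.location, t2_4.location) AS location
FROM table1 t1
-- Rule 1: exact name match.
LEFT JOIN table2 t2_1
       ON t1.name = t2_1.name
-- Rule 2: prefix before the hyphen, tried only when rule 1 found nothing.
LEFT JOIN table2 t2_2
       ON t2_1.name IS NULL
      AND t2_2.name LIKE '%-%'
      AND t1.name = SUBSTR(t2_2.name, 1, INSTR(t2_2.name, '-') - 1)
-- Rule 4 of the question: first 7 characters, tried only when rules 1 and 2 found nothing.
LEFT JOIN table2 t2_4
       ON t2_1.name IS NULL
      AND t2_2.name IS NULL
      AND LENGTH(t2_4.name) > 6
      AND t1.name = SUBSTR(t2_4.name, 1, 7)
-- Drop table1 rows that no rule matched.
WHERE t2_1.name IS NOT NULL
   OR t2_2.name IS NOT NULL
   OR t2_4.name IS NOT NULL
ORDER BY t1.name;
```

With these rows, anna should come back with Berlin only, because the 'anna-old' candidate is blocked by the t2_1.name IS NULL check. bob gets Paris, charlie gets Rome, and dave is filtered out by the WHERE clause.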