MySQL query to find 3 identical transactions in a very large set of transactions

Time: 2019-04-25 19:10:38

Tags: mysql

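We have a transactions table that holds an amount, a user ID, a partner ID, and a description. When a user has 3 specific transactions, they receive a one-time bonus. The query below worked on smaller sets, but now that the table has more than 20,000,000 rows it just isn't cutting it. There has to be a better way to approach this.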

SELECT t.str_spid, count(*) AS C
  FROM transactions t
 WHERE t.str_prid = 150
   AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
   AND t.str_amount = 1000
   AND t.str_spid NOT IN (
        SELECT x.str_spid 
          FROM transactions x
         WHERE x.str_prid = 150
           AND x.str_amount = 2500
      GROUP BY x.str_spid
      )
GROUP BY t.STR_SPID 
HAVING C = 3

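My code limits the result to 50 rows at a time, but even that chokes. There are about 3 million people in this group and about 12 million transactions in total. The subquery alone returns just under 1,000,000 rows.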

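One idea is to flag users who have already received the bonus so they can be skipped later. That would eliminate the need for the subquery, but I would then have to join to the users table (t.str_serid = u.us_userid) to check whether the flag is set.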
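A rough sketch of what that flag approach might look like. The table name `users` and the column `us_bonus_paid` are assumptions made for illustration; only `us_userid` and the join condition come from the description above.

```sql
-- Hypothetical flag column; `users` and `us_bonus_paid` are assumed names.
ALTER TABLE users
  ADD COLUMN us_bonus_paid TINYINT NOT NULL DEFAULT 0;

-- Candidates: exactly 3 qualifying transactions and not yet flagged.
-- The join condition is copied as written in the question.
SELECT t.str_spid, COUNT(*) AS C
  FROM transactions t
  JOIN users u
    ON t.str_serid = u.us_userid
 WHERE t.str_prid = 150
   AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
   AND t.str_amount = 1000
   AND u.us_bonus_paid = 0
GROUP BY t.str_spid
HAVING C = 3;
```

The flag would also have to be set to 1 at the moment the bonus is paid, otherwise the same users would be picked up again on the next run.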

Any tips, pointers, ideas, etc. would be greatly appreciated.

2 answers:

Answer 0: (score: 1)

If checking your indexes doesn't solve it, this is one of those cases where I would suggest trying a correlated subquery. Change this:

   AND t.str_spid NOT IN (
        SELECT x.str_spid 
          FROM transactions x
         WHERE x.str_prid = 150
           AND x.str_amount = 2500
      GROUP BY x.str_spid
      )

to this:

   AND NOT EXISTS (
        SELECT x.str_spid 
        FROM transactions x
        WHERE x.str_spid = t.str_spid
           AND x.str_prid = 150
           AND x.str_amount = 2500
      )
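For reference, here is the fragment above dropped into the rest of the query from the question, as a sketch. The inner select list is written as `SELECT 1` because `EXISTS` only tests whether a row is found; that substitution is a stylistic choice, not part of the original fragment.

```sql
SELECT t.str_spid, COUNT(*) AS C
  FROM transactions t
 WHERE t.str_prid = 150
   AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
   AND t.str_amount = 1000
   -- Correlated anti-join: keep the row only if this str_spid
   -- has no 2500 transaction for partner 150.
   AND NOT EXISTS (
        SELECT 1
          FROM transactions x
         WHERE x.str_spid = t.str_spid
           AND x.str_prid = 150
           AND x.str_amount = 2500
       )
GROUP BY t.str_spid
HAVING C = 3;
```

Note that `NOT IN` and `NOT EXISTS` only return the same rows when `str_spid` is never NULL: a single NULL coming out of a `NOT IN` subquery makes the whole predicate reject every row.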

Also, although I doubt it will make a difference, DISTINCT is more appropriate than a GROUP BY without aggregate functions for your original subquery.
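As a small illustration, the original subquery written with DISTINCT instead of GROUP BY would read roughly like this:

```sql
-- Same set of str_spid values as the GROUP BY version, stated more directly.
SELECT DISTINCT x.str_spid
  FROM transactions x
 WHERE x.str_prid = 150
   AND x.str_amount = 2500;
```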


Another option is a LEFT JOIN, which is actually the one I would more likely try first:

SELECT t.str_spid, count(*) AS C
FROM transactions AS t
LEFT JOIN transactions AS x
   ON t.str_spid = x.str_spid
   AND x.str_prid = 150
   AND x.str_amount = 2500
WHERE t.str_prid = 150
   AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
   AND t.str_amount = 1000
   AND x.str_spid IS NULL
GROUP BY t.STR_SPID 
HAVING C = 3

Answer 1: (score: 0)

Make sure the transactions table has proper indexes:

a composite index on columns (str_prid, str_prid)

and a second

composite index on columns (str_desc, str_amount, str_spid)
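Expressed as DDL, this might look like the sketch below. The index names are invented for illustration. The first column list above names str_prid twice, which MySQL rejects as a duplicate column name, so the sketch assumes (str_prid, str_amount, str_spid) was meant; that is a guess based on the columns the exclusion lookup touches, not something the answer states. If str_desc is a TEXT column, a prefix length would also be needed on it.

```sql
-- Assumed reading of the first index (the original list repeats str_prid).
CREATE INDEX idx_tx_prid_amount_spid
    ON transactions (str_prid, str_amount, str_spid);

-- Second index, columns as listed in the answer.
CREATE INDEX idx_tx_desc_amount_spid
    ON transactions (str_desc, str_amount, str_spid);

-- Check whether the optimizer actually uses them for the bonus query.
EXPLAIN
SELECT t.str_spid, COUNT(*) AS C
  FROM transactions t
 WHERE t.str_prid = 150
   AND (t.str_desc = 'Annual Rewards' OR t.str_desc = 'Annual Rewards (PRO)')
   AND t.str_amount = 1000
   AND NOT EXISTS (
        SELECT 1
          FROM transactions x
         WHERE x.str_spid = t.str_spid
           AND x.str_prid = 150
           AND x.str_amount = 2500
       )
GROUP BY t.str_spid
HAVING C = 3;
```

In the EXPLAIN output, the `key` column shows which index, if any, was chosen for each table reference, and `rows` gives the estimated number of rows examined.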