Question

I have the following code, which executes in no more than 3 seconds.

`CREATE TEMPORARY TABLE tmp
 SELECT
    MAX(date) as mdate
 FROM  table1
 WHERE
    date between "2017-03-13"
    and "2018-03-13"
    and client_id = "something"
    and field_id IN ("123","1234","12345")
GROUP BY DATE_FORMAT(date,'%x_%v');
SELECT 
    SUM(value),
    DATE_FORMAT(date,'%x_%v') as date
FROM 
  table1, tmp t
WHERE 
    date = t.mdate
and client_id = "something"
and field_id IN ("123","1234","12345")
GROUP BY date;
DROP TABLE tmp;`

But when I try to do it in a single query, it takes about 1 min 4.36 sec to execute.

SELECT 
  SUM(value),
  mdates.grouping_date
FROM 
  (
     SELECT
       MAX(date) as mdate,
       DATE_FORMAT(date,'%x_%v') as grouping_date
     FROM  table1
     WHERE
        date between "2017-03-13"
        and "2018-03-13"
        and client_id = "something"
        and field_id IN ("123","1234","12345")
     GROUP BY grouping_date
) mdates, table1 a
WHERE 
   a.date = mdates.mdate
   and a.client_id = "something"
   and a.field_id IN ("123","1234","12345")
GROUP BY mdates.grouping_date;

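What should I do to make it run as fast as the first block?

I thought a composite index might help, but I have already tried this one and it made no difference.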

create index my_idx on table1(date,field_id,client_id);

Update

What solved my problem was creating a couple of indexes.

create index index1 on table1(client_id,field_id,date);
create index index2 on table1(date,value);

Now it runs as fast as the first version, which used the temporary table.

But I had to change the query a little.

SELECT
   SUM(value),
   DATE_FORMAT(date,'%x_%v') as date
FROM 
   table1 a FORCE INDEX(index2)
WHERE
   a.date in (
     SELECT
      MAX(date)
     FROM
      table1 FORCE INDEX(index1)
     WHERE
      client_id = "something"
      and repo_id IN ("123","1234","12345")
      and date >= "2018-02-11" 
      and date < "2018-03-13"
     GROUP BY DATE_FORMAT(date,'%v_%x')
) 
GROUP BY date;

Answer 1

For your query, I would create a composite index:

create index my_idx on table1(client_id, field_id, date);

The columns used in equality conditions should come first in the index, followed by the columns used in the other conditions.
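
To check whether the optimizer actually picks up such an index, one option is to run EXPLAIN on the inner aggregation. This is only a sketch: it assumes the index above has already been created, and it reuses the table, column names, and literal values from the question.

```sql
-- Sketch only: table1, its columns, and the literal values come from the question.
-- Assumes the composite index on (client_id, field_id, date) already exists.
EXPLAIN
SELECT MAX(date) AS mdate
FROM table1
WHERE client_id = 'something'                       -- equality condition
  AND field_id IN ('123', '1234', '12345')          -- small set of equalities
  AND date BETWEEN '2017-03-13' AND '2018-03-13'    -- range condition, last in the index
GROUP BY DATE_FORMAT(date, '%x_%v');
```

In the EXPLAIN output, the `key` column names the index MySQL chose and `rows` is its estimate of how many rows it will examine. If `key` is NULL or shows a different index, the composite index is not being used for this query.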

Answer 2

You can use a single query:

 SELECT
        DATE_FORMAT(MAX(date),'%x_%v') as date 
        , SUM(value)
     FROM  table1
     WHERE
        date between "2017-03-13"  and "2018-03-13"
        and client_id = "something"
        and field_id IN ("123","1234","12345")
    GROUP BY DATE_FORMAT(date,'%x_%v');

In any case, you should use a composite index on date, client_id, field_id.
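
Note that the single query sums every row in each week, whereas the queries in the question sum only the rows on each week's latest date, so the totals can differ. A rough way to see the difference is to put both sums side by side. The sketch below reuses the table and filter values from the question; the aliases `w`, `week_total`, and `last_day_total` are made up for illustration.

```sql
-- Sketch only: compares the whole-week sum with the sum on each week's latest date.
SELECT
    w.grouping_date,
    w.week_total,                       -- all rows in the week (single-query form)
    SUM(a.value) AS last_day_total      -- only rows on the week's latest date
FROM (
    SELECT DATE_FORMAT(date, '%x_%v') AS grouping_date,
           MAX(date)                  AS mdate,
           SUM(value)                 AS week_total
    FROM table1
    WHERE date BETWEEN '2017-03-13' AND '2018-03-13'
      AND client_id = 'something'
      AND field_id IN ('123', '1234', '12345')
    GROUP BY DATE_FORMAT(date, '%x_%v')
) w
JOIN table1 a ON a.date = w.mdate
WHERE a.client_id = 'something'
  AND a.field_id IN ('123', '1234', '12345')
GROUP BY w.grouping_date, w.week_total;
```

If the two totals match for every week, the single query is a safe replacement for your data; if not, the join on the latest date is still needed.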

Answer 3

Try breaking it into two queries.
Also get rid of the wildcard (%) to improve performance:

SELECT
           MAX(date) as mdate,
           DATE_FORMAT(a.date,'%x_%v') as grouping_date
    into #mdates
         FROM  table1
         WHERE
            date between "2017-03-13"
            and "2018-03-13"
            and client_id = "something"
            and field_id IN ("123","1234","12345")
         GROUP BY grouping_date



    SELECT 
      SUM(value),
      mdates.grouping_date
    FROM 
       #mdates mdates, table1 a
    WHERE 
       a.date = mdates.mdate
       and a.client_id = "something"
       and a.field_id IN ("123","1234","12345")
    GROUP BY mdates.grouping_date;

Answer 4

Have you tried a CTE?

-- CTE: latest date per week for the filtered rows
WITH DateM AS 
  (
     SELECT
       client_id,
       MAX(date) as mdate,
       DATE_FORMAT(date,'%x_%v') as grouping_date
     FROM  table1
     WHERE
        date between "2017-03-13"
        and "2018-03-13"
        and client_id = "something"
        and field_id IN ("123","1234","12345")
     GROUP BY client_id, grouping_date
)
-- Join back to table1 to sum the values on each of those dates
SELECT 
  SUM(table1.value),
  DateM.grouping_date
FROM table1 join DateM on table1.client_id = DateM.client_id
and table1.date = DateM.mdate
GROUP BY DateM.grouping_date;

Query equivalent to a temporary table in MySQL
