Question

I gather a list of items, and for each item I check the database with a SQL query, using the following code:

SELECT * 
FROM task_activity as ja 
join task as j on ja.task_id = j.id 
WHERE j.name = '%s' 
  AND ja.avg_runtime <> 0 
  AND ja.avg_runtime is not NULL 
  AND ja.id = (SELECT MAX(id) FROM task_activity 
               WHERE task_id = ja.task_id 
                 and avg_runtime <> 0 
                 AND ja.avg_runtime is not NULL) 
  % str(task.get('name'))).fetchall()

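But do I have to iterate over the list and run a query for each item? The list is sometimes very large. Can I run just one query and get back a result set for the whole list? In this particular query I only need the avg_runtime column for each task id, where the row with the max id holds the most recently calculated runtime.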

I have no access to the database other than running queries against it. I am using Microsoft SQL Server 2012 (SP1) - 11.0.3349.0 (X64).

Answer 1

You can use row_number() to speed this up. Note that I think there is a bug in the original query: should ja.avg_runtime in the subquery be avg_runtime?

sql = """with x as (
    select
        ja.task_id,
        ja.avg_runtime,
        ja.id,  -- qualified: both task and task_activity have an id column
        row_number() over (partition by ja.task_id order by ja.id desc) rn
    from
        task_activity as ja 
            join 
        task as j 
            on ja.task_id = j.id 
    where
        j.name in ({0}) and
        ja.avg_runtime <> 0 and
        ja.avg_runtime is not null        
) select
    task_id,
    avg_runtime,
    id
from
    x
where
    rn = 1;"""

# build up ?,?,? for parameter substitution
# assume tasknames is the list containing the task names.
params = ",".join("?" for _ in tasknames)
# connection is your db connection
cursor = connection.cursor()
# interpolate the ?,?,? and bind parameters
cursor.execute(sql.format(params), tasknames)
# one row per task: (task_id, avg_runtime, id)
rows = cursor.fetchall()
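
Since the question says the list is sometimes very large, note that SQL Server caps a single request at 2,100 parameters, so one giant IN (?,?,...) list can fail. A minimal sketch of batching the names is below. It assumes a DB-API connection whose driver uses ? placeholders (as the code above does) and a template with a single {0} slot like the sql string above; the helper names chunked and fetch_latest_runtimes and the batch size of 1000 are made up for illustration.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_latest_runtimes(connection, sql_template, tasknames, batch_size=1000):
    """Run the query once per batch of names and collect all rows.

    `sql_template` must contain one `{0}` where the comma-separated `?`
    placeholders go. `batch_size` is an arbitrary illustrative value kept
    below SQL Server's 2,100-parameter limit.
    """
    rows = []
    cursor = connection.cursor()
    for batch in chunked(list(tasknames), batch_size):
        # one "?" per name in this batch
        placeholders = ",".join("?" for _ in batch)
        cursor.execute(sql_template.format(placeholders), batch)
        rows.extend(cursor.fetchall())
    return rows
```

This still issues far fewer queries than one per task: a list of 10,000 names becomes 10 round trips instead of 10,000. It also assumes each task name appears only once in the list, otherwise the same task could be returned by more than one batch.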

The following index should make this query very fast (although it depends on how many rows the filter on ja.avg_runtime excludes):

create index ix_task_id_id on task_activity (task_id, id desc);
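
If you want to check the row_number() logic without touching the real server, a small stand-in can help. The sketch below uses Python's built-in sqlite3 module with an in-memory database in place of SQL Server (it needs SQLite 3.25 or newer for window functions); the table layout is guessed from the queries above and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table task (id integer primary key, name text);
    create table task_activity (
        id integer primary key, task_id integer, avg_runtime real);
    insert into task values (1, 'build'), (2, 'deploy'), (3, 'lint');
    insert into task_activity values
        (1, 1, 10.0),
        (2, 1, 12.5),
        (3, 1, 0),      -- newest row for 'build', but zero, so it is skipped
        (4, 2, NULL),   -- NULL runtime, filtered out
        (5, 2, 7.0),
        (6, 3, 3.0);    -- 'lint' is not requested below
""")

sql = """with x as (
    select ja.task_id, ja.avg_runtime, ja.id,
           row_number() over (partition by ja.task_id order by ja.id desc) rn
    from task_activity as ja
    join task as j on ja.task_id = j.id
    where j.name in ({0})
      and ja.avg_runtime <> 0
      and ja.avg_runtime is not null
)
select task_id, avg_runtime, id from x where rn = 1 order by task_id;"""

tasknames = ["build", "deploy"]
placeholders = ",".join("?" for _ in tasknames)
latest = conn.execute(sql.format(placeholders), tasknames).fetchall()
print(latest)  # [(1, 12.5, 2), (2, 7.0, 5)]
```

For each requested task, the result is the row with the highest id among those with a non-zero, non-NULL avg_runtime, which is what the original per-task subquery was after.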
