SQL - How to fetch a large amount of data in one query instead of iterating over each item

Time: 2014-09-25 20:54:50

Tags: python sql sql-server sql-server-2012

I collect a list of items, and for each item I check the database with a SQL query, using the following code:

SELECT * 
FROM task_activity as ja 
join task as j on ja.task_id = j.id 
WHERE j.name = '%s' 
  AND ja.avg_runtime <> 0 
  AND ja.avg_runtime is not NULL 
  AND ja.id = (SELECT MAX(id) FROM task_activity 
               WHERE task_id = ja.task_id 
                 and avg_runtime <> 0 
                 AND ja.avg_runtime is not NULL) 
  % str(task.get('name'))).fetchall()

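But this means I have to loop over the list and run one query per item, and the list is sometimes very large. Can I run a single query and get the data for the whole list at once? In this particular query I only need the avg_runtime column together with the task id; the row with the maximum id holds the most recently computed runtime.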

I have no access to the database other than running queries against it. It is Microsoft SQL Server 2012 (SP1) - 11.0.3349.0 (X64).

1 Answer:

Answer 0: (score: 1)

You can use row_number() to speed this up. Note that I think there is a bug in the original query: shouldn't ja.avg_runtime in the subquery be avg_runtime?

sql = """with x as (
    select
        ja.task_id,
        ja.avg_runtime,
        ja.id,  -- qualified: both task and task_activity have an id column
        row_number() over (partition by ja.task_id order by ja.id desc) rn
    from
        task_activity as ja 
            join 
        task as j 
            on ja.task_id = j.id 
    where
        j.name in ({0}) and
        ja.avg_runtime <> 0 and
        ja.avg_runtime is not null        
) select
    task_id,
    avg_runtime,
    id
from
    x
where
    rn = 1;"""

# build up ?,?,? for parameter substitution
# assume tasknames is the list containing the task names.
params = ",".join("?" for _ in tasknames)
# connection is your db connection
cursor = connection.cursor()
# interpolate the ?,?,? and bind parameters
cursor.execute(sql.format(params), tasknames)
rows = cursor.fetchall()
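
Since the question says the list can be very large, one thing to watch is that SQL Server limits a single request to roughly 2100 parameters, so a very long ?,?,? list can be rejected. A minimal sketch of running the same query in batches follows. It assumes a DB-API connection that uses the ? placeholder style (as pyodbc does); the names MAX_PARAMS, chunks and fetch_latest_runtimes are hypothetical helpers made up for this illustration, and the batch size of 2000 is an arbitrary choice below the limit.

```python
from itertools import islice

# Assumed batch size, chosen to stay below SQL Server's parameter limit.
MAX_PARAMS = 2000


def chunks(items, size=MAX_PARAMS):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fetch_latest_runtimes(connection, sql_template, tasknames):
    """Run the query once per batch of names and collect all rows.

    `sql_template` is expected to be the `sql` string from above,
    with `{0}` marking where the ?,?,? placeholders go.
    """
    rows = []
    cursor = connection.cursor()
    for batch in chunks(tasknames):
        # One "?" per name in this batch.
        placeholders = ",".join("?" for _ in batch)
        cursor.execute(sql_template.format(placeholders), batch)
        rows.extend(cursor.fetchall())
    return rows
```

Usage would be something like rows = fetch_latest_runtimes(connection, sql, tasknames). If tasknames may contain duplicates, deduplicating it first (for example with list(dict.fromkeys(tasknames))) avoids getting the same task back from two different batches.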

The following index should make this query very fast (although it depends on how many rows the filter on ja.avg_runtime excludes):

create index ix_task_id_id on task_activity (task_id, id desc);
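
For reference, the rn = 1 filter in the query above keeps, for each task_id, the row with the highest id among the rows whose avg_runtime is neither 0 nor NULL. A small pure-Python sketch of that same selection can be handy for checking results by hand. The function name latest_nonzero_runtime and the sample rows are made up for illustration and do not come from the real tables.

```python
def latest_nonzero_runtime(rows):
    """Keep, per task_id, the row with the highest id.

    `rows` are (task_id, avg_runtime, id) tuples. Rows whose avg_runtime
    is 0 or None are skipped first, mirroring the WHERE clause.
    """
    latest = {}
    for task_id, avg_runtime, row_id in rows:
        if avg_runtime is None or avg_runtime == 0:
            continue
        current = latest.get(task_id)
        if current is None or row_id > current[2]:
            latest[task_id] = (task_id, avg_runtime, row_id)
    return latest


# Made-up sample data, for illustration only.
sample = [
    (1, 12.5, 10),
    (1, 0, 11),      # skipped: avg_runtime is 0
    (1, 14.0, 9),
    (2, None, 20),   # skipped: avg_runtime is NULL
    (2, 3.2, 18),
]
result = latest_nonzero_runtime(sample)
```

Here task 1 resolves to the row with id 10 rather than id 11, because the filter on avg_runtime is applied before the rows are ranked.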