I have the following query:
SELECT SUM(data), foreign_key
FROM (SELECT *
FROM really_big_table
ORDER BY auto_incremented_id DESC
LIMIT reasonable_number) AS recent_rows
WHERE inserted_timestamp > now() - INTERVAL '1 hour'
GROUP BY foreign_key
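This query successfully avoids a sequential scan on inserted_timestamp, but it fails completely if I need to retrieve more rows than reasonable_number. inserted_timestamp is not indexed, but it follows the same order as auto_incremented_id, so I feel I should be able to make this query more efficient without incurring an hour of downtime to create a new index.
I would like to do something like this: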
SELECT SUM(data), foreign_key
FROM really_big_table
ORDER BY id DESC
STOP WHEN created < now() - INTERVAL '1 hour'
GROUP BY foreign_key
In other words, I want syntax that makes my query run an index scan over the table and stop as soon as the data is too old.
Answer 0 (score: 1)
One possibility for speeding this up is to use table partitioning, if you aren't doing so already.
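As a rough illustration of the partitioning idea, here is a minimal sketch using declarative range partitioning on inserted_timestamp. It assumes a PostgreSQL version that supports declarative partitioning; the table name really_big_table_partitioned, the partition name, the date range, and all column types are hypothetical, since the question does not show the table definition. Moving an existing large table into this layout is a migration in its own right, so this may not avoid the downtime concern.

```sql
-- Hypothetical partitioned copy of the table; column types are assumptions.
CREATE TABLE really_big_table_partitioned (
    id                 bigint      NOT NULL,
    foreign_key        integer     NOT NULL,
    data               numeric,
    inserted_timestamp timestamptz NOT NULL
) PARTITION BY RANGE (inserted_timestamp);

-- One partition per day (example range only); new partitions have to be
-- created ahead of time for upcoming days.
CREATE TABLE really_big_table_p2024_01_01
    PARTITION OF really_big_table_partitioned
    FOR VALUES FROM ('2024-01-01 00:00+00') TO ('2024-01-02 00:00+00');

-- The original aggregate, filtered on the partition key.
SELECT SUM(data), foreign_key
FROM really_big_table_partitioned
WHERE inserted_timestamp > now() - INTERVAL '1 hour'
GROUP BY foreign_key;
```

Whether the old partitions are actually skipped for a filter on now() depends on the server version and settings, so it is worth checking the plan with EXPLAIN before relying on it.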
Here is another idea:
BEGIN;
DECLARE my_cursor NO SCROLL CURSOR FOR
SELECT data, foreign_key, inserted_timestamp
FROM really_big_table
ORDER BY id DESC;
FETCH FORWARD 5 FROM my_cursor;
-- Repeat as many times as you want
CLOSE my_cursor;
ROLLBACK; -- Or COMMIT
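Fetch rows until inserted_timestamp is more than an hour old and compute the sum in your application, or, if you want to do it in the database: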
CREATE FUNCTION my_fetch() RETURNS SETOF really_big_table AS $$
DECLARE
-- You could also select only the relevant columns here and change
-- the function's return type.
curs CURSOR FOR
SELECT * FROM really_big_table ORDER BY id DESC;
BEGIN
FOR current_row IN curs LOOP
IF current_row.inserted_timestamp > CURRENT_TIMESTAMP - INTERVAL '1 hour' THEN
RETURN NEXT current_row;
ELSE
RETURN;
END IF;
END LOOP;
RETURN;
END
$$ STABLE LANGUAGE plpgsql;
Then you can do this:
SELECT SUM(data), foreign_key FROM my_fetch() GROUP BY foreign_key;
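The comment inside my_fetch() mentions selecting only the relevant columns and changing the return type. A minimal sketch of that variant could look like the following. The function name my_fetch_narrow and the column types numeric and integer are assumptions, so adjust them to the real table definition. The columns in the cursor query are qualified with a table alias because the output column names would otherwise clash with the table's column names inside PL/pgSQL.

```sql
-- Hypothetical narrow variant of my_fetch(); column types are assumptions.
CREATE FUNCTION my_fetch_narrow()
RETURNS TABLE (data numeric, foreign_key integer) AS $$
DECLARE
    -- Qualify columns with the alias t to avoid ambiguity with the
    -- output parameters named data and foreign_key.
    curs CURSOR FOR
        SELECT t.data, t.foreign_key, t.inserted_timestamp
        FROM really_big_table AS t
        ORDER BY t.id DESC;
BEGIN
    FOR current_row IN curs LOOP
        -- Stop at the first row that is an hour old or older; this relies on
        -- inserted_timestamp following the same order as id.
        EXIT WHEN current_row.inserted_timestamp <= CURRENT_TIMESTAMP - INTERVAL '1 hour';
        data := current_row.data;
        foreign_key := current_row.foreign_key;
        RETURN NEXT;
    END LOOP;
    RETURN;
END
$$ STABLE LANGUAGE plpgsql;

-- Usage mirrors the query above.
SELECT SUM(data), foreign_key FROM my_fetch_narrow() GROUP BY foreign_key;
```

Returning fewer columns should reduce the amount of data the function materializes before the outer query aggregates it, though the early exit on the timestamp is what keeps the scan short.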