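I have a query that uses the PostgreSQL generate_series function, but it can be slow when a large amount of data is involved. A sample of the code that builds the query is below: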
$yesterday = date('Y-m-d',(strtotime ( '-1 day' ) ));
$query = "
WITH interval_step AS (
SELECT gs::date AS interval_dt, random() AS r
FROM generate_series('$yesterday'::timestamp, '2015-01-01', '1 day') AS gs)
SELECT articles.article_id, article_title, article_excerpt, article_author, article_link, article_default_image, article_date_published, article_bias_avg, article_rating_avg
FROM development.articles JOIN interval_step ON articles.article_date_added::date=interval_step.interval_dt ";
if (isset($this -> registry -> get['category'])) {
$query .= "
JOIN development.feed_articles ON articles.article_id = feed_articles.article_id
JOIN development.rss_feeds ON feed_articles.rss_feed_id = rss_feeds.rss_feed_id
JOIN development.news_categories ON rss_feeds.news_category_id = news_categories.news_category_id
WHERE news_category_name = $1";
$params = array($category_name);
$query_name = 'browse_category';
}
$query .= " ORDER BY interval_step.interval_dt DESC, RANDOM() LIMIT 20;";
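The series only looks for content from the previous day and orders the results randomly. My question is: in what ways can generate_series be optimized to improve performance?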
Answer 0 (score: 1)
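IMHO, try removing random() from the order by clause. It probably has a much bigger performance impact than you think. The thing is, it is probably sorting the entire set by interval_dt desc, random() and only then picking the top 20. Not advisable...
Try fetching, say, 100 rows ordered by interval_dt desc, then shuffle them following the same logic and pick 20 in your app. Or wrap the whole thing in a subquery with limit 100, then re-sort along the same lines.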
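The app-side shuffle could look roughly like the sketch below. It is only an illustration under a few assumptions: pickRandomArticles is a hypothetical helper name, the query is assumed to also select interval_dt, and $rows is assumed to hold the roughly 100 rows already fetched newest first. It keeps days in newest-first order and randomizes only within each day, which mirrors ORDER BY interval_dt DESC, RANDOM().

```php
<?php
// Hypothetical helper (not part of the original code): picks up to $count rows,
// newest day first, in random order within each day.
// Assumes each row is an associative array with an 'interval_dt' key and that
// $rows already arrives sorted by interval_dt DESC (e.g. ... LIMIT 100).
function pickRandomArticles(array $rows, int $count = 20): array
{
    if ($count <= 0) {
        return [];
    }

    // Group rows by day; insertion order preserves the newest-first ordering.
    $byDay = [];
    foreach ($rows as $row) {
        $byDay[$row['interval_dt']][] = $row;
    }

    $picked = [];
    foreach ($byDay as $dayRows) {
        shuffle($dayRows); // random order within a single day
        foreach ($dayRows as $row) {
            $picked[] = $row;
            if (count($picked) === $count) {
                return $picked;
            }
        }
    }

    return $picked; // fewer than $count rows were available
}
```

One caveat: the oldest day in the 100-row window may be cut off by the limit, so if the 20 picks reach into that day they are drawn from only part of it. Whether that matters depends on how many articles arrive per day.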
Answer 1 (score: 1)
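There is no need for generate_series at all. And do not concatenate query strings. Avoid that by making the parameter an empty string (or null) when it is not set: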
if (!isset($this -> registry -> get['category']))
$category_name = '';
$query = "
select articles.article_id, article_title, article_excerpt, article_author, article_link, article_default_image, article_date_published, article_bias_avg, article_rating_avg
from
development.articles
inner join
development.feed_articles using (article_id)
inner join
development.rss_feeds using (rss_feed_id)
inner join
development.news_categories using (news_category_id)
where
(news_category_name = $1 or $1 = '')
and articles.article_date_added >= current_date - 1
order by
date_trunc('day', articles.article_date_added) desc,
random()
limit 20;
";
$params = array($category_name);
Passing $yesterday to the query is also unnecessary, since that part can be done entirely in SQL.
If $category_name is empty, all categories are returned:
(news_category_name = $1 or $1 = '')