我们在Amazon EC2上使用PostgreSQL版本9.4数据库。我们的所有查询在第一次尝试时都运行速度非常慢,直到它被缓存,之后它们非常快,但它不是调解,因为它会减慢页面加载速度。
我们在其中一个查询中使用:
SELECT HE.fs_perm_sec_id,
HE.TICKER_EXCHANGE,
HE.proper_name,
OP.shares_outstanding,
(SELECT factset_industry_desc
FROM factset_industry_map AS fim
WHERE fim.factset_industry_code = HES.industry_code) AS industry,
(SELECT SUM(POSITION) AS ST_HOLDINGS
FROM OWN_STAKES_HOLDINGS S
WHERE S.POSITION > 0
AND S.fs_perm_sec_id = HE.fs_perm_sec_id
GROUP BY FS_PERM_SEC_ID) AS stake_holdings,
(SELECT SUM(CURRENT_HOLDINGS)
FROM
(SELECT CURRENT_HOLDINGS
FROM OWN_INST_HOLDINGS IHT
WHERE FS_PERM_SEC_ID=HE.FS_PERM_SEC_ID
ORDER BY CURRENT_HOLDINGS DESC LIMIT 10)A) AS top_10_inst_hodings,
(SELECT SUM(OIH.current_holdings)
FROM own_inst_holdings OIH
WHERE OIH.fs_perm_sec_id = HE.fs_perm_sec_id) AS inst_holdings
FROM own_prices OP
JOIN h_security_ticker_exchange HE ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector HES ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
ORDER BY OP.price_date DESC LIMIT 1
运行EXPLAIN ANALYZE并收到以下结果:
QUERY PLAN
Limit (cost=223.39..223.39 rows=1 width=100) (actual time=2420.644..2420.645 rows=1 loops=1)
-> Sort (cost=223.39..223.39 rows=1 width=100) (actual time=2420.643..2420.643 rows=1 loops=1)
Sort Key: op.price_date
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.26..223.39 rows=1 width=100) (actual time=2316.169..2420.566 rows=36 loops=1)
-> Nested Loop (cost=0.17..8.87 rows=1 width=104) (actual time=3.958..5.084 rows=36 loops=1)
-> Index Scan using h_sec_exch_factset_entity_id_idx on h_security_ticker_exchange he (cost=0.09..4.09 rows=1 width=92) (actual time=1.452..1.454 rows=1 loops=1)
Index Cond: ((ticker_exchange)::text = 'PG-NYS'::text)
-> Index Scan using alex_prices on own_prices op (cost=0.09..4.68 rows=33 width=23) (actual time=2.496..3.592 rows=36 loops=1)
Index Cond: ((fs_perm_sec_id)::text = (he.fs_perm_sec_id)::text)
-> Index Scan using alex_factset_entity_idx on h_entity_sector hes (cost=0.09..4.09 rows=1 width=14) (actual time=0.076..0.077 rows=1 loops=36)
Index Cond: (factset_entity_id = he.factset_entity_id)
SubPlan 1
-> Index Only Scan using alex_factset_industry_code_idx on factset_industry_map fim (cost=0.03..2.03 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=36)
Index Cond: (factset_industry_code = hes.industry_code)
Heap Fetches: 0
SubPlan 2
-> GroupAggregate (cost=0.08..2.18 rows=2 width=17) (actual time=0.735..0.735 rows=1 loops=36)
Group Key: s.fs_perm_sec_id
-> Index Only Scan using own_stakes_holdings_perm_position_idx on own_stakes_holdings s (cost=0.08..2.15 rows=14 width=17) (actual time=0.080..0.713 rows=39 loops=36)
Index Cond: ((fs_perm_sec_id = (he.fs_perm_sec_id)::text) AND (\position\ > 0::numeric))
Heap Fetches: 1155
SubPlan 3
-> Aggregate (cost=11.25..11.26 rows=1 width=6) (actual time=0.166..0.166 rows=1 loops=36)
-> Limit (cost=0.09..11.22 rows=10 width=6) (actual time=0.081..0.150 rows=10 loops=36)
-> Index Only Scan Backward using alex_current_holdings_idx on own_inst_holdings iht (cost=0.09..194.87 rows=175 width=6) (actual time=0.080..0.147 rows=10 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 288
SubPlan 4
-> Aggregate (cost=194.96..194.96 rows=1 width=6) (actual time=66.102..66.102 rows=1 loops=36)
-> Index Only Scan using alex_current_holdings_idx on own_inst_holdings oih (cost=0.09..194.87 rows=175 width=6) (actual time=0.060..65.209 rows=2505 loops=36)
Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
Heap Fetches: 33453
Planning time: 1.581 ms
Execution time: 2420.830 ms
一旦我们为3个聚合禁用了SELECT SUM(),它会大大加快速度,但它会破坏关系数据库的作用。
我们使用PG插件(https://www.npmjs.com/package/pg)在NodeJS上运行查询,以连接并运行数据库上的查询
我们如何加快查询速度?我们可以采取哪些额外步骤?我们已经将数据库编入索引,所有字段似乎都被正确编入索引,但仍然不够快。
感谢任何帮助,意见和/或建议。
答案 0 :(得分:0)
带聚合的嵌套循环通常是一件坏事。以下应该避免这种情况。 (未经测试; SQLFiddle会有所帮助。)给我一个旋转让我知道。我很好奇引擎如何使用窗口函数过滤器。
WITH security
AS (
SELECT HE.fs_perm_sec_id
, HE.TICKER_EXCHANGE
, HE.proper_name
, OP.shares_outstanding
, OP.price_date
FROM own_prices AS OP
JOIN h_security_ticker_exchange AS HE
ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector AS HES
ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
)
SELECT SE.fs_perm_sec_id
, SE.TICKER_EXCHANGE
, SE.proper_name
, SE.shares_outstanding
, S.stake_holdings
, IHT.top_10_inst_holdings
, OIH.inst_holdings
FROM security SE
JOIN (
SELECT S.fs_perm_sec_id
, SUM(S.POSITION) AS stake_holdings
FROM OWN_STAKES_HOLDINGS AS S
WHERE S.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
AND S.POSITION > 0
GROUP BY S.fs_perm_sec_id
) AS S
ON SE.fs_perm_sec_id = S.fs_perm_sec_id
JOIN (
SELECT IHT.FS_PERM_SEC_ID
, SUM(IHT.CURRENT_HOLDINGS) AS top_10_inst_holdings
FROM OWN_INST_HOLDINGS AS IHT
WHERE IHT.FS_PERM_SEC_ID IN (
SELECT fs_perm_sec_id
FROM security
)
AND ROW_NUMBER() OVER (
PARTITION BY IHT.FS_PERM_SEC_ID
ORDER BY IHT.CURRENT_HOLDINGS DESC
) <= 10
GROUP BY IHT.FS_PERM_SEC_ID
) AS IHT
ON SE.fs_perm_sec_id = IHT.fs_perm_sec_id
JOIN (
SELECT S.fs_perm_sec_id
, SUM(OIH.current_holdings) AS inst_holdings
FROM own_inst_holdings AS OIH
WHERE OIH.fs_perm_sec_id IN (
SELECT fs_perm_sec_id
FROM security
)
GROUP BY OIH.fs_perm_sec_id
) AS OIH
ON SE.fs_perm_sec_id = OIH.fs_perm_sec_id
ORDER BY SE.price_date
LIMIT 1