Question

I have a large table, scores, with more than 100 million rows, in the following format:

(critic_id, book_id, score)

I have a primary key constraint:

CONSTRAINT pk_scoresid PRIMARY KEY (critic_id, book_id)

There are about 225,000 books and about 500 critics.

I run a query like this:

SELECT *
FROM scores s
WHERE s.critic_id = ANY(array[1,2,3,4,5])

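The query above returns about 1.2 million rows. It takes about 35 seconds on my local machine. If possible, I would really like it to take under 1 second, because I want to do some post-processing on the results and send them back to my front end. Is there any way to speed up the query?

Running the query below separately for each critic takes about 5.5 seconds (which is still too long for my application):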

SELECT * FROM scores s WHERE s.critic_id = 1          /* or 2, 3, 4.. */

EDIT1:

Output:

EXPLAIN SELECT * FROM scores s WHERE s.critic_id in (1, 2, 3, 4, 5);

"Bitmap Heap Scan on scores s  (cost=33998.40..658736.47 rows=1180328 width=16)"
"  Recheck Cond: (critic_id = ANY ('{1,2,3,4,5}'::integer[]))"
"  ->  Bitmap Index Scan on pk_scoresid  (cost=0.00..33703.32 rows=1180328 width=0)"
"        Index Cond: (critic_id = ANY ('{1,2,3,4,5}'::integer[]))"

EDIT2:

I tried the following, but it did not improve performance:

CREATE INDEX score_index
  ON score
  USING btree
  (critic_id);
ALTER TABLE score CLUSTER ON score_index;

EXPLAIN SELECT * FROM scores s WHERE s.critic_id in (1, 2, 3, 4, 5);

"Bitmap Heap Scan on score s  (cost=22183.58..646085.28 rows=1188223 width=16)"
"  Recheck Cond: (detector_id = ANY ('{1,2,3,4,5}'::integer[]))"
"  ->  Bitmap Index Scan on scores_index  (cost=0.00..21886.53 rows=1188223 width=0)"
"        Index Cond: (detector_id = ANY ('{1,2,3,4,5}'::integer[]))"

EDIT3:

EXPLAIN (analyze, verbose) SELECT * FROM scores s WHERE s.critic_id = 1 OR s.critic_id = 2 OR s.critic_id = 3 OR s.critic_id = 4 OR s.critic_id = 5

"Bitmap Heap Scan on public.scores s  (cost=23433.49..654761.58 rows=1183187 width=16) (actual time=145.373..7078.141 rows=1121375 loops=1)"
"  Output: critic_id, book_id, score"
"  Recheck Cond: ((s.critic_id = 1) OR (s.critic_id = 2) OR (s.critic_id = 3) OR (s.critic_id = 4) OR (s.critic_id = 5))"
"  Rows Removed by Index Recheck: 33440779"
"  Heap Blocks: exact=43398 lossy=185726"
"  ->  BitmapOr  (cost=23433.49..23433.49 rows=1188223 width=0) (actual time=137.729..137.729 rows=0 loops=1)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=60.175..60.175 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 1)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.473..18.473 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 2)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=21.429..21.429 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 3)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.918..18.918 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 4)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..5493.86 rows=297239 width=0) (actual time=18.729..18.729 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 5)"

Answer 1

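If this is a common predicate for the queries you run against the table, where you select a fairly small subset of the critics, then you could cluster the table on the critic_id index to physically re-sort it.

This co-locates all rows that share the same critic_id value, which makes it more likely that the index is used and improves its performance when it is.

If the table already has some inherent clustering by book_id, this may hurt the performance of queries that select by book_id.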
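
As a sketch of what clustering involves in PostgreSQL: `ALTER TABLE ... CLUSTER ON` by itself only records which index a later `CLUSTER` should use and does not rewrite the table. The physical re-sort happens when `CLUSTER` itself runs. The index name `scores_critic_id_idx` below is hypothetical; the existing primary key index `pk_scoresid`, which leads with critic_id, could probably be used instead.

```sql
-- Hypothetical index name; any btree index that leads with critic_id would do.
CREATE INDEX IF NOT EXISTS scores_critic_id_idx ON scores (critic_id);

-- Physically rewrite the table in index order.
-- This takes an ACCESS EXCLUSIVE lock and needs enough free disk space
-- for roughly a full copy of the table and its indexes.
CLUSTER scores USING scores_critic_id_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE scores;
```

Clustering is a one-time operation: rows inserted or updated afterwards are not kept in index order, so the command would need to be repeated periodically if the table keeps changing.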

Answer 2

Perhaps you should denormalize the data.


Of course, this causes problems with inserts and updates, but it will radically reduce the table size.
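
The answer does not spell out which denormalization it has in mind. One possible reading, shown here only as a sketch, is to store one row per critic with the books and scores packed into parallel arrays, so that fetching five critics touches five rows instead of about 1.2 million. The table name `critic_scores` and its column names are assumptions made for illustration.

```sql
-- Assumed layout: one row per critic, with parallel arrays sorted by book_id.
CREATE TABLE critic_scores AS
SELECT critic_id,
       array_agg(book_id ORDER BY book_id) AS book_ids,
       array_agg(score   ORDER BY book_id) AS scores
FROM scores
GROUP BY critic_id;

ALTER TABLE critic_scores ADD PRIMARY KEY (critic_id);

-- Fetch the packed data for a handful of critics.
SELECT critic_id, book_ids, scores
FROM critic_scores
WHERE critic_id = ANY (ARRAY[1, 2, 3, 4, 5]);

-- If individual rows are needed, expand both arrays in lockstep.
SELECT c.critic_id, u.book_id, u.score
FROM critic_scores c
CROSS JOIN LATERAL unnest(c.book_ids, c.scores) AS u(book_id, score)
WHERE c.critic_id = ANY (ARRAY[1, 2, 3, 4, 5]);
```

The trade-off matches the one the answer names: changing a single score means rewriting that critic's whole array, so this layout suits data that is read far more often than it is written.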

Answer 3

Try the following query:

SELECT *
FROM scores s
WHERE s.critic_id = 1
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 2
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 3
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 4
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 5

and (eventually) add an index on critic_id alone.
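
A minimal sketch of that last step, together with a way to check whether the rewrite actually helps. The index name `scores_critic_id_idx` is hypothetical, and only two branches of the UNION ALL are shown to keep the example short.

```sql
-- Hypothetical index name; a single-column btree index on critic_id.
CREATE INDEX IF NOT EXISTS scores_critic_id_idx ON scores (critic_id);

-- Compare the real plan and timing against the original ANY(...) form.
-- BUFFERS shows how many pages were read, which is where most of the time goes.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM scores s WHERE s.critic_id = 1
UNION ALL
SELECT * FROM scores s WHERE s.critic_id = 2;
```

Whether this is faster depends on the plan the planner picks for each branch, so comparing the EXPLAIN output of both forms on the real data is the only reliable check.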

Postgres slow WHERE id on 100M row table
