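I have a large table, 'scores', with more than 100 million rows in the following format:
(critic_id, book_id, score)
It has a primary key constraint:
CONSTRAINT pk_scoresid PRIMARY KEY (critic_id, book_id)
There are about 225,000 books and about 500 critics.
I run a query like this: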
SELECT *
FROM scores s
WHERE s.critic_id = ANY(array[1,2,3,4,5])
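The query above returns about 1.2 million rows and takes roughly 35 seconds on my local machine. If at all possible I would like it to run in under 1 second, because I want to do some post-processing on the result and send it back to my front end. Is there any way to speed up the query?
Running the query below for a single critic takes about 5.5 seconds (still too long for my application):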
SELECT * FROM scores s WHERE s.critic_id = 1 /* or 2, 3, 4.. */
EDIT 1:
Output of:
EXPLAIN SELECT * FROM scores s WHERE s.critic_id in (1, 2, 3, 4, 5);
"Bitmap Heap Scan on scores s (cost=33998.40..658736.47 rows=1180328 width=16)"
" Recheck Cond: (critic_id = ANY ('{1,2,3,4,5}'::integer[]))"
" -> Bitmap Index Scan on pk_scoresid (cost=0.00..33703.32 rows=1180328 width=0)"
" Index Cond: (critic_id = ANY ('{1,2,3,4,5}'::integer[]))"
EDIT 2:
I tried the following, but it did not improve performance:
CREATE INDEX scores_index
ON scores
USING btree
(critic_id);
ALTER TABLE scores CLUSTER ON scores_index;
EXPLAIN SELECT * FROM scores s WHERE s.critic_id in (1, 2, 3, 4, 5);
"Bitmap Heap Scan on scores s (cost=22183.58..646085.28 rows=1188223 width=16)"
" Recheck Cond: (critic_id = ANY ('{1,2,3,4,5}'::integer[]))"
" -> Bitmap Index Scan on scores_index (cost=0.00..21886.53 rows=1188223 width=0)"
" Index Cond: (critic_id = ANY ('{1,2,3,4,5}'::integer[]))"
EDIT 3:
EXPLAIN (analyze, verbose) SELECT * FROM scores s WHERE s.critic_id = 1 OR s.critic_id = 2 OR s.critic_id = 3 OR s.critic_id = 4 OR s.critic_id = 5;
"Bitmap Heap Scan on public.scores s (cost=23433.49..654761.58 rows=1183187 width=16) (actual time=145.373..7078.141 rows=1121375 loops=1)"
" Output: critic_id, book_id, score"
" Recheck Cond: ((s.critic_id = 1) OR (s.critic_id = 2) OR (s.critic_id = 3) OR (s.critic_id = 4) OR (s.critic_id = 5))"
" Rows Removed by Index Recheck: 33440779"
" Heap Blocks: exact=43398 lossy=185726"
" -> BitmapOr (cost=23433.49..23433.49 rows=1188223 width=0) (actual time=137.729..137.729 rows=0 loops=1)"
" -> Bitmap Index Scan on scores_index (cost=0.00..4115.16 rows=222746 width=0) (actual time=60.175..60.175 rows=224275 loops=1)"
" Index Cond: (s.critic_id = 1)"
" -> Bitmap Index Scan on scores_index (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.473..18.473 rows=224275 loops=1)"
" Index Cond: (s.critic_id = 2)"
" -> Bitmap Index Scan on scores_index (cost=0.00..4115.16 rows=222746 width=0) (actual time=21.429..21.429 rows=224275 loops=1)"
" Index Cond: (s.critic_id = 3)"
" -> Bitmap Index Scan on scores_index (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.918..18.918 rows=224275 loops=1)"
" Index Cond: (s.critic_id = 4)"
" -> Bitmap Index Scan on scores_index (cost=0.00..5493.86 rows=297239 width=0) (actual time=18.729..18.729 rows=224275 loops=1)"
" Index Cond: (s.critic_id = 5)"
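Answer 0: (score: 0)
If this is a common predicate for queries against this table, where you select a fairly small subset of critics, you could cluster the table on the critic_id index so that it is physically re-sorted.
That co-locates all rows sharing the same critic_id value, which makes the index both more likely to be used and faster when it is used.
If the table currently has some inherent clustering by book_id, this may hurt the performance of queries that select by book_id.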
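A minimal sketch of the clustering step described above, assuming the single-column btree index on critic_id from the question's EDIT 2 exists under the name scores_index (the name that appears in the question's plan output; adjust it if yours differs). Note that ALTER TABLE ... CLUSTER ON only records which index future clustering should use. The physical rewrite happens when CLUSTER itself is run. CLUSTER takes an exclusive lock while it rewrites the table, and the ordering is not maintained for rows inserted afterwards, so this is something to test on a copy first.

```sql
-- Physically rewrite the table in the order of the critic_id index.
-- Assumption: scores_index is the btree index on (critic_id).
CLUSTER scores USING scores_index;

-- Refresh planner statistics after the rewrite.
ANALYZE scores;
```

After this, rows for one critic sit on adjacent heap pages, so a bitmap or index scan on critic_id should touch far fewer pages than it does on an unordered table.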
Answer 1: (score: 0)
You should probably denormalize the data.
Of course, this causes problems for inserts and updates, but it would drastically reduce the table size.
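One possible reading of this suggestion, offered as a sketch and not necessarily what the answer's author had in mind: store one row per critic, with the books and scores packed into parallel arrays. The table name critic_scores and the column names book_ids and scores_arr are hypothetical, and the multi-argument unnest used here assumes PostgreSQL 9.4 or later.

```sql
-- Hypothetical denormalized table: one row per critic (about 500 rows).
CREATE TABLE critic_scores AS
SELECT critic_id,
       array_agg(book_id ORDER BY book_id) AS book_ids,
       array_agg(score   ORDER BY book_id) AS scores_arr
FROM scores
GROUP BY critic_id;

ALTER TABLE critic_scores ADD PRIMARY KEY (critic_id);

-- Expand the arrays back into (critic_id, book_id, score) rows
-- for the selected critics.
SELECT c.critic_id, u.book_id, u.score
FROM critic_scores c
CROSS JOIN LATERAL unnest(c.book_ids, c.scores_arr) AS u(book_id, score)
WHERE c.critic_id = ANY (ARRAY[1, 2, 3, 4, 5]);
```

The trade-off is the one the answer names: updating a single score means rewriting that critic's whole array, so this layout suits data that is read far more often than it is written.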
Answer 2: (score: 0)
Try the following query:
SELECT *
FROM scores s
WHERE s.critic_id = 1
UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 2
UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 3
UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 4
UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 5
and (possibly) add an index on critic_id alone.
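A small sketch of the index suggested above, plus a way to check whether it helps. The index name scores_critic_id_idx is hypothetical, and the question's EDIT 2 already created an equivalent index, so skip the first statement if one exists. The plan in EDIT 3 reports lossy heap blocks, which PostgreSQL produces when the bitmap does not fit in work_mem. Raising work_mem for the session is therefore worth testing, and the 64MB value below is only an assumed starting point, not a recommendation.

```sql
-- Single-column index on critic_id (hypothetical name).
CREATE INDEX scores_critic_id_idx ON scores (critic_id);

-- Session-only setting; 64MB is an assumed value to experiment with.
SET work_mem = '64MB';

-- Compare timings, buffer usage and exact vs. lossy heap blocks.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM scores s WHERE s.critic_id = 1
UNION ALL
SELECT * FROM scores s WHERE s.critic_id = 2;
```

If "Rows Removed by Index Recheck" drops to zero and the heap blocks are all reported as exact, the bitmap now fits in memory and the remaining cost is mostly reading the heap pages.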