PostgreSQL partitioning based on a large list is not used

Time: 2018-06-12 14:58:31

Tags: postgresql partitioning postgresql-10

I have a large partitioned table. The partition key is defined as a long list of non-contiguous IDs, e.g.:

Partition of: foo_partitioned FOR VALUES IN (1733, 1731, 1800, 1732, 1799, 1798, 1804, 1803, 1802, 1801, 1797, 1796, 1795, 1794, 1793, 1792, 1791, 1790, 1789, 1788, 1787, 1786, 
1785, 1784, 1783, 1715, 1714, 1713, 1712, 1711, 1710, 1709, 1708, 1707, 1706, 1705, 1704, 1703, 1702, 1701, 1700, 1699, 1698, 1697, 1696, 1695, 1694, 1693, 1692, 1691, 1689, 
1688, 1687, 1686, 1685, 1684, 1683, 1682, 1681, 1680, 1679, 1658, 1657, 1656, 1655, 1654, 1653, 1652, 1651, 1650, 1649, 1648, 1647, 1646, 1645, 1644, 1643, 1642, 1641, 1640, 
1639, 1638, 1637, 1636, 1635, 1634, 1633, 1632, 1631, 1630, 1629, 1628, 1627, 1626, 1625, 1624, 1623, 1622, 1581, 1580, 1579, 1578, 1577, 1569, 1568, 1567, 1547, 1546, 1545, 
1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1808, 1809, 1810, 1811, 1888, 1819, 1820, 1821, 1822, 1823, 1824, 1825, 1826, 1827, 1828, 1829, 1830, 1831, 1832, 1833, 
1834, 1835, 1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1846, 1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855, 1856, 1857, 1858, 1859, 1860, 1861, 1862, 
1863, 1864, 1865, 1866, 1867, 1868, 1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1889, 1890, 1891, 1892, 
1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 
1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 
1954, 1955, 1956, 1957, 1959, 1960, 1961)

Partition constraint: ((bar_id IS NOT NULL) AND (bar_id = ANY (ARRAY[1733, 1731, 1800, 1732, 1799, 1798, 1804, 1803, 1802, 1801, 1797, 1796, 1795, 1794, 1793, 1792, 1791, 1790, 
1789, 1788, 1787, 1786, 1785, 1784, 1783, 1715, 1714, 1713, 1712, 1711, 1710, 1709, 1708, 1707, 1706, 1705, 1704, 1703, 1702, 1701, 1700, 1699, 1698, 1697, 1696, 1695, 1694, 
1693, 1692, 1691, 1689, 1688, 1687, 1686, 1685, 1684, 1683, 1682, 1681, 1680, 1679, 1658, 1657, 1656, 1655, 1654, 1653, 1652, 1651, 1650, 1649, 1648, 1647, 1646, 1645, 1644, 
1643, 1642, 1641, 1640, 1639, 1638, 1637, 1636, 1635, 1634, 1633, 1632, 1631, 1630, 1629, 1628, 1627, 1626, 1625, 1624, 1623, 1622, 1581, 1580, 1579, 1578, 1577, 1569, 1568, 
1567, 1547, 1546, 1545, 1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1808, 1809, 1810, 1811, 1888, 1819, 1820, 1821, 1822, 1823, 1824, 1825, 1826, 1827, 1828, 1829, 
1830, 1831, 1832, 1833, 1834, 1835, 1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1846, 1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855, 1856, 1857, 1858, 
1859, 1860, 1861, 1862, 1863, 1864, 1865, 1866, 1867, 1868, 1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 
1889, 1890, 1891, 1892, 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 
1918, 1919, 1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949, 
1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1959, 1960, 1961])))

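When I run EXPLAIN on a query that uses an IN clause matching this exact list of values, the planner tries to scan every partition. The same happens even for something as simple as looking up a single value: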

select * from foo_partitioned where bar_id=1733;

The query planner checks every partition.

Append  (cost=102.95..365378.32 rows=378224 width=36)
   ->  Bitmap Heap Scan on foo_partitioned_8  (cost=102.95..3508.07 rows=4571 width=36)
         Recheck Cond: (bar_id = 1732)
         ->  Bitmap Index Scan on uq_foo_partitioned_8  (cost=0.00..101.81 rows=4571 width=0)
               Index Cond: (bar_id = 1732)
   ->  Index Scan using uq_foo_partitioned_16 on foo_partitioned_16  (cost=0.56..55328.46 rows=62937 width=36)
         Index Cond: (bar_id = 1732)
   ->  Bitmap Heap Scan on foo_partitioned_25  (cost=1.30..6.52 rows=6 width=36)
         Recheck Cond: (bar_id = 1732)
         ->  Bitmap Index Scan on uq_foo_partitioned_25  (cost=0.00..1.30 rows=6 width=0)
               Index Cond: (bar_id = 1732)
   ->  Bitmap Heap Scan on foo_partitioned_26  (cost=1.30..6.52 rows=6 width=36)
         Recheck Cond: (bar_id = 1732)
         ->  Bitmap Index Scan on uq_foo_partitioned_26  (cost=0.00..1.30 rows=6 width=0)
               Index Cond: (bar_id = 1732)
   ->  Bitmap Heap Scan on foo_partitioned_27  (cost=1.30..6.52 rows=6 width=36)
         Recheck Cond: (bar_id = 1732)
         ->  Bitmap Index Scan on uq_foo_partitioned_27  (cost=0.00..1.30 rows=6 width=0)
               Index Cond: (bar_id = 1732)
   ->  Bitmap Heap Scan on foo_partitioned_28  (cost=1.30..6.52 rows=6 width=36)
         Recheck Cond: (bar_id = 1732)
         ->  Bitmap Index Scan on uq_foo_partitioned_28  (cost=0.00..1.30 rows=6 width=0)
               Index Cond: (bar_id = 1732)
   ->  Bitmap Heap Scan on foo_partitioned_29  (cost=347.37..14554.48 rows=15528 width=36)
         Recheck Cond: (bar_id = 1732)
...

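Is there a better way to either a) define the table so that constraint exclusion is actually used (I have it set to partition), or b) write the query so that constraint exclusion is actually used?

I tried looking at the code, but it seems to be beyond me.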

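Edit

I spent some time trying alternative solutions. I created a surrogate key as a potential primary key, forcing all bar IDs to be contiguous and to fall within a known range for each partition. I then created a second, comparable partitioned table using range partitioning to run some tests.

Some things I found: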

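  1. Using one or two ranges in the WHERE clause (WHERE 50000 <= bar_id AND bar_id < 50500) does trigger constraint exclusion.
  2. Using a large* WHERE "list", whether a plain IN clause or IN (VALUES ...), does not trigger constraint exclusion.
  3. *However, using a list of 100 or fewer values in an IN clause does trigger constraint exclusion. Beyond that, something else kicks in inside PostgreSQL's backend query planner.
  4. Another thing I tried was a union of query blocks, each batched into a group of 100 values. This does trigger constraint exclusion for every sub-query, but it is not as fast as using X < bar_id < Y ranges. Your mileage may vary.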
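
The batching described in item 4 could be done in application code along the following lines. This is only a sketch: the function names are hypothetical, the limit of 100 is the threshold observed in the tests above rather than a documented guarantee, and UNION ALL is an assumption (the batches are disjoint, so no row is returned twice).

```python
from typing import Iterable, List

# Threshold observed in the tests above; not a documented PostgreSQL limit.
MAX_IN_LIST = 100


def chunked(ids: Iterable[int], size: int = MAX_IN_LIST) -> List[List[int]]:
    """Deduplicate and sort the ids, then split them into batches of at most `size`."""
    values = sorted({int(i) for i in ids})  # int() rejects non-numeric input
    return [values[i:i + size] for i in range(0, len(values), size)]


def union_all_query(ids: Iterable[int],
                    table: str = "foo_partitioned",
                    column: str = "bar_id") -> str:
    """Build one SELECT per batch and join the batches with UNION ALL.

    Returns an empty string when no ids are given.
    """
    selects = [
        f"SELECT * FROM {table} WHERE {column} IN ({', '.join(map(str, batch))})"
        for batch in chunked(ids)
    ]
    return "\nUNION ALL\n".join(selects)
```

Each generated sub-query carries at most 100 values, which is the condition under which exclusion was seen to work above. Running EXPLAIN on the result is the only way to confirm that it does so on a given server version.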

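That said, both versions are 10x faster than before, so in a sense the job is done. I plan to use my application logic to build smart ranges rather than chunks of 100, except perhaps when I have very disparate bar_ids (which I foresee as unlikely, TBD).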
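
One way such "smart ranges" might be built is to collapse runs of consecutive IDs into inclusive (low, high) pairs and emit a range predicate for each run. The helper names below are hypothetical. The tests above only covered one or two ranges, so whether a predicate with many OR'ed ranges is still excluded correctly is an open assumption.

```python
from typing import Iterable, List, Tuple


def collapse_to_ranges(ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse integer ids into sorted, inclusive (low, high) runs of consecutive values."""
    ranges: List[Tuple[int, int]] = []
    for value in sorted({int(i) for i in ids}):
        if ranges and value == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], value)  # extend the current run
        else:
            ranges.append((value, value))        # start a new run
    return ranges


def range_predicate(ids: Iterable[int], column: str = "bar_id") -> str:
    """Render the collapsed runs as range conditions joined with OR."""
    parts = [
        f"({column} >= {low} AND {column} <= {high})"
        for low, high in collapse_to_ranges(ids)
    ]
    return " OR ".join(parts)
```

For example, the IDs 1731, 1732, 1733, 1798, 1799 and 1800 collapse into the two runs (1731, 1733) and (1798, 1800).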

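Hope this helps others who are using PG partitioning.

If anyone who works on PG could comment on what is going on, both in the earlier 'list' version and in the range version with lists of more than 100 values, that would be very useful and interesting.

Note that I am assuming that, for list-based partitioning, using lists of 100 or fewer values in the partition definitions might allow constraint exclusion to kick in. The logic that blocked it in my range-partitioning tests is probably shared by the partition pruning logic, which would then give up whenever a list definition has more than 100 items.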
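
To test that assumption, the partition definitions themselves could be generated with at most 100 values each. The sketch below only produces DDL strings. The partition naming scheme (a "_p" suffix plus a counter) is hypothetical, and the source does not confirm that partitions defined this way are actually pruned.

```python
from typing import Iterable, List


def list_partition_ddl(parent: str, ids: Iterable[int],
                       max_values: int = 100) -> List[str]:
    """Emit CREATE TABLE ... PARTITION OF statements with at most `max_values` ids each."""
    values = sorted({int(i) for i in ids})  # deduplicate; a value may belong to one partition only
    statements = []
    for n, start in enumerate(range(0, len(values), max_values), start=1):
        batch = ", ".join(map(str, values[start:start + max_values]))
        statements.append(
            f"CREATE TABLE {parent}_p{n} PARTITION OF {parent} "
            f"FOR VALUES IN ({batch});"
        )
    return statements
```

Splitting one logical group of IDs across several partitions changes how the data is laid out, so this would be worth trying on a copy of the table and checking with EXPLAIN before relying on it.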

0 answers:

No answers