Question

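I'm new to PostgreSQL, but I have solid experience with MySQL. While reading the documentation I found that PostgreSQL has an array type. I'm confused, because I can't see how this type is useful in an RDBMS. Why would I choose it over a classic one-to-many relationship?

Thanks in advance.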

Answer 1

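I've used them to make working with trees (such as comment threads) easier. You can store the path from the tree's root to an individual node in an array, where each number in the array is the branch number taken at that level. Then you can do things like this: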

SELECT id, content
FROM nodes
WHERE tree = X
ORDER BY path -- The array is here.

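PostgreSQL compares arrays element by element in the natural way, so ORDER BY path dumps the tree in a sensible linear display order. You then check the length of path to work out each node's depth, which gives you the indentation needed to render it correctly.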

The approach above gets you from the database to the rendered page with a single pass through the data.
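To make the idea concrete, here is a minimal sketch of such a table and query. The answer only names the columns id, content, tree and path; the column types, the sample rows and the two-space indentation are assumptions made for illustration.

```sql
-- Hypothetical schema: only the column names come from the answer.
CREATE TABLE nodes (
  id      serial PRIMARY KEY,
  tree    int    NOT NULL,  -- which thread the node belongs to
  path    int[]  NOT NULL,  -- branch numbers from the root down to this node
  content text   NOT NULL
);

INSERT INTO nodes (tree, path, content) VALUES
  (1, '{1}',     'first top-level comment'),
  (1, '{2}',     'second top-level comment'),
  (1, '{1,1}',   'reply to the first comment'),
  (1, '{1,2}',   'another reply to the first comment'),
  (1, '{1,1,1}', 'reply to the reply');

-- One pass: rows come back in display order, with depth derived from the path.
SELECT id,
       array_length(path, 1) AS depth,
       repeat('  ', array_length(path, 1) - 1) || content AS indented
FROM nodes
WHERE tree = 1
ORDER BY path;
```

With these rows the paths sort as {1}, {1,1}, {1,1,1}, {1,2}, {2}, because a shorter array that is a prefix of a longer one sorts first, so every reply lands directly under its parent.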

PostgreSQL also has geometric types, simple key/value types, and supports building various other composite types.

It is usually better to use traditional association tables, but there's nothing wrong with having more tools in your toolbox.

Answer 2

One SO user is using it for what appears to be machine-aided translation. The comments on a follow-up question may help in understanding his approach.

Answer 3

I've been using them successfully to aggregate recursive tree references using triggers.

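For instance, suppose you have a tree of categories and you want to find the products in any of categories (1,2,3) or in any of their subcategories.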

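One way to do it is to use an ugly with recursive statement. Doing so produces a plan stuffed with merge/hash joins over entire tables and the occasional materialize.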

with recursive categories as (
select id
from categories
where id in (1,2,3)
union all
...
)
select products.*
from products
join product2category on...
join categories on ...
group by products.id, ...
order by ... limit 10;
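For reference, a complete query of this shape might look like the sketch below. The answer elides the recursive step and the join conditions, so everything filled in here is an assumption: an adjacency-list categories table with a parent_id column, a product2category link table with product_id and category_id columns, and a CTE named subtree. It is meant for a scratch database and is independent of the array-based schema discussed afterwards.

```sql
-- Assumed adjacency-list schema (not given in the answer).
CREATE TABLE categories (
  id        int PRIMARY KEY,
  parent_id int REFERENCES categories (id)
);

CREATE TABLE products (
  id   int PRIMARY KEY,
  name text
);

CREATE TABLE product2category (
  product_id  int REFERENCES products (id),
  category_id int REFERENCES categories (id)
);

-- Walk down from categories 1, 2 and 3 to collect every descendant,
-- then join the result against the link table.
WITH RECURSIVE subtree AS (
  SELECT id
  FROM categories
  WHERE id IN (1, 2, 3)
  UNION ALL
  SELECT c.id
  FROM categories c
  JOIN subtree s ON c.parent_id = s.id
)
SELECT DISTINCT p.*
FROM products p
JOIN product2category pc ON pc.product_id = p.id
JOIN subtree s ON s.id = pc.category_id
ORDER BY p.id
LIMIT 10;
```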

Another approach is to pre-aggregate the needed data:

CREATE TABLE categories (
  id int,
  parents int[] -- (array_agg(parent_id) from parents) || id
);

CREATE TABLE products (
  id int,
  categories int[] -- array_agg(category_id) from product2category
);

CREATE INDEX ON categories USING gin (parents);

CREATE INDEX ON products USING gin (categories);

select products.*
from products
where categories && array(
      select id from categories where parents && array[1,2,3]
      )
order by ... limit 10;
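The answer mentions triggers for keeping these arrays up to date but does not show one. A rough sketch of how products.categories could be maintained from the link table follows. The column names of product2category, the function name and the trigger name are all hypothetical, and maintaining categories.parents would need a similar (recursive) trigger that is not shown.

```sql
-- Assumed tables; the column names of product2category are hypothetical.
CREATE TABLE products (
  id         int PRIMARY KEY,
  categories int[] NOT NULL DEFAULT '{}'
);

CREATE TABLE product2category (
  product_id  int NOT NULL REFERENCES products (id),
  category_id int NOT NULL
);

CREATE FUNCTION sync_product_categories() RETURNS trigger AS $$
BEGIN
  -- Refresh the product that lost a link (DELETE, or UPDATE moving the row).
  IF TG_OP IN ('UPDATE', 'DELETE') THEN
    UPDATE products
    SET categories = ARRAY(SELECT category_id
                           FROM product2category
                           WHERE product_id = OLD.product_id)
    WHERE id = OLD.product_id;
  END IF;

  -- Refresh the product that gained a link (INSERT or UPDATE).
  IF TG_OP IN ('INSERT', 'UPDATE') THEN
    UPDATE products
    SET categories = ARRAY(SELECT category_id
                           FROM product2category
                           WHERE product_id = NEW.product_id)
    WHERE id = NEW.product_id;
  END IF;

  RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER product2category_sync
AFTER INSERT OR UPDATE OR DELETE ON product2category
FOR EACH ROW EXECUTE PROCEDURE sync_product_categories();

-- Usage: after these inserts, products.categories for id 10 holds 1 and 7.
INSERT INTO products (id) VALUES (10);
INSERT INTO product2category VALUES (10, 1), (10, 7);
SELECT categories FROM products WHERE id = 10;
```

Rebuilding the whole array on every change keeps the sketch simple; ARRAY(subquery) is used rather than array_agg so that a product with no remaining links ends up with an empty array instead of NULL.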

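One issue with the above approach is that row estimates for the && operator are junk. (The selectivity is a stub function that has yet to be written, and it yields something like 1/200 of the rows regardless of the values in your aggregates.) Put another way, you may very well end up with an index scan where a seq scan would be correct.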

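To work around it, I increased the statistics on the gin-indexed columns and I periodically look into pg_stats to extract more appropriate stats. When a cursory look at those stats reveals that using && for the specified values would produce an incorrect plan, I rewrite the applicable occurrences of && with arrayoverlap() (the latter has a stub selectivity of 1/3), e.g.: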

select products.*
from products
where arrayoverlap(categories, array(
      select id from categories where arrayoverlap(parents, array[1,2,3])
      ))
order by ... limit 10;

(The same applies to the <@ operator...)
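The statistics step described above could look roughly like this. It is a sketch rather than the exact commands used in the answer: the target of 1000 is an arbitrary example value, and the most_common_elems and most_common_elem_freqs columns of pg_stats are assumed to be available (PostgreSQL 9.2 or later).

```sql
-- Assumes the products table from the answer; created here only if missing.
CREATE TABLE IF NOT EXISTS products (
  id         int,
  categories int[]
);

-- Collect a larger sample for the array column (1000 is an example value).
ALTER TABLE products ALTER COLUMN categories SET STATISTICS 1000;
ANALYZE products;

-- Inspect what the planner now knows about the column.
SELECT attname,
       null_frac,
       n_distinct,
       most_common_elems,
       most_common_elem_freqs
FROM pg_stats
WHERE tablename = 'products'
  AND attname = 'categories';

-- Compare the estimated row counts of the operator and function forms.
EXPLAIN SELECT * FROM products WHERE categories && ARRAY[1, 2, 3];
EXPLAIN SELECT * FROM products WHERE arrayoverlap(categories, ARRAY[1, 2, 3]);
```

Comparing the two EXPLAIN outputs shows whether the planner's estimate changes between the operator and the function form on a given installation.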
