Question

We have the following database schema in Oracle 10g Express Edition: [Image]

One of our queries looks like this:

    select
        *
    from
        torder_item oi_0
    where
        oi_0.id in
        (
            select
                max(oi_1.id)
            from
                torder_item oi_1, torder o
            where
                oi_1.torder_id = o.id
            group by
                oi_1.tproduct_id
        )
        or oi_0.id in
        (
            select
                max(oi_2.id)
            from
                torder_item oi_2, tproduct p
            where
                oi_2.tproduct_id = p.id
            group
                by p.group_id
        );

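The problem is that the query runs very slowly. I currently have fewer than 4,000 rows in each table, yet the query takes more than 6 seconds to execute on my machine. And this is a simplified version. If I change the 'or' to 'union':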

    select
        *
    from
        torder_item oi_0
    where
        oi_0.id in
        ((
            select
                max(oi_1.id)
            from
                torder_item oi_1, torder o
            where
                oi_1.torder_id = o.id
            group by
                oi_1.tproduct_id
        )
        union
        (
            select
                max(oi_2.id)
            from
                torder_item oi_2, tproduct p
            where
                oi_2.tproduct_id = p.id
            group
                by p.group_id
        ));

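it returns the same results but executes instantly. Unfortunately, we are using Hibernate, which does not seem to support union, so I cannot change the query like this. Here is the trace of the original query: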

    call     count       cpu    elapsed       disk      query    current        rows
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    Parse        1      0.04       0.14          0         10          0           0
    Execute      1      0.00       0.00          0          0          0           0
    Fetch        8      6.19       6.19          0      31136          0          96
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    total       10      6.24       6.34          0      31146          0          96

    Misses in library cache during parse: 1
    Optimizer mode: ALL_ROWS
    Parsing user id: 5  

    Rows     Row Source Operation
    -------  ---------------------------------------------------
         96  FILTER  (cr=31136 pr=0 pw=0 time=14041 us)
       1111   TABLE ACCESS FULL TORDER_ITEM (cr=14 pr=0 pw=0 time=3349 us)
         96   FILTER  (cr=7777 pr=0 pw=0 time=1799577 us)
     102096    HASH GROUP BY (cr=7777 pr=0 pw=0 time=1584153 us)
    1234321     TABLE ACCESS FULL TORDER_ITEM (cr=7777 pr=0 pw=0 time=35809 us)
          0   FILTER  (cr=23345 pr=0 pw=0 time=4354068 us)
       5075    HASH GROUP BY (cr=23345 pr=0 pw=0 time=4250913 us)
    1127665     HASH JOIN  (cr=23345 pr=0 pw=0 time=2716544 us)
    1127665      TABLE ACCESS FULL TORDER_ITEM (cr=7105 pr=0 pw=0 time=38500 us)
    3818430      TABLE ACCESS FULL TPRODUCT (cr=16240 pr=0 pw=0 time=22423 us)

I tried adding indexes and analyzing the tables, but it did not help.

Does anyone know why it is so slow and how to improve it?

Here is the test data if anyone wants to reproduce the problem.

Answer 1

You have already found a way to fix the performance problem. You can wrap that query in a view and query the view from Hibernate.
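As a minimal sketch of that idea, the UNION form from the question could be wrapped in a view. The view name `v_latest_order_item` is hypothetical; the tables and columns are the ones used in the question.

```sql
-- Hypothetical view name; the body is the fast UNION query from the question.
create or replace view v_latest_order_item as
select
    oi_0.*
from
    torder_item oi_0
where
    oi_0.id in
    (
        -- newest order item per product
        select
            max(oi_1.id)
        from
            torder_item oi_1, torder o
        where
            oi_1.torder_id = o.id
        group by
            oi_1.tproduct_id
        union
        -- newest order item per product group
        select
            max(oi_2.id)
        from
            torder_item oi_2, tproduct p
        where
            oi_2.tproduct_id = p.id
        group by
            p.group_id
    );
```

Hibernate can generally map an entity to a view the same way it maps one to a table, so the application would select from the view and never need to express the union itself. Treating that entity as read-only is probably the safest assumption.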

Answer 2

I don't know whether Hibernate supports this kind of EXISTS query, but here is how it could be written:

    select
        *
    from
        torder_item oi_0
    where
        EXISTS
        (
            -- the select list is irrelevant to EXISTS; a constant avoids
            -- mixing * with GROUP BY
            select
                1
            from
                torder_item oi_1, torder o
            where
                oi_1.torder_id = o.id
            group by
                oi_1.tproduct_id
            having
                oi_0.id = max(oi_1.id)
        )
        or EXISTS
        (
            select
                1
            from
                torder_item oi_2, tproduct p
            where
                oi_2.tproduct_id = p.id
            group
                by p.group_id
            having
                oi_0.id = max(oi_2.id)
        );

Answer 3

Following my comment under your question, I think both queries are equivalent to:

    select
        *
    from
        torder_item oi_0
    where
        oi_0.id in
        (
            select
                max(oi_1.id)
            from
                torder_item oi_1
            group by
                oi_1.tproduct_id
        )

However, I understand that the query in the question is simplified, and the real query may not reduce this way.
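One way to check that claim against the test data is a MINUS comparison. The sketch below lists every id returned by the question's original query that the simplified query does not return. It only shows agreement on the data at hand, not equivalence in general.

```sql
-- ids returned by the original OR query ...
select
    oi_0.id
from
    torder_item oi_0
where
    oi_0.id in
    (
        select max(oi_1.id)
        from torder_item oi_1, torder o
        where oi_1.torder_id = o.id
        group by oi_1.tproduct_id
    )
    or oi_0.id in
    (
        select max(oi_2.id)
        from torder_item oi_2, tproduct p
        where oi_2.tproduct_id = p.id
        group by p.group_id
    )
minus
-- ... that the simplified query does not return
select
    max(oi_1.id)
from
    torder_item oi_1
group by
    oi_1.tproduct_id;
```

An empty result here, and again with the two operands of MINUS swapped, would suggest the two queries agree on this data set.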

Answer 4

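Why is it so slow?

Because for every row of TORDER_ITEM, Oracle executes the first subquery and then, if oi_0.id does not appear in its result, executes the second subquery as well. That is why you see such large numbers in the "Rows" column of the plan output (for example, 3818430 means that the TPRODUCT table, which has 3762 rows, was fully scanned 1015 times).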
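The row counts in the trace can be cross-checked with a little arithmetic. This is only a sketch, assuming TORDER_ITEM holds 1111 rows (the count shown for the outer full scan in the trace) and TPRODUCT holds 3762 rows (the figure quoted above):

```sql
select
    1234321 / 1111 as subquery1_runs,      -- 1111: TORDER_ITEM scans in the first subquery
    1127665 / 1111 as subquery2_runs,      -- 1015: TORDER_ITEM scans in the second subquery
    3818430 / 3762 as tproduct_scans,      -- 1015: TPRODUCT scans in the second subquery
    1111 - 96      as rows_not_in_first    -- 1015: outer rows the first subquery did not match
from
    dual;
```

These figures are consistent with the first subquery running once per TORDER_ITEM row (1111 times) and the second running only for the 1111 - 96 = 1015 rows that the first subquery did not match.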

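With the union, the execution plan is different: the two subqueries are executed first, the result (96 unique IDs) is kept in memory, and Oracle checks each row of TORDER_ITEM against that result. So each subquery is actually executed once rather than about 1,000 times.

Don't ask me why the optimizer isn't smart enough to do something similar for the first query.

I hope Hibernate supports outer joins. My suggestion is to join TORDER_ITEM to the first subquery, then to the second subquery, and keep only the rows that have a match in the first or the second subquery. I mean: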

    SELECT oi_0.*
      FROM torder_item oi_0
      LEFT JOIN (SELECT MAX(oi_1.id) id
                   FROM torder_item oi_1
                  /* you don't need the join with torder here, it isn't used anyway */
                  GROUP BY oi_1.tproduct_id
                ) subquery1 ON subquery1.id = oi_0.id
      LEFT JOIN (SELECT MAX(oi_2.id) id
                   FROM torder_item oi_2,
                        tproduct p
                  WHERE oi_2.tproduct_id = p.id
                  GROUP BY p.group_id
                ) subquery2 ON subquery2.id = oi_0.id
     WHERE subquery1.id IS NOT NULL OR subquery2.id IS NOT NULL

Oracle query performance problem
