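I have a simple table containing hourly dates and values. I want to compute, for each month, the average of the daily maximum values. The query looks straightforward: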
WITH daily_max AS
(
SELECT TRUNC(the_date, 'DD') as my_day, MAX(value) AS value
FROM my_data
GROUP by TRUNC(the_date, 'DD')
)
SELECT trunc(my_day, 'MM'), AVG(value)
FROM daily_max
GROUP BY trunc(my_day, 'MM')
order by 1
;
However, I get a lot of "duplicates" in the first column (one row per day):
01/01/2017 00:00:00 95
01/01/2017 00:00:00 90
01/01/2017 00:00:00 99
01/01/2017 00:00:00 96
01/01/2017 00:00:00 94
01/01/2017 00:00:00 97
01/01/2017 00:00:00 96
01/01/2017 00:00:00 86
01/01/2017 00:00:00 98
01/01/2017 00:00:00 98
01/02/2017 00:00:00 97
01/02/2017 00:00:00 93
01/02/2017 00:00:00 100
01/02/2017 00:00:00 98
01/02/2017 00:00:00 94
01/02/2017 00:00:00 99
01/02/2017 00:00:00 94
01/02/2017 00:00:00 95
01/02/2017 00:00:00 99
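The first subquery returns the daily maximum values as expected.
I suspected some odd behaviour of the DATE data type, but I see the same thing even when I apply the TO_CHAR function to the date. How can the expression in a GROUP BY clause produce multiple rows with the same value?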
with daily_max AS
(
SELECT TRUNC(the_date, 'DD') as my_day, MAX(value) AS value
FROM my_data
GROUP by TRUNC(the_date, 'DD')
)
SELECT TO_CHAR(trunc(my_day, 'MM')), AVG(value)
FROM daily_max
GROUP BY TO_CHAR(trunc(my_day, 'MM'))
order by 1
;
To add to my confusion, when I cast the date to a timestamp in the first subquery, the result is what I expect:
with daily_max AS
(
SELECT CAST(TRUNC(the_date , 'DD') AS timestamp) as my_day, MAX(value) AS value
FROM my_data
GROUP by TRUNC(the_date , 'DD')
)
SELECT trunc(my_day, 'MM') AS the_month, AVG(value)
FROM daily_max
GROUP BY trunc(my_day, 'MM')
order by 1
;
01/01/2017 00:00:00 94.9
01/02/2017 00:00:00 95.78571428571428571428571428571428571429
01/03/2017 00:00:00 95.38709677419354838709677419354838709677
01/04/2017 00:00:00 94.9
01/05/2017 00:00:00 95.32258064516129032258064516129032258065
01/06/2017 00:00:00 96.46666666666666666666666666666666666667
01/07/2017 00:00:00 96.16129032258064516129032258064516129032
01/08/2017 00:00:00 96.16129032258064516129032258064516129032
01/09/2017 00:00:00 96.13333333333333333333333333333333333333
01/10/2017 00:00:00 95.87096774193548387096774193548387096774
01/11/2017 00:00:00 97.3
01/12/2017 00:00:00 96.90322580645161290322580645161290322581
01/01/2018 00:00:00 96.43478260869565217391304347826086956522
I am probably missing something silly, but can someone explain this behaviour to me?
Query to generate the test table:
CREATE TABLE my_data
AS
SELECT TRUNC (SYSDATE - ROWNUM/24, 'HH') as the_date, ROUND(DBMS_RANDOM.value(0,100),0) AS value
FROM DUAL
CONNECT BY ROWNUM < 366*24
;
Answer 0: (score: 1)
This looks like bug 20537092; it is reproducible in 12.1.0.2 (with either the CTE or an inline view), but not in 11.2.0.4 or 12.2.0.1.
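Since the behaviour depends on the release, it helps to confirm which version a session is actually connected to. A minimal check, assuming the account is allowed to query the V$VERSION view:

```sql
-- Show the database release banner for the current connection.
-- Assumes SELECT access to V$VERSION.
SELECT banner
FROM v$version;
```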
The workaround in that document appears to resolve the problem; running your example after applying the following setting gives sensible results in a 12.1 session where it previously did not:
alter session set "_optimizer_aggr_groupby_elim"=false;
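One way to compare a session before and after the setting is to look at the execution plan of the problem query. The sketch below uses EXPLAIN PLAN and DBMS_XPLAN.DISPLAY against the my_data test table from the question. Reading a plan with only one group-by step (instead of two) as a sign that one level of aggregation was eliminated is an assumption here, not something confirmed by the bug note.

```sql
-- Sketch: capture the plan for the nested group-by query.
EXPLAIN PLAN FOR
WITH daily_max AS
(
SELECT TRUNC(the_date, 'DD') AS my_day, MAX(value) AS value
FROM my_data
GROUP BY TRUNC(the_date, 'DD')
)
SELECT TRUNC(my_day, 'MM'), AVG(value)
FROM daily_max
GROUP BY TRUNC(my_day, 'MM')
ORDER BY 1;

-- Display the plan just captured; compare the number of
-- group-by operations with and without the session setting.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The plan shown by EXPLAIN PLAN may differ from the one actually used at run time, so treat it as a hint rather than proof.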
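It may be more practical to rewrite the query to avoid the nested group-by, depending on how complicated your real situation is, and on whether you are able to change the relevant session or database initialisation settings, or to patch.
For your (presumably simplified) example, in a new session with no workaround applied, replacing the inner aggregate/group-by with a distinct and an analytic version seems to work; it is a bit ugly, though, and may not be practical for your real situation: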
TRUNC(MY_DAY,'MM') AVG(VALUE)
------------------- ----------
2017-01-01 00:00:00 95.5
2017-02-01 00:00:00 95.6428571
2017-03-01 00:00:00 95.3225806
2017-04-01 00:00:00 95.6666667
2017-05-01 00:00:00 97.0322581
2017-06-01 00:00:00 95.7
2017-07-01 00:00:00 95.0967742
2017-08-01 00:00:00 96.1935484
2017-09-01 00:00:00 94.9333333
2017-10-01 00:00:00 96
2017-11-01 00:00:00 96.9333333
2017-12-01 00:00:00 95.3870968
2018-01-01 00:00:00 95.0434783
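The rewritten query itself is not shown with the output above. A minimal sketch of what a "distinct plus analytic" rewrite could look like, assuming the same my_data table and column names as in the question (this is an illustration, not necessarily the exact query that produced those results):

```sql
WITH daily_max AS
(
-- The analytic MAX yields the daily maximum on every hourly row;
-- DISTINCT then collapses those rows to one per day,
-- so the inner block contains no GROUP BY.
SELECT DISTINCT TRUNC(the_date, 'DD') AS my_day,
       MAX(value) OVER (PARTITION BY TRUNC(the_date, 'DD')) AS value
FROM my_data
)
SELECT TRUNC(my_day, 'MM'), AVG(value)
FROM daily_max
GROUP BY TRUNC(my_day, 'MM')
ORDER BY 1;
```

Only the outer block aggregates with GROUP BY, so the query no longer has the nested group-by shape that the answer suggests avoiding.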
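As always, just because it looks like this bug does not mean it necessarily is; you may need to raise a service request to get confirmation, particularly before patching.
Answer 1: (score: -1)
I cannot explain the behaviour you are seeing. Without the CTE, you could try writing the logic differently: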
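-- my_data has no my_day column, so group on the_date directly.
-- Note: SUM(value) adds up every hourly value, so this ratio is
-- the monthly total per distinct day, not the mean of the daily maxima.
SELECT TRUNC(the_date, 'MM'),
SUM(value) / COUNT(DISTINCT TRUNC(the_date, 'DD'))
FROM my_data
GROUP BY TRUNC(the_date, 'MM')
ORDER BY 1;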
Answer 2: (score: -1)
Perhaps trunc() does not return a date...
WITH daily_max AS
(
SELECT to_date(TRUNC(the_date, 'DD')) as my_day, MAX(value) AS value
FROM jfl_test
group by TRUNC(the_date, 'DD')
)
SELECT trunc(my_day, 'MM'), AVG(value)
FROM daily_max
GROUP BY trunc(my_day, 'MM')
order by 1
;