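I have been working on a query against our company's very large database to pull the maximum billing amounts (in this case, A and B) for our customers. We want to pull back the highest A/B for each customer over the past month and the highest A/B over the past year.
One problem we have noticed in the billing database is the way it stores "cancelled" bills. It does so by adding a second, negated version of the first billing record to the billing table. Like this:
In this case, 41040 is the incorrect bill, so a negated version of the record was added. However, when I select the max on this column I still get back 41040 rather than the correct billing value of 50. The table does not appear to flag these incorrect bills in any way that would let me filter them out.
My current solution is to take the row with the max of the ID column as the correct bill. This assumes that the last bill entered for a month is the correct one.
This seems to bring back the correct data, but the query runs extremely slowly on the large data set, and I do not have write access to this table to add or view indexes. With 98,007,807 rows in total and 1,596,491 distinct customers, is there a way to optimize the query to improve performance?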
select mth.KY_CUSTOMER_NO, max(QY_MTH_BILLED_A) as QY_MTH_BILLED_A, max(QY_MTH_B) as QY_MTH_BILLING_B, max.MAX_BILLING_A, max.MAX_BILLING_B
from (
    -- Get the max A/B values for the past month
    select m.*
    from CUSTOMER_USAGE m
    where rev_year = to_number(to_char(sysdate,'yyyy'))
    and rev_mth in (to_number(to_char(add_months(sysdate, -1), 'mm')), to_number(to_char(sysdate,'mm')))
    and ID in (select max(ID) from CUSTOMER_USAGE where KY_CUSTOMER_NO = m.KY_CUSTOMER_NO group by rev_mth, rev_year)
) mth join
(
    -- Get the max A/B values for the past year
    select KY_CUSTOMER_NO, max(QY_MTH_B) as MAX_BILLING_B, max(QY_MTH_BILLED_A) as MAX_BILLING_A
    from CUSTOMER_USAGE m
    where DT_ADDED > current_timestamp - 365
    and ID in (select max(ID) from CUSTOMER_USAGE where KY_CUSTOMER_NO = m.KY_CUSTOMER_NO group by rev_mth, rev_year)
    group by KY_CUSTOMER_NO
) max on mth.KY_CUSTOMER_NO = max.KY_CUSTOMER_NO
group by mth.KY_CUSTOMER_NO, max.MAX_BILLING_A, max.MAX_BILLING_B
Answer 0: (score: 1)
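Analytic functions appear to be the solution.
I have left out the WHERE clauses because your sample data did not need them, but you should be able to add them back into the innermost inline view. You could also use EXTRACT( YEAR FROM SYSDATE ) rather than converting to and from strings.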
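As a rough sketch of the EXTRACT suggestion, the month filter from the question could be written without any string conversion. This is only an illustration against the sample table; the columns rev_year and rev_mth are taken from the question, and the filter is assumed to go in the innermost inline view.

```sql
-- Sketch: same filter as the question, using EXTRACT instead of TO_CHAR/TO_NUMBER.
SELECT c.*
FROM   customer_usage c
WHERE  c.rev_year = EXTRACT( YEAR FROM SYSDATE )
AND    c.rev_mth IN ( EXTRACT( MONTH FROM ADD_MONTHS( SYSDATE, -1 ) ),
                      EXTRACT( MONTH FROM SYSDATE ) );
```

Like the original filter, this compares against the current year only, so a run in January would not match rows from the previous December.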
Oracle Setup:
CREATE TABLE customer_usage ( id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a, qy_mth_billed_b ) AS
SELECT 1, 1, 1, 2016, 41040, 0 FROM DUAL UNION ALL
SELECT 2, 1, 1, 2016, -41040, 0 FROM DUAL UNION ALL
SELECT 3, 1, 1, 2016, 50, 0 FROM DUAL UNION ALL
SELECT 4, 1, 1, 2016, 0, 0 FROM DUAL;
Query:
SELECT id,
ky_customer_no,
rev_mth,
rev_year,
qy_mth_billed_a,
qy_mth_billed_b
FROM (
SELECT c.*,
ROW_NUMBER()
OVER ( PARTITION BY ky_customer_no, rev_year, rev_mth
ORDER BY total_mth_billed_a DESC ) AS rn
FROM (
SELECT c.*,
SUM( qy_mth_billed_a )
OVER ( PARTITION BY ky_customer_no, rev_year, rev_mth, ABS( qy_mth_billed_a )
ORDER BY id DESC ) AS total_mth_billed_a
FROM customer_usage c
) c
)
WHERE rn = 1;
Output:
ID KY_CUSTOMER_NO REV_MTH REV_YEAR QY_MTH_BILLED_A QY_MTH_BILLED_B
---------- -------------- ---------- ---------- --------------- ---------------
3 1 1 2016 50 0
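To see why the row with id 3 wins, it may help to look at the running total on its own. The sketch below runs only the innermost analytic function against the four sample rows; the ORDER BY id at the end is added purely for readable output.

```sql
-- Sketch: inspect the running total per (customer, year, month, ABS(amount)) group.
SELECT id,
       qy_mth_billed_a,
       SUM( qy_mth_billed_a )
         OVER ( PARTITION BY ky_customer_no, rev_year, rev_mth, ABS( qy_mth_billed_a )
                ORDER BY id DESC ) AS total_mth_billed_a
FROM   customer_usage
ORDER BY id;

-- With the sample data, total_mth_billed_a should come out as:
--   id 1 ( 41040) ->      0   (the later -41040 row cancels it)
--   id 2 (-41040) -> -41040
--   id 3 (    50) ->     50
--   id 4 (     0) ->      0
```

Because the cancelled bill and its negated copy share the same ABS value, the original bill's running total drops to 0, so ROW_NUMBER ordered by that total descending places the 50 row first.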
Answer 1: (score: 0)
I tried a different approach, but used most of @MT0's setup.
CREATE TABLE customer_usage ( id, ky_customer_no, rev_mth, rev_year, qy_mth_billed_a, qy_mth_billed_b ) AS
SELECT 1, 1, 1, 2016, 41040, 0 FROM DUAL UNION ALL
SELECT 2, 1, 1, 2016, -41040, 0 FROM DUAL UNION ALL
SELECT 3, 1, 1, 2016, 50, 0 FROM DUAL UNION ALL
SELECT 4, 1, 1, 2016, 0, 0 FROM DUAL;
Since we want to get rid of the values whose ABS() is equal but whose signs differ, I tried this:
SELECT c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR, max(qy_mth_billed_a) as qy_mth_billed_a , max(QY_MTH_BILLED_B) as qy_mth_billed_b
FROM (
SELECT c.*,
max( qy_mth_billed_a )
OVER ( PARTITION BY ky_customer_no, rev_year, rev_mth,ABS( qy_mth_billed_a )) AS max_mth_billed_a,
min( qy_mth_billed_a )
OVER ( PARTITION BY ky_customer_no, rev_year, rev_mth,ABS( qy_mth_billed_a )) AS min_mth_billed_a
FROM customer_usage c
) c where max_mth_billed_a+min_mth_billed_a!=0
group by c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR;
The output is the same; since you are having performance issues, I would try both approaches:
KY_CUSTOMER_NO    REV_MTH   REV_YEAR QY_MTH_BILLED_A QY_MTH_BILLED_B
-------------- ---------- ---------- --------------- ---------------
             1          1       2016              50               0
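Edit: Actually, if you count the sign values for each abs(value) and keep only the groups where that count is odd, I think it will work faster (only one window function is needed).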
SELECT c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR, max(qy_mth_billed_a) as qy_mth_billed_a , max(QY_MTH_BILLED_B) as qy_mth_billed_b
FROM (
SELECT c.*,
count(sign( qy_mth_billed_a ))
OVER ( PARTITION BY ky_customer_no, rev_year, rev_mth,ABS( qy_mth_billed_a )) AS signo
FROM customer_usage c
) c where mod(signo,2) =1
group by c.KY_CUSTOMER_NO, c.REV_MTH, c.REV_YEAR
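As a quick check of the odd-count idea, the window function can be run on its own against the sample rows. This is a sketch only; it assumes a cancelled bill is always paired with exactly one negated record and that no two legitimate bills in the same month share the same absolute amount.

```sql
-- Sketch: inspect the per-group count used by the mod(signo, 2) = 1 filter.
-- COUNT( expr ) counts non-null rows, so signo is the number of rows in each
-- (customer, year, month, ABS(amount)) group rather than a count of distinct signs.
SELECT id,
       qy_mth_billed_a,
       COUNT( SIGN( qy_mth_billed_a ) )
         OVER ( PARTITION BY ky_customer_no, rev_year, rev_mth, ABS( qy_mth_billed_a ) ) AS signo
FROM   customer_usage
ORDER BY id;

-- With the sample data, signo should come out as:
--   ids 1 and 2 (41040 / -41040) -> 2   (even, filtered out)
--   id 3 (50)                    -> 1   (odd, kept)
--   id 4 (0)                     -> 1   (odd, kept)
```

The 0 row is kept by this filter, but it does not affect the result here because MAX over the remaining rows still returns 50.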