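I'm trying to write a query that finds all users who have had no transactions in a specified time period. My problem is that the query behaves as if it were caught in a nested loop. I'm trying to figure out where my logic is flawed.
I can't post the actual query because it's for work, but here is an example with a similar structure. (Yes, the balance/transaction data is spread across two pairs of tables... this is what I have to work with.)
Given the schema: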
Users              Balances_A           Transactions_A
user_id            account_id <-\       transaction_id
ssn  <------+--    ssn           \--    account_id
occupation  |      balance              amount
name        |      type                 trdate
address     |      department
            |
            |
            |      Balances_B           Transactions_B
            |      account_id <-\       transaction_id
            +--    ssn           \--    account_id
                   balance              amount
                   code (*)             trdate
                   department
* same as type, just different field name.
Note: each "<---" indicates a 1 to many relationship
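Task: find all users who have an account with type = 'A' and department = '1' whose current balance is 0.00 and which has had a type 'A' transaction within the last year. I also need to know the balances for all other types, excluding types 'X' and 'Y'.
Parameters: department = '1', type = 'A', transaction date < one year ago, balance = 0.00
Here's what I tried: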
SELECT
    u.user_id, u.name, u.address, u.ssn,
    account_balances_a.other_balance,
    account_balances_b.other_balance,
    last_transaction_a.last_transaction_date,
    last_transaction_b.last_transaction_date
FROM users AS u
-- attach other balance total from A
LEFT JOIN ( SELECT bal_a.ssn, SUM(bal_a.balance) AS other_balance
            FROM balances_a AS bal_a
            WHERE bal_a.type NOT IN ('A','X','Y') AND bal_a.department='1'
            GROUP BY bal_a.ssn
          ) AS account_balances_a
       ON u.ssn = account_balances_a.ssn
-- attach other balance total from B
LEFT JOIN ( SELECT bal_b.ssn, SUM(bal_b.balance) AS other_balance
            FROM balances_b AS bal_b
            WHERE bal_b.code NOT IN ('A','X','Y') AND bal_b.department='1'
            GROUP BY bal_b.ssn
          ) AS account_balances_b
       ON u.ssn = account_balances_b.ssn
-- regular join balance A table
, balances_a AS ba
-- attach last transaction date ( transactions A )
LEFT JOIN ( SELECT temp1.account_id, MAX(temp1.trdate) AS last_transaction_date
            FROM transactions_a AS temp1
            GROUP BY temp1.account_id
          ) AS last_transaction_a
       ON last_transaction_a.account_id = ba.account_id
-- regular join balance B table
, balances_b AS bb
-- attach last transaction date ( transactions B )
LEFT JOIN ( SELECT temp2.account_id, MAX(temp2.trdate) AS last_transaction_date
            FROM transactions_b AS temp2
            GROUP BY temp2.account_id
          ) AS last_transaction_b
       ON last_transaction_b.account_id = bb.account_id
WHERE
    u.occupation='ditch digger'
    -- user has an account type 'A' with department '1' in the specified time frame:
    AND (
        -- either in Balance A table,
        ( u.ssn=ba.ssn AND ba.balance=0.00 AND ba.type='A' AND ba.department='1' AND last_transaction_a.last_transaction_date>'$one_year-ago' )
        OR
        -- or in Balance B table
        ( u.ssn=bb.ssn AND bb.balance=0.00 AND bb.code='A' AND bb.department='1' AND last_transaction_b.last_transaction_date>'$one_year-ago' )
    )
ORDER BY last_transaction_a.last_transaction_date
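The problem seems to be in the WHERE clause; if I comment out either the "in Balance A table" branch or the "in Balance B table" branch, the query works. With both in place, though, it tries to sort millions of records.
Having written this out, I think I understand why it fails; but if you would take the time to think it through with me and can explain clearly why it fails, I'd appreciate it.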
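Answer 0 (score: 1)
Because balances and transactions have to be joined to each other first, before they are joined to users. Otherwise you end up doing two cross joins, which perform very poorly: every row of one table is paired with every row of the others, and the OR in your WHERE clause means neither join is restricted.
Try modifying your query like this: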
SELECT
    u.user_id, u.name, u.address, u.ssn,
    account_balances_a.other_balance,
    account_balances_b.other_balance,
    last_transaction_a.last_transaction_date,
    last_transaction_b.last_transaction_date
FROM users AS u
-- other balance total from A, correlated on the current user's ssn
LEFT OUTER JOIN LATERAL
(
    SELECT SUM(bal_a.balance) as other_balance FROM balances_a as bal_a
    WHERE bal_a.department='1' and u.ssn = bal_a.ssn and bal_a.type NOT IN ('A','X','Y')
) account_balances_a on 1=1
-- other balance total from B (this table calls the type column "code")
LEFT OUTER JOIN LATERAL
(
    SELECT SUM(bal_b.balance) as other_balance FROM balances_b as bal_b
    WHERE bal_b.department='1' and u.ssn = bal_b.ssn and bal_b.code NOT IN ('A','X','Y')
) account_balances_b on 1=1
-- last transaction date on the user's zero-balance type 'A' accounts in A
LEFT OUTER JOIN LATERAL
(
    SELECT MAX(temp1.trdate) as last_transaction_date
    FROM transactions_a as temp1 inner join balances_a ba on temp1.account_id = ba.account_id
    WHERE u.ssn = ba.ssn and ba.type='A' and ba.balance=0.00 and ba.department='1'
) last_transaction_a on last_transaction_a.last_transaction_date>current date - 1 year
-- last transaction date on the user's zero-balance type 'A' accounts in B
LEFT OUTER JOIN LATERAL
(
    SELECT MAX(temp2.trdate) as last_transaction_date
    FROM transactions_b as temp2 inner join balances_b bb on temp2.account_id = bb.account_id
    WHERE u.ssn=bb.ssn AND bb.code='A' AND bb.balance=0.00 AND bb.department='1'
) last_transaction_b on last_transaction_b.last_transaction_date>current date - 1 year
WHERE u.occupation='ditch digger'
AND (last_transaction_a.last_transaction_date is not null or last_transaction_b.last_transaction_date is not null)
ORDER BY last_transaction_a.last_transaction_date
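Answer 1 (score: 0)
Thanks to Esperento57's answer, I went back and read up on cross joins. I had forgotten that a query over multiple comma-separated tables starts out as a cross join; I was essentially cross joining 3 tables in the FROM clause. (At least, that was my intent.) It was then up to the WHERE clause to join them properly.
...and obviously it didn't.
In my mind, all the tables were tied together by users.ssn. So it would iterate through users (with the various filters on balances a & b tied to each one), and everything should have been fine.
[eureka moment]
...then it iterates through balances_a, and everything goes very wrong. The WHERE clause doesn't come anywhere close to joining the tables the way I had imagined. The OR results in a cross join between balances_a and users.
As if that weren't bad enough, it then starts the whole thing over with balances_b.
This led me to the troubleshooting concept I was looking for. Whether or not this is how the database actually works, you can think of each comma-separated table as iterating over all of its rows (i.e. a cross join). The WHERE clause has to hold for every iteration of every comma-separated table.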
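To make that mental model concrete, here is a minimal sketch on invented data. The table names mirror the question, but the columns are trimmed down and every row is hypothetical. It counts how many of the 12 cross-joined combinations survive a WHERE clause shaped like the one in the question.

```sql
-- Hypothetical miniature tables; names mirror the question, all rows are invented.
CREATE TABLE users      (ssn VARCHAR(11), name VARCHAR(20));
CREATE TABLE balances_a (account_id INT, ssn VARCHAR(11), balance DECIMAL(10,2));
CREATE TABLE balances_b (account_id INT, ssn VARCHAR(11), balance DECIMAL(10,2));

INSERT INTO users      VALUES ('111', 'Ann'), ('222', 'Bob');
INSERT INTO balances_a VALUES (1, '111', 0.00), (2, '222', 5.00);
INSERT INTO balances_b VALUES (10, '111', 0.00), (11, '222', 0.00), (12, '222', 7.00);

-- Comma-separated tables: 2 * 2 * 3 = 12 candidate rows before WHERE is applied.
-- The OR keeps a row when EITHER branch is true, so the table that appears only
-- in the other branch is left unconstrained and all of its rows come along.
SELECT COUNT(*) AS rows_with_or
FROM users AS u, balances_a AS ba, balances_b AS bb
WHERE (u.ssn = ba.ssn AND ba.balance = 0.00)
   OR (u.ssn = bb.ssn AND bb.balance = 0.00);
-- rows_with_or = 6
```

Only three user/account pairs actually match (Ann in balances_a, Ann and Bob in balances_b), yet six rows come back: each match in one table is repeated once for every row of the other table. On real tables that multiplication is what produces the millions of rows being sorted.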
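Since this query was such a dismal failure, I started over and found that it works much better to do a union on the (filtered) balances and then left join users along with the sum(balance) and max(date).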
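I can't post the real rewrite either, so the following is only a sketch of that shape under some assumptions: the tables follow the schema from the question, the sample rows are invented, and the literal cutoff date is a hypothetical stand-in for "one year ago". The important property is that each derived table is reduced to one row per ssn before it touches users, so no join can multiply rows.

```sql
-- Hypothetical sample data; table and column names follow the question's schema.
CREATE TABLE users (user_id INT, ssn VARCHAR(11), occupation VARCHAR(30), name VARCHAR(30));
CREATE TABLE balances_a (account_id INT, ssn VARCHAR(11), balance DECIMAL(10,2), type CHAR(1), department CHAR(1));
CREATE TABLE balances_b (account_id INT, ssn VARCHAR(11), balance DECIMAL(10,2), code CHAR(1), department CHAR(1));
CREATE TABLE transactions_a (transaction_id INT, account_id INT, amount DECIMAL(10,2), trdate DATE);
CREATE TABLE transactions_b (transaction_id INT, account_id INT, amount DECIMAL(10,2), trdate DATE);

INSERT INTO users VALUES (1, '111', 'ditch digger', 'Ann'), (2, '222', 'ditch digger', 'Bob');
INSERT INTO balances_a VALUES (1, '111', 0.00, 'A', '1'), (2, '111', 40.00, 'C', '1');
INSERT INTO balances_b VALUES (10, '222', 0.00, 'A', '1'), (11, '222', 15.00, 'D', '1');
INSERT INTO transactions_a VALUES (100, 1, -20.00, '2024-06-01');
INSERT INTO transactions_b VALUES (200, 10, -5.00, '2024-07-15');

SELECT u.user_id, u.name, u.ssn,
       acct.last_transaction_date,
       other_bal.other_balance
FROM (
    -- Step 1: union the filtered type 'A' accounts from both table pairs,
    -- then collapse to one row per ssn with the latest transaction date.
    SELECT a.ssn, MAX(a.trdate) AS last_transaction_date
    FROM (
        SELECT ba.ssn, ta.trdate
        FROM balances_a AS ba
        JOIN transactions_a AS ta ON ta.account_id = ba.account_id
        WHERE ba.type = 'A' AND ba.department = '1' AND ba.balance = 0.00
        UNION ALL
        SELECT bb.ssn, tb.trdate
        FROM balances_b AS bb
        JOIN transactions_b AS tb ON tb.account_id = bb.account_id
        WHERE bb.code = 'A' AND bb.department = '1' AND bb.balance = 0.00
    ) AS a
    GROUP BY a.ssn
) AS acct
JOIN users AS u ON u.ssn = acct.ssn
-- Step 2: the "other" balances, unioned and summed to one row per ssn as well.
LEFT JOIN (
    SELECT o.ssn, SUM(o.balance) AS other_balance
    FROM (
        SELECT ssn, balance FROM balances_a
        WHERE type NOT IN ('A','X','Y') AND department = '1'
        UNION ALL
        SELECT ssn, balance FROM balances_b
        WHERE code NOT IN ('A','X','Y') AND department = '1'
    ) AS o
    GROUP BY o.ssn
) AS other_bal ON other_bal.ssn = u.ssn
WHERE u.occupation = 'ditch digger'
  AND acct.last_transaction_date > '2024-01-01'  -- hypothetical cutoff standing in for "one year ago"
ORDER BY acct.last_transaction_date;
```

Because the A and B sides are merged with UNION ALL instead of being tested in two OR branches, there is no longer any table left unconstrained in the FROM clause.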