Consider the following MySQL table schema:
id int,
amount decimal,
transaction_no,
location_id int,
created_at datetime
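The schema above stores POS receipts for restaurants. I need a report of the daily number of receipts and their total amount per location. I tried the following query: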
SELECT location_id,count(distinct(transaction_no)) as count,sum(amount) as receipt_amount FROM `receipts` WHERE date(`receipts`.`created_at`) = '2015-05-17' GROUP BY `receipts`.`location_id`
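The problem is that a receipt with the same transaction number can be repeated several times, and the amount may or may not differ each time. The business rule for handling this is that the last receipt we received is the latest one. So the query above does not work.
What I am trying to do is the following:
[Edit]
Here is the query plan: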
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 25814155
filtered: 100.00
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: r
type: ref
possible_keys: punchh_key_location_id_created_at
key: punchh_key_location_id_created_at
key_len: 50
ref: t.punchh_key
rows: 1
filtered: 100.00
Extra: Using index condition; Using where
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: r
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 25814155
filtered: 100.00
Extra: Using temporary; Using filesort
3 rows in set, 1 warning (0.00 sec)
Answer 0 (score: 1)
You can also use the distinct modifier with sum:
SELECT location_id,
COUNT(DISTINCT transaction_no) AS cnt,
SUM(DISTINCT amount) AS receipt_amount
FROM `receipts`
WHERE DATE(`receipts`.`created_at`) = '2015-05-17'
GROUP BY `receipts`.`location_id`
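Answer 1 (score: 1)
You can count only the amount from the last created_at value of each transaction_no within the day by joining to an inline view that determines the last created_at for each transaction_no on that day.
This avoids simply using sum(distinct ...), which would count two different transactions with the same amount (if any exist) only once.
The approach below should avoid that problem.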
select r.location_id,
count(*) as num_transactions,
sum(r.amount) as receipt_amount
from receipts r
join (
select transaction_no,
max(created_at) as last_created_at_for_trans
from receipts
where created_at like '2015-05-17%'
group by transaction_no
) v
on r.transaction_no = v.transaction_no
and r.created_at = v.last_created_at_for_trans
where r.created_at like '2015-05-17%'
group by r.location_id
Another approach is to use not exists; you may want to test which one performs better:
select r.location_id,
count(*) as num_transactions,
sum(r.amount) as receipt_amount
from receipts r
where r.created_at like '2015-05-17%'
and not exists ( select 1
from receipts x
where x.transaction_no = r.transaction_no
and x.created_at > r.created_at
)
group by r.location_id
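To illustrate the point about sum(distinct ...) above, here is a minimal sketch that runs both queries against a tiny in-memory table. It assumes Python's sqlite3 module as a stand-in for MySQL, integer amounts, text timestamps, and four made-up sample rows; none of that comes from the question. Transaction T1 is sent twice (10, then corrected to 12), and T2 happens to have the same amount as the corrected T1.

```python
import sqlite3

# In-memory SQLite table standing in for the MySQL `receipts` table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE receipts (
        id INTEGER PRIMARY KEY,
        amount INTEGER,
        transaction_no TEXT,
        location_id INTEGER,
        created_at TEXT
    )
""")
# Hypothetical sample data: T1 is repeated with a corrected amount.
conn.executemany(
    "INSERT INTO receipts (amount, transaction_no, location_id, created_at) "
    "VALUES (?, ?, ?, ?)",
    [
        (10, "T1", 1, "2015-05-17 12:00:00"),
        (12, "T1", 1, "2015-05-17 12:05:00"),
        (12, "T2", 1, "2015-05-17 13:00:00"),
        (5, "T3", 1, "2015-05-17 14:00:00"),
    ],
)

# SUM(DISTINCT amount) adds each distinct amount once: 10 + 12 + 5.
sum_distinct = conn.execute("""
    SELECT location_id, COUNT(DISTINCT transaction_no), SUM(DISTINCT amount)
    FROM receipts
    WHERE created_at LIKE '2015-05-17%'
    GROUP BY location_id
""").fetchall()

# Join to the last created_at per transaction_no: 12 (T1) + 12 (T2) + 5 (T3).
latest_only = conn.execute("""
    SELECT r.location_id, COUNT(*), SUM(r.amount)
    FROM receipts r
    JOIN (
        SELECT transaction_no, MAX(created_at) AS last_created_at_for_trans
        FROM receipts
        WHERE created_at LIKE '2015-05-17%'
        GROUP BY transaction_no
    ) v
      ON r.transaction_no = v.transaction_no
     AND r.created_at = v.last_created_at_for_trans
    WHERE r.created_at LIKE '2015-05-17%'
    GROUP BY r.location_id
""").fetchall()

print(sum_distinct)  # [(1, 3, 27)]
print(latest_only)   # [(1, 3, 29)]
```

With this data the expected total is 29, so sum(distinct ...) undercounts because the outdated 10 is included and the two 12s collapse into one.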
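Answer 2 (score: 1)
How do you want to count transactions that are repeated across multiple days?
I think you do not actually want to count a transaction just because it is the last one for that day, if there is another receipt for it the next day. You can get the final record for each transaction in several ways. One typical approach uses group by (this is similar to Brian's query, but subtly different):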
select r.*
from receipts r join
(select transaction_no, max(created_at) as maxca
from receipts r
group by transaction_no
) t
on r.transaction_no = t.transaction_no and r.created_at = t.maxca;
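As a rough illustration of how this differs from restricting the subquery to a single day, the sketch below runs the query above on made-up data in which transaction T1 gets a second receipt on the following day. It assumes Python's sqlite3 module in place of MySQL, integer amounts, and text timestamps; the rows are hypothetical.

```python
import sqlite3

# In-memory SQLite table standing in for the MySQL `receipts` table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE receipts (
        id INTEGER PRIMARY KEY,
        amount INTEGER,
        transaction_no TEXT,
        location_id INTEGER,
        created_at TEXT
    )
""")
# Hypothetical sample data: T1 is re-sent on the next day with a new amount.
conn.executemany(
    "INSERT INTO receipts (amount, transaction_no, location_id, created_at) "
    "VALUES (?, ?, ?, ?)",
    [
        (10, "T1", 1, "2015-05-17 12:00:00"),
        (15, "T1", 1, "2015-05-18 09:00:00"),
        (7, "T2", 1, "2015-05-17 13:00:00"),
    ],
)

# Final record per transaction, taken over all days (no date filter in the subquery).
final_rows = conn.execute("""
    SELECT r.transaction_no, r.amount, r.created_at
    FROM receipts r
    JOIN (
        SELECT transaction_no, MAX(created_at) AS maxca
        FROM receipts
        GROUP BY transaction_no
    ) t
      ON r.transaction_no = t.transaction_no
     AND r.created_at = t.maxca
    ORDER BY r.transaction_no
""").fetchall()

# Keep only the final records that fall on 2015-05-17.
on_may_17 = [row for row in final_rows if row[2].startswith("2015-05-17")]

print(final_rows)  # [('T1', 15, '2015-05-18 09:00:00'), ('T2', 7, '2015-05-17 13:00:00')]
print(on_may_17)   # [('T2', 7, '2015-05-17 13:00:00')]
```

Here T1 is attributed to 2015-05-18 only, so a report for 2015-05-17 counts just T2. A subquery limited to 2015-05-17 would also count T1's older receipt on that day.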
The full query is:
select location_id, count(*) as numtransactions, sum(amount) as receipt_amount
from receipts r join
(select transaction_no, max(created_at) as maxca
from receipts r
group by transaction_no
) t
on r.transaction_no = t.transaction_no and r.created_at = t.maxca
where r.created_at >= date('2015-05-17') and r.created_at < date('2015-05-18')
group by location_id;
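Note the date comparisons.
The original form, date(r.created_at) = '2015-05-17', is logically correct. However, wrapping the column in date() means that an index cannot be used. The form with two comparisons against constants allows the query to take advantage of an index on receipts(created_at).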
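The general effect can be sketched with SQLite's EXPLAIN QUERY PLAN, used here only because it runs from Python's standard library. This is an assumption-laden stand-in: SQLite's planner is not MySQL's, the index name idx_receipts_created_at is hypothetical, and in MySQL you would inspect the plan with EXPLAIN instead.

```python
import sqlite3

# Empty in-memory table with an index on created_at (index name is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE receipts (
        id INTEGER PRIMARY KEY,
        amount INTEGER,
        transaction_no TEXT,
        location_id INTEGER,
        created_at TEXT
    )
""")
conn.execute("CREATE INDEX idx_receipts_created_at ON receipts (created_at)")


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


# Column wrapped in a function: the planner falls back to scanning the table.
wrapped_plan = plan(
    "SELECT * FROM receipts WHERE date(created_at) = '2015-05-17'"
)

# Half-open range on the bare column: the planner can search the index.
range_plan = plan(
    "SELECT * FROM receipts "
    "WHERE created_at >= '2015-05-17' AND created_at < '2015-05-18'"
)

print(wrapped_plan)
print(range_plan)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a scan and the second a search using the index.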
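Using like on dates is discouraged. It requires implicitly converting the date to a string and then comparing it as a string. The conversion is unnecessary, and in some databases the semantics depend on globalization settings.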