I'm trying to find the top 5 best-selling books.
My idea for computing the top 5 best sellers is this:
percentage = number_of_SUCCESS_transactions_each_book / total_number_transactions_each_book
Fetch the result(book_id, percentage) sorted in DESC order, with a LIMIT of 5
Here is a simplified representation of the tables, with sample data to make the problem easier to follow:
tblPayments
-----------
trans_id | book_id | payment_status | purchase_date
---------------------------------------------------
1 | 233 | SUCCESS | 2017-04-05
2 | 145 | FAILED | 2017-04-10
3 | 233 | FAILED | 2017-04-05
4 | 233 | SUCCESS | 2017-04-05
tblBooks
--------
book_id | book_name
-------------------
233 | My Autobiography
145 | How to learn English
201 | Finding Nemo
I will query the top 5 best-selling books between specific dates, for example from 2017-04-01 to 2017-04-25.
The output I expect looks like this:
book_id | book_name | percentage
----------------------------------
233 | My Autobiography | 67
145 | How to learn English | 0
201 | Finding Nemo | 0
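As a sanity check on the numbers above, here is a minimal Python sketch that applies the percentage formula to the four sample rows. The function name `success_percentages` and the rounding to whole percent are assumptions made for illustration; they are not part of the schema.

```python
from collections import defaultdict

# Sample rows from tblPayments: (trans_id, book_id, payment_status, purchase_date)
payments = [
    (1, 233, "SUCCESS", "2017-04-05"),
    (2, 145, "FAILED", "2017-04-10"),
    (3, 233, "FAILED", "2017-04-05"),
    (4, 233, "SUCCESS", "2017-04-05"),
]


def success_percentages(rows, start, end):
    """Return {book_id: rounded success percentage} for rows dated within [start, end]."""
    success = defaultdict(int)
    total = defaultdict(int)
    for _trans_id, book_id, status, date in rows:
        if start <= date <= end:  # ISO-formatted dates compare correctly as strings
            total[book_id] += 1
            if status == "SUCCESS":
                success[book_id] += 1
    return {book: round(100 * success[book] / total[book]) for book in total}


percentages = success_percentages(payments, "2017-04-01", "2017-04-25")
print(percentages)
```

Book 233 has 2 successful transactions out of 3, which rounds to 67. Book 201 has no rows in tblPayments at all, so it cannot come out of the payments data alone; its 0 in the expected output would have to be filled in from tblBooks.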
After a few hours of brainstorming, this is the query we are considering:
SELECT b.`book_id`, (
(
( SELECT COUNT(*) FROM `tblPayments` WHERE `book_id` = b.`book_id` AND `payment_status` = 'SUCCESS' ) /
( SELECT COUNT(*) FROM `tblPayments` WHERE `book_id` = b.`book_id` )
) * 100.0 ) AS `percentage`
FROM `tblPayments` AS b
WHERE b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25'
GROUP BY b.`book_id`
ORDER BY `percentage` DESC LIMIT 5
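Can it be improved further? Will it cause any performance problems on the database?
I'm on a train heading home right now, so I'm writing this from a tablet. I can test it when I get home in about 6 hours, so I thought I'd ask here in the meantime.
Or do you have any suggestions for a better approach?
Thanks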
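Edit
Thanks to @Strawberry and @Stefano Zanini for the answers.
One more question: is it OK to JOIN tblBooks only to get the book_name field into the result set?
I mean, the tblPayments table is expected to have a lot of rows. Would a JOIN still be fine? Or should I fetch these 5 rows in PHP and run another query to get the book_name for each of the 5 books? Which approach is more efficient?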
Answer 0 (score: 0)
atexit()
I left out the trivial parts.
Answer 1 (score: 0)
You can improve that query by replacing the inner queries used for the percentage with a conditional sum:
SELECT b.`book_id`,
       -- ELSE 0 keeps the result at 0 instead of NULL when a book has no SUCCESS rows
       SUM(CASE WHEN b.`payment_status` = 'SUCCESS' THEN 1 ELSE 0 END) /
       COUNT(*) * 100.0 AS `percentage`
FROM `tblPayments` AS b
WHERE b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25'
GROUP BY b.`book_id`
ORDER BY `percentage` DESC
LIMIT 5
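To see the conditional sum at work on the sample data from the question, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is only a stand-in for MySQL here, which is an assumption of this sketch; because SQLite divides integers as integers, the multiplication by 100.0 is done before the division. The `ELSE 0` branch makes a book with no successful payments come out as 0 rather than NULL.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblPayments (
        trans_id INTEGER PRIMARY KEY,
        book_id INTEGER,
        payment_status TEXT,
        purchase_date TEXT
    );
    INSERT INTO tblPayments VALUES
        (1, 233, 'SUCCESS', '2017-04-05'),
        (2, 145, 'FAILED',  '2017-04-10'),
        (3, 233, 'FAILED',  '2017-04-05'),
        (4, 233, 'SUCCESS', '2017-04-05');
""")

# One pass over tblPayments: count successes and all rows per book.
query = """
    SELECT book_id,
           SUM(CASE WHEN payment_status = 'SUCCESS' THEN 1 ELSE 0 END)
               * 100.0 / COUNT(*) AS percentage
    FROM tblPayments
    WHERE purchase_date BETWEEN ? AND ?
    GROUP BY book_id
    ORDER BY percentage DESC
    LIMIT 5
"""
rows = conn.execute(query, ("2017-04-01", "2017-04-25")).fetchall()
for book_id, percentage in rows:
    print(book_id, round(percentage))
```

Passing the dates as bound parameters instead of literals is a design choice of the sketch, not something the original query requires.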
Edit
To address your new question: there is no need for a second query issued from PHP; you can do everything in a single query:
SELECT t1.book_id, t2.book_name, t1.percentage
FROM (
    SELECT b.`book_id`,
           SUM(CASE WHEN b.`payment_status` = 'SUCCESS' THEN 1 ELSE 0 END) /
           COUNT(*) * 100.0 AS `percentage`
    FROM `tblPayments` AS b
    WHERE b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25'
    GROUP BY b.`book_id`
    ORDER BY `percentage` DESC
    LIMIT 5
) t1
JOIN tblBooks t2
  ON t1.book_id = t2.book_id
-- the ordering of the derived table is not guaranteed to survive the join
ORDER BY t1.percentage DESC
This will probably be faster than joining tblBooks in the first query:
SELECT b.`book_id`,
       c.`book_name`,
       -- columns are qualified because book_id exists in both tables
       SUM(CASE WHEN b.`payment_status` = 'SUCCESS' THEN 1 ELSE 0 END) /
       COUNT(*) * 100.0 AS `percentage`
FROM `tblPayments` AS b
JOIN `tblBooks` AS c
  ON b.`book_id` = c.`book_id`
WHERE b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25'
GROUP BY b.`book_id`, c.`book_name`
ORDER BY `percentage` DESC
LIMIT 5
But if I were you, I would run some tests myself to see whether performance is really a problem and, if it is, which query is faster.
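One way to set up such a test is sketched below with Python's built-in `sqlite3` module: it runs both variants on the sample data, checks that they return the same rows, and times them. SQLite as a stand-in for MySQL, the helper name `timed`, and the repeat count are all assumptions of this sketch.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblBooks (book_id INTEGER PRIMARY KEY, book_name TEXT);
    CREATE TABLE tblPayments (
        trans_id INTEGER PRIMARY KEY,
        book_id INTEGER,
        payment_status TEXT,
        purchase_date TEXT
    );
    INSERT INTO tblBooks VALUES
        (233, 'My Autobiography'),
        (145, 'How to learn English'),
        (201, 'Finding Nemo');
    INSERT INTO tblPayments VALUES
        (1, 233, 'SUCCESS', '2017-04-05'),
        (2, 145, 'FAILED',  '2017-04-10'),
        (3, 233, 'FAILED',  '2017-04-05'),
        (4, 233, 'SUCCESS', '2017-04-05');
""")

# Shared percentage expression; multiply first because SQLite divides integers as integers.
PCT = "SUM(CASE WHEN p.payment_status = 'SUCCESS' THEN 1 ELSE 0 END) * 100.0 / COUNT(*)"

# Variant A: aggregate and limit first, then join the (at most 5) rows to tblBooks.
limit_then_join = f"""
    SELECT t1.book_id, t2.book_name, t1.percentage
    FROM (
        SELECT p.book_id AS book_id, {PCT} AS percentage
        FROM tblPayments AS p
        WHERE p.purchase_date BETWEEN ? AND ?
        GROUP BY p.book_id
        ORDER BY percentage DESC
        LIMIT 5
    ) AS t1
    JOIN tblBooks AS t2 ON t1.book_id = t2.book_id
    ORDER BY t1.percentage DESC
"""

# Variant B: join every payment row to tblBooks, then aggregate.
join_then_aggregate = f"""
    SELECT p.book_id, b.book_name, {PCT} AS percentage
    FROM tblPayments AS p
    JOIN tblBooks AS b ON p.book_id = b.book_id
    WHERE p.purchase_date BETWEEN ? AND ?
    GROUP BY p.book_id, b.book_name
    ORDER BY percentage DESC
    LIMIT 5
"""


def timed(sql, params, repeats=1000):
    """Run the query `repeats` times; return the last result and the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


params = ("2017-04-01", "2017-04-25")
rows_a, secs_a = timed(limit_then_join, params)
rows_b, secs_b = timed(join_then_aggregate, params)
print(f"limit-then-join: {secs_a:.4f}s, join-then-aggregate: {secs_b:.4f}s")
```

Timings on four rows in an in-memory SQLite database say nothing about MySQL with a large tblPayments table. The useful part is the shape of the test: run both queries against realistic data on the real server, confirm they agree, and look at MySQL's `EXPLAIN` output for each.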