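Calculating a percentage from a MySQL query

Time: 2017-04-27 13:11:23

Tags: mysql

I am trying to find the top 5 best-selling books.

My idea for calculating the top 5 best-selling books is this: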

percentage = number_of_SUCCESS_transactions_each_book / total_number_transactions_each_book

Fetch the result (book_id, percentage) sorted in DESC order, with a LIMIT of 5

Here is a simple representation of the tables, with sample data included for clarity:

tblPayments
-----------
trans_id | book_id | payment_status | purchase_date
---------------------------------------------------
1        | 233     | SUCCESS        | 2017-04-05
2        | 145     | FAILED         | 2017-04-10
3        | 233     | FAILED         | 2017-04-05
4        | 233     | SUCCESS        | 2017-04-05


tblBooks
--------
book_id | book_name
------------------------------
233     | My Autobiography
145     | How to learn English
201     | Finding Nemo

I will query the top 5 best-selling books between specific dates, for example between 2017-04-01 and 2017-04-25.

The output I expect looks like this:

book_id | book_name             | percentage
--------------------------------------------
233     | My Autobiography      | 67
145     | How to learn English  | 0
201     | Finding Nemo          | 0

After a few hours of brainstorming, this is the query I am considering:

SELECT b.`book_id`, ( 
    (   
        ( SELECT COUNT(*) FROM `tblPayments` WHERE `book_id` = b.`book_id` AND `payment_status` = 'SUCCESS' ) / 
        ( SELECT COUNT(*) FROM `tblPayments` WHERE `book_id` = b.`book_id` ) 
    ) * 100.0 ) AS `percentage` 
FROM `tblPayments` AS b 
WHERE b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25' 
GROUP BY b.`book_id` 
ORDER BY `percentage` DESC LIMIT 5

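Can this be improved further? Will it cause any performance problems in the database?

I am on a train heading home right now, so I am writing this from a tablet. I can test it when I get home in about 6 hours, so I thought I would ask here in the meantime.

Or do you have a suggestion for a better approach?

Thanks

Edit

Thanks to @Strawberry and @Stefano Zanini for the answers.

One more question. Is it OK to JOIN tblBooks just to get the book_name field into the result set?

I mean, the tblPayments table is going to have a lot of rows. So would the JOIN be OK? Or should I fetch these 5 rows in PHP and run another query to get the book_name for each of the 5 books? Which approach is more efficient?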

2 answers:

Answer 0 (score: 0)

atexit()

I have left out the trivial bits.

Answer 1 (score: 0)

You can improve that query by replacing the inner queries used for the percentage with a conditional sum:

SELECT b.`book_id`,
       -- ELSE 0 keeps the sum at 0 (not NULL) when a book has no successful payments
       SUM(CASE WHEN b.`payment_status` = 'SUCCESS' THEN 1 ELSE 0 END) /
       COUNT(*) * 100.0 AS `percentage`
FROM   `tblPayments` AS b
WHERE  b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25'
GROUP BY b.`book_id`
ORDER BY `percentage` DESC
LIMIT 5
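As a quick check of the conditional-sum idea, here is a minimal sketch that runs it against the four sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only to keep the example self-contained; the `top_books` helper is a hypothetical name. Because SQLite divides integers as integers, the expression multiplies by 100.0 first, something MySQL's `/` operator would not need.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for a self-contained demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblPayments ("
    "trans_id INTEGER PRIMARY KEY, book_id INTEGER, "
    "payment_status TEXT, purchase_date TEXT)"
)
# The sample rows from the question.
conn.executemany(
    "INSERT INTO tblPayments VALUES (?, ?, ?, ?)",
    [
        (1, 233, "SUCCESS", "2017-04-05"),
        (2, 145, "FAILED", "2017-04-10"),
        (3, 233, "FAILED", "2017-04-05"),
        (4, 233, "SUCCESS", "2017-04-05"),
    ],
)

# Conditional sum: count SUCCESS rows and all rows in a single pass per book.
QUERY = """
SELECT book_id,
       100.0 * SUM(CASE WHEN payment_status = 'SUCCESS' THEN 1 ELSE 0 END)
             / COUNT(*) AS percentage
FROM   tblPayments
WHERE  purchase_date BETWEEN ? AND ?
GROUP  BY book_id
ORDER  BY percentage DESC
LIMIT  5
"""


def top_books(start, end):
    """Return (book_id, percentage rounded to an integer) for the date range."""
    rows = conn.execute(QUERY, (start, end)).fetchall()
    return [(book_id, round(pct)) for book_id, pct in rows]


print(top_books("2017-04-01", "2017-04-25"))  # [(233, 67), (145, 0)]
```

Book 201 is absent from the result because the query reads only tblPayments, so a book with no transactions in the range forms no group.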

Edit

To address your new question: there is no need for a second query issued from PHP; you can do everything in a single query:

select  t1.book_id, t2.book_name, t1.percentage
from    (
            SELECT b.`book_id`,
                   SUM(CASE WHEN b.`payment_status` = 'SUCCESS' THEN 1 ELSE 0 END) /
                   COUNT(*) * 100.0 AS `percentage`
            FROM   `tblPayments` AS b
            WHERE  b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25'
            GROUP BY b.`book_id`
            ORDER BY `percentage` DESC
            LIMIT 5
        ) t1
join    tblBooks t2
on      t1.book_id = t2.book_id
-- the derived table's ordering is not guaranteed to survive the join
order by t1.percentage desc

This will probably be faster than joining tblBooks directly in the first query:

SELECT b.`book_id`,
       c.`book_name`,
       SUM(CASE WHEN b.`payment_status` = 'SUCCESS' THEN 1 ELSE 0 END) /
       COUNT(*) * 100.0 AS `percentage`
FROM   `tblPayments` AS b
JOIN   `tblBooks` AS c
ON     b.`book_id` = c.`book_id`
WHERE  b.`purchase_date` BETWEEN '2017-04-01' AND '2017-04-25'
GROUP BY b.`book_id`, c.`book_name`
ORDER BY `percentage` DESC
LIMIT 5

But if I were you, I would run some tests myself to see whether performance is really a problem and, if it is, which query is faster.
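One way to set up such a test is sketched below. It fills two tables with synthetic rows, runs both variants, and checks that they return the same rows before comparing timings. Everything here is an assumption for illustration: SQLite (via Python's sqlite3) stands in for MySQL, the data volumes and book names are made up, and `time_query` is a hypothetical helper. Timings measured on SQLite say nothing reliable about MySQL, so the same two statements would need to be run against the real server, ideally alongside `EXPLAIN`.

```python
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblBooks (book_id INTEGER PRIMARY KEY, book_name TEXT);
CREATE TABLE tblPayments (
    trans_id INTEGER PRIMARY KEY, book_id INTEGER,
    payment_status TEXT, purchase_date TEXT);
""")

# Synthetic data (made up): 200 books and 20,000 payments dated in April 2017.
rng = random.Random(42)
conn.executemany("INSERT INTO tblBooks VALUES (?, ?)",
                 [(i, f"Book {i}") for i in range(1, 201)])
conn.executemany(
    "INSERT INTO tblPayments (book_id, payment_status, purchase_date) "
    "VALUES (?, ?, ?)",
    [(rng.randint(1, 200), rng.choice(["SUCCESS", "FAILED"]),
      f"2017-04-{rng.randint(1, 30):02d}") for _ in range(20000)])

PCT = ("100.0 * SUM(CASE WHEN p.payment_status = 'SUCCESS' "
       "THEN 1 ELSE 0 END) / COUNT(*)")
DATES = "p.purchase_date BETWEEN '2017-04-01' AND '2017-04-25'"

# Variant 1: aggregate and LIMIT first, then join only the 5 surviving rows.
# book_id is added as a tie-breaker so both variants order ties identically.
AGGREGATE_THEN_JOIN = f"""
SELECT t.book_id, b.book_name, t.percentage
FROM (SELECT p.book_id, {PCT} AS percentage
      FROM tblPayments AS p WHERE {DATES}
      GROUP BY p.book_id
      ORDER BY percentage DESC, p.book_id LIMIT 5) AS t
JOIN tblBooks AS b ON b.book_id = t.book_id
ORDER BY t.percentage DESC, t.book_id
"""

# Variant 2: join every payment row to tblBooks, then aggregate.
JOIN_THEN_AGGREGATE = f"""
SELECT p.book_id, b.book_name, {PCT} AS percentage
FROM tblPayments AS p
JOIN tblBooks AS b ON b.book_id = p.book_id
WHERE {DATES}
GROUP BY p.book_id, b.book_name
ORDER BY percentage DESC, p.book_id LIMIT 5
"""


def time_query(sql, repeats=5):
    """Return (rows, best wall-clock seconds over `repeats` runs)."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return rows, best


rows1, secs1 = time_query(AGGREGATE_THEN_JOIN)
rows2, secs2 = time_query(JOIN_THEN_AGGREGATE)
print(f"aggregate-then-join: {secs1:.4f}s, join-then-aggregate: {secs2:.4f}s")
print("same result:", rows1 == rows2)
```

Checking that both variants agree before timing them matters: a faster query that returns different rows is not an improvement.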