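I want to find the average number of days between orders in my database, grouped by account ID.
Suppose I have the following table named 'orders' with this data: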
id | account_id | account_name   | order_date
----------------------------------------------
1  | 555        | Acme Fireworks | 2015-06-15
2  | 342        | Kent Brewery   | 2015-09-12
3  | 555        | Acme Fireworks | 2015-09-15
4  | 342        | Kent Brewery   | 2015-10-12
5  | 342        | Kent Brewery   | 2015-11-12
6  | 342        | Kent Brewery   | 2015-12-12
7  | 555        | Acme Fireworks | 2015-12-15
8  | 900        | Plastic Inc.   | 2015-12-20
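For anyone who wants to reproduce this, the sample data could be loaded with something like the script below. The column types are assumptions, since only the data itself is shown above.

```sql
-- Assumed schema: the column types are a guess based on the sample data.
CREATE TABLE orders (
    id           INT PRIMARY KEY,
    account_id   INT NOT NULL,
    account_name VARCHAR(100) NOT NULL,
    order_date   DATE NOT NULL
);

-- Sample rows copied from the table above.
INSERT INTO orders (id, account_id, account_name, order_date) VALUES
    (1, 555, 'Acme Fireworks', '2015-06-15'),
    (2, 342, 'Kent Brewery',   '2015-09-12'),
    (3, 555, 'Acme Fireworks', '2015-09-15'),
    (4, 342, 'Kent Brewery',   '2015-10-12'),
    (5, 342, 'Kent Brewery',   '2015-11-12'),
    (6, 342, 'Kent Brewery',   '2015-12-12'),
    (7, 555, 'Acme Fireworks', '2015-12-15'),
    (8, 900, 'Plastic Inc.',   '2015-12-20');
```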
I would like a query that produces the following result:
account_id | account_name   | average_days_between_orders
---------------------------------------------------------
342        | Kent Brewery   | 30.333
555        | Acme Fireworks | 91.5
900        | Plastic Inc.   | (unsure of what value would go here since there's 1 order only)
I looked at the following questions for ideas, but I still could not work it out:
Thanks!
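Answer 0 (score: 0)
You need a query that computes, for each order, the number of days since the previous purchase on the same account (null if there is no previous purchase), and then averages those values.
I would self-join the table so that the subquery finds, for each order, the maximum order date among that account's earlier orders. The subquery computes the difference between that date and the current order date, and the outer query takes the avg() of the differences. Note that the join condition o1.id > o2.id assumes ids increase with the order date, as they do in the sample data: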
SELECT o3.account_id, o3.account_name, avg(diff) as average_days_between_orders
FROM
(select o1.id,
o1.account_id,
o1.account_name,
datediff(o1.order_date, max(o2.order_date)) as diff
from orders o1
left join orders o2 on o1.account_id=o2.account_id and o1.id>o2.id
group by o1.id, o1.account_id, o1.account_name, o1.order_date) o3
GROUP BY o3.account_id, o3.account_name
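As an alternative to the join, you could compute the difference using user-defined variables in the subquery, or a correlated subquery in the select list. Have a look at MySQL running-total solutions to get an idea of how this works, for example this SO topic. In particular, see the solution provided by Andomar.
If your orders table is large, the alternatives described in that topic may perform better.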
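As a rough sketch of the correlated-subquery alternative mentioned above (not the code from the linked topic), the previous order date can be looked up in the select list instead of through a join. Like the join version, it assumes that ids increase with the order date; the alias d is just an illustrative name.

```sql
SELECT d.account_id,
       d.account_name,
       AVG(d.diff) AS average_days_between_orders
FROM (
    SELECT o1.account_id,
           o1.account_name,
           -- Days since the latest earlier order of the same account;
           -- NULL for an account's first order, which AVG() ignores.
           DATEDIFF(o1.order_date,
                    (SELECT MAX(o2.order_date)
                     FROM orders o2
                     WHERE o2.account_id = o1.account_id
                       AND o2.id < o1.id)) AS diff
    FROM orders o1
) d
GROUP BY d.account_id, d.account_name;
```

With the sample data this should give 30.3333 for Kent Brewery, 91.5 for Acme Fireworks, and NULL for Plastic Inc., because AVG() over only NULL values returns NULL.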
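Answer 1 (score: 0)
Note: please test carefully and use at your own risk. I could not find a simple query, and I do not guarantee this works in every case :) If you only want the answer, the complete query is shown at the end.
The goal is to build a table that has a start date and an end date on each row, and then simply compute the average difference between the two dates. Something like this: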
id | account_id | account_name | start_date | end_date
------------------------------------------------------------
1 | 342 | Kent Brewery | 2015-09-12 | 2015-10-12
2 | 342 | Kent Brewery | 2015-10-12 | 2015-11-12
3 | 342 | Kent Brewery | 2015-11-12 | 2015-12-12
4 | 555 | Acme Fireworks | 2015-06-15 | 2015-09-15
5 | 555 | Acme Fireworks | 2015-09-15 | 2015-12-15
I will create some temporary tables to make it a bit clearer. First, the query for start_date:
QUERY:
create temporary table uniq_start_dates
select (@sid := @sid + 1) id, tmp_uniq_start_dates.*
from
(select distinct o1.account_id, o1.account_name, o1.order_date start_date
from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date
order by o1.account_id, o1.order_date) tmp_uniq_start_dates
join (select @sid := 0) AS sid_generator
OUTPUT: temporary table - uniq_start_dates
id | account_id | account_name | start_date
-----------------------------------------------
1 | 342 | Kent Brewery | 2015-09-12
2 | 342 | Kent Brewery | 2015-10-12
3 | 342 | Kent Brewery | 2015-11-12
4 | 555 | Acme Fireworks | 2015-06-15
5 | 555 | Acme Fireworks | 2015-09-15
Do the same for end_date:
QUERY:
create temporary table uniq_end_dates
select (@eid := @eid + 1) id, tmp_uniq_end_dates.*
from
(select distinct o2.account_id, o2.account_name, o2.order_date end_date
from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date
order by o2.account_id, o2.order_date) tmp_uniq_end_dates
join (select @eid := 0) AS eid_generator
OUTPUT: temporary table - uniq_end_dates
id | account_id | account_name | end_date
-----------------------------------------------
1 | 342 | Kent Brewery | 2015-10-12
2 | 342 | Kent Brewery | 2015-11-12
3 | 342 | Kent Brewery | 2015-12-12
4 | 555 | Acme Fireworks | 2015-09-15
5 | 555 | Acme Fireworks | 2015-12-15
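Notice that I generated a new sequential id for each temporary table so that they can be joined back into a single table (like the first table above). Now join uniq_start_dates and uniq_end_dates: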
QUERY:
create temporary table uniq_start_end_dates
select uniq_start_dates.*, uniq_end_dates.end_date
from uniq_start_dates
join uniq_end_dates using (id)
OUTPUT: temporary table - uniq_start_end_dates
(the same one as the first table)
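If you are on MySQL 8.0 or later, where window functions are available, the same start/end pairs could probably be produced without user variables or temporary tables. This is only a sketch of that idea, with pairs as an illustrative alias. Unlike the DISTINCT-based queries above, it does not collapse two orders placed by the same account on the same day.

```sql
SELECT account_id, account_name, start_date, end_date
FROM (
    SELECT account_id,
           account_name,
           order_date AS start_date,
           -- Next order date of the same account; NULL for the last order.
           LEAD(order_date) OVER (PARTITION BY account_id
                                  ORDER BY order_date) AS end_date
    FROM orders
) pairs
WHERE end_date IS NOT NULL;  -- drop each account's last order, which has no successor
```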
Now comes the easy part. Just aggregate and take the average difference in days:
QUERY:
select account_id, account_name, avg(timestampdiff(day, start_date, end_date)) average_days
from uniq_start_end_dates
group by account_id, account_name
OUTPUT:
account_id | account_name | average_days
--------------------------------------------
342 | Kent Brewery | 30.3333
555 | Acme Fireworks | 91.5000
Notice that Plastic Inc. is not in the result. If you want it listed with a null average_days, use this:
QUERY:
select all_accounts.account_id, all_accounts.account_name, accounts_with_average_days.average_days
from
(select distinct account_id, account_name from orders) all_accounts
left join
(select account_id, account_name, avg(timestampdiff(day, start_date, end_date)) average_days
from uniq_start_end_dates
group by account_id, account_name) accounts_with_average_days
using (account_id, account_name)
OUTPUT:
account_id | account_name | average_days
--------------------------------------------
342 | Kent Brewery | 30.3333
555 | Acme Fireworks | 91.5000
900 | Plastic Inc. | null
Here is the complete, messy query:
select all_accounts.account_id, all_accounts.account_name, accounts_with_average_days.average_days
from
(select distinct account_id, account_name from orders) all_accounts
left join
(select uniq_start_dates.account_id, uniq_start_dates.account_name, avg(timestampdiff(day, start_date, end_date)) average_days
from
(select (@sid := @sid + 1) id, tmp_uniq_start_dates.*
from
(select distinct o1.account_id, o1.account_name, o1.order_date start_date from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date order by o1.account_id, o1.order_date) tmp_uniq_start_dates join (select @sid := 0) AS sid_generator
) uniq_start_dates
join
(select (@eid := @eid + 1) id, tmp_uniq_end_dates.*
from
(select distinct o2.account_id, o2.account_name, o2.order_date end_date from orders o1
join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date order by o2.account_id, o2.order_date) tmp_uniq_end_dates join (select @eid := 0) AS eid_generator
) uniq_end_dates
using (id)
group by uniq_start_dates.account_id, uniq_start_dates.account_name) accounts_with_average_days
using (account_id, account_name)
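A possibly simpler route, offered as a sketch rather than a tested replacement: the gaps between consecutive orders of one account add up to the span between its first and last order, so the average gap is that span divided by the number of gaps. Using COUNT(DISTINCT order_date) mirrors the DISTINCT in the queries above.

```sql
SELECT account_id,
       account_name,
       -- (last order - first order) / (number of gaps between distinct dates).
       -- NULLIF turns a zero divisor into NULL for accounts with a single order date.
       DATEDIFF(MAX(order_date), MIN(order_date))
           / NULLIF(COUNT(DISTINCT order_date) - 1, 0) AS average_days
FROM orders
GROUP BY account_id, account_name;
```

For the sample data this works out to 91 / 3 for Kent Brewery and 183 / 2 for Acme Fireworks, which matches the 30.3333 and 91.5000 above, and it yields NULL for Plastic Inc.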