I have the following query, which retrieves the number of sales and the average sale price per day for a specific item.
SELECT COUNT(1) AS num_sales, DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date, AVG(prices.price) AS avg_price
FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503 AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC
I also have a for-loop that runs a separate query for each day to get the median price (let's assume the number of results is even):
SELECT prices.price FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503
AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
AND DATE(sales.created_at) = "<THE DATE OF THE CURRENT FOR-LOOP OBJECT>"
ORDER BY prices.price ASC
LIMIT 1 OFFSET <NUMBER OF THE MIDDLE ROW>
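As you can imagine, this is very slow, because in some cases it has to run hundreds of queries against a large table (the sales table has a few hundred million rows).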
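How can I rewrite the first SQL query so that it also computes the median of prices.price, similar to AVG(prices.price)? I have looked at answers such as this one, but could not work out how to adapt them to my specific case.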
I have spent hours trying to get this working, but my SQL knowledge just isn't good enough. Any help would be greatly appreciated!
root@ns525077:~# mysql -V
mysql Ver 14.14 Distrib 5.7.13, for Linux (x86_64) using EditLine wrapper
Table schemas:
CREATE TABLE `prices` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`item_id` int(11) unsigned NOT NULL,
`price` decimal(8,2) NOT NULL,
`net_price` decimal(8,2) NOT NULL,
`source` tinyint(4) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
KEY `prices_ibfk_1` (`item_id`),
CONSTRAINT `prices_ibfk_1` FOREIGN KEY (`item_id`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=4861375 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `sales` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`price_id` int(11) unsigned DEFAULT NULL,
`item_key` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
UNIQUE KEY `item_key` (`item_key`),
KEY `price_id` (`price_id`),
KEY `created_at` (`created_at`),
KEY `price_id__created_at__IX` (`price_id`,`created_at`),
CONSTRAINT `sales_ibfk_1` FOREIGN KEY (`price_id`) REFERENCES `prices` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=386156944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Sample output of my first query:
Answer 0 (score: 0)
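After extensive searching, I found the answer to my problem here. Perhaps I did not phrase my question well at first.
I adapted that solution to my own case; here is the working query: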
SELECT COUNT(1) AS num_sales,
       DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date,
       AVG(prices.price) AS avg_price,
       CASE (COUNT(1) % 2)
           -- Odd number of rows: take the single middle value
           WHEN 1 THEN SUBSTRING_INDEX(
                           SUBSTRING_INDEX(
                               GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                               ',', (COUNT(*) + 1) / 2),
                           ',', -1)
           -- Even number of rows: average the two middle values
           ELSE (SUBSTRING_INDEX(
                     SUBSTRING_INDEX(
                         GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                         ',', COUNT(*) / 2),
                     ',', -1)
                 + SUBSTRING_INDEX(
                     SUBSTRING_INDEX(
                         GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                         ',', (COUNT(*) + 1) / 2),
                     ',', -1)) / 2
       END median_price
FROM sales
INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7381
  AND (`prices`.`source` = 0
       OR (`prices`.`price` >= 400
           AND `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC;
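To make it easier to see what the nested SUBSTRING_INDEX calls are doing, here is a rough Python sketch of the same idea: join the sorted prices into one comma-separated string, keep the first n items, then take the last of those. The helper names (substring_index, nth_value, median_from_group_concat) are made up for illustration and are not part of MySQL or any library. The sketch uses explicit integer positions (n // 2 and n // 2 + 1) for the even case, whereas the query above appears to rely on MySQL rounding (COUNT(*) + 1) / 2 up to the next integer.

```python
from decimal import Decimal


def substring_index(s, delim, count):
    """Rough stand-in for MySQL's SUBSTRING_INDEX (illustrative only)."""
    parts = s.split(delim)
    if count > 0:
        # Everything before the count-th delimiter, counted from the left
        return delim.join(parts[:count])
    if count < 0:
        # Everything after the count-th delimiter, counted from the right
        return delim.join(parts[count:])
    return ""


def nth_value(joined, n):
    """Return the n-th (1-based) item of a comma-separated string."""
    first_n = substring_index(joined, ",", n)
    return Decimal(substring_index(first_n, ",", -1))


def median_from_group_concat(prices):
    """Median computed the same way as the CASE expression in the query."""
    ordered = sorted(prices)
    # Plays the role of GROUP_CONCAT(price ORDER BY price SEPARATOR ',')
    joined = ",".join(str(p) for p in ordered)
    n = len(ordered)
    if n % 2 == 1:
        # Odd count: the single middle item
        return nth_value(joined, (n + 1) // 2)
    # Even count: average of the two middle items
    return (nth_value(joined, n // 2) + nth_value(joined, n // 2 + 1)) / 2


if __name__ == "__main__":
    odd_day = [Decimal("500.00"), Decimal("400.00"), Decimal("450.00")]
    even_day = odd_day + [Decimal("600.00")]
    print(median_from_group_concat(odd_day))
    print(median_from_group_concat(even_day))
```

One caveat that applies to the SQL version: MySQL truncates GROUP_CONCAT results to group_concat_max_len (1024 bytes by default), so for days with many sales the session value probably needs to be raised, otherwise the middle items may be cut off.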