I am badly stuck on a SQL query. I have tried my best but cannot find a solution. I have 4 tables named User, Item, Buys, and Rates.
CREATE TABLE User (
id integer,
name varchar(30),
Primary Key (id)
);
INSERT INTO User(id, name)
VALUES
('1', 'Lorren'),
('2', 'Smith'),
('3', 'Stephen'),
('4', 'David'),
('5', 'Sophie'),
('6', 'Alex'),
('7', 'Henry'),
('8', 'Jasmine'),
('9', 'Anderson'),
('10', 'Bilal');
CREATE TABLE Item (
id integer,
description varchar(50),
category varchar(30),
price integer,
Primary Key (id)
);
INSERT INTO Item(id, description, category, price)
VALUES
('50', 'Princess Diary', 'Book', '8'),
('51', 'Frozen', 'Book', '4'),
('52', 'Tangled', 'Book', '3'),
('53', 'Oak Table', 'Furniture', '370'),
('54', 'Doble Bed', 'Furniture', '450'),
('55', 'Metal Cupboard', 'Furniture', '700'),
('56', 'Levi 501', 'Clothes', '90'),
('57', 'Corduroy Coat', 'Clothes', '230'),
('58', 'Straight Trousers', 'Clothes', '45'),
('59', 'Black Sequin Top', 'Clothes', '85');
CREATE TABLE Buys (
user integer,
item integer,
price integer,
Primary Key (user, item),
Foreign key (user) REFERENCES User(id),
Foreign Key (item) REFERENCES Item(id)
);
INSERT INTO Buys
VALUES ('1', '52', '3'),
('1', '56', '90'),
('2','56','100'),
('2', '54', '450'),
('5', '53', '400'),
('5', '55', '700'),
('5', '59', '90'),
('6', '57', '230'),
('10', '58', '50'),
('8', '50', '8');
CREATE TABLE Rates (
user integer,
item integer,
rating integer CHECK (rating BETWEEN 0 AND 5),
Primary Key (user, item),
Foreign key (user) REFERENCES User(id),
Foreign Key (item) REFERENCES Item(id)
);
INSERT INTO Rates
VALUES
('1', '52', '5'),
('1', '56', '3'),
('2', '54', '5'),
('2', '55', '4'),
('2', '56', '2'),
('5', '53', '5'),
('5', '55', '5'),
('8', '50', '1'),
('8', '55', '3'),
('9', '55', '4');
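For each user, I have to find all the items he has not bought, but display only the item or items among them that have the highest average rating. So the result should show only the items that the user has not bought and that have the highest average rating. Ratings run from 1 to 5, and each item may have several different ratings, so I can compute the average rating of each item, but I cannot work out how to find, for each user, the unbought items with the highest average rating. I am working in MySQL and have been stuck on this for 6 days; even my friends have tried and nobody could solve it. Can anyone help?
Given the current tables, the expected output should look like this: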
User Items With Highest Average
Lorren 53
Lorren 54
Smith 52
Smith 53
Stephen 52
Stephen 53
Stephen 54
David 52
David 53
David 54
Sophie 52
Sophie 54
Alex 52
Alex 53
Alex 54
Henry 52
Henry 53
Henry 54
Jasmine 52
Jasmine 53
Jasmine 54
Anderson 52
Anderson 53
Anderson 54
Bilal 52
Bilal 53
Bilal 54
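Answer 0: (score: 1)
Well, definitely not my prettiest work, especially since I don't usually work in MySQL (Edit: SQLFiddle is back up. Fixed an inner grouping, and this now works):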
SELECT topItemsAllUsers.* FROM
(SELECT
u.id AS userId,
u.name,
topItems.itemId
FROM
(SELECT
iwa.id AS itemId
FROM
(SELECT
MAX(AverageRating) AS MaxRating
FROM
(SELECT
i.id,
AVG(COALESCE(r.rating, 0)) AS AverageRating
FROM Item i
LEFT JOIN Rates r ON r.item = i.id
GROUP BY i.id
) AS averages
) AS MaxOuterRating
INNER JOIN
(SELECT
i.id,
AVG(COALESCE(r.rating, 0)) AS AverageRating
FROM Item i
LEFT JOIN Rates r ON r.item = i.id
GROUP BY i.id
) as iwa ON iwa.AverageRating = MaxOuterRating.MaxRating
) as topItems
CROSS JOIN
User u
) as topItemsAllUsers
LEFT JOIN Buys b ON topItemsAllUsers.userId = b.user AND topItemsAllUsers.itemId = b.item
WHERE b.user IS NULL
In T-SQL, I would at least use a CTE for the average-ratings table. This is a lot harder than it first looks!
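The CTE remark above can be sketched as follows. MySQL 8.0 and later also accept WITH, so the same statement should carry over, but treat this as a sketch rather than the query this answer actually uses. It runs against an in-memory SQLite database through Python's sqlite3 module purely to be self-contained, and the tables are trimmed to the columns the query touches; both choices are assumptions of this example, not part of the original setup.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption, for a self-contained demo).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE User  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Item  (id INTEGER PRIMARY KEY);
CREATE TABLE Buys  (user INTEGER, item INTEGER, PRIMARY KEY (user, item));
CREATE TABLE Rates (user INTEGER, item INTEGER, rating INTEGER, PRIMARY KEY (user, item));
""")
names = ["Lorren", "Smith", "Stephen", "David", "Sophie",
         "Alex", "Henry", "Jasmine", "Anderson", "Bilal"]
con.executemany("INSERT INTO User VALUES (?, ?)", list(enumerate(names, start=1)))
con.executemany("INSERT INTO Item VALUES (?)", [(i,) for i in range(50, 60)])
con.executemany("INSERT INTO Buys VALUES (?, ?)",
                [(1, 52), (1, 56), (2, 56), (2, 54), (5, 53),
                 (5, 55), (5, 59), (6, 57), (10, 58), (8, 50)])
con.executemany("INSERT INTO Rates VALUES (?, ?, ?)",
                [(1, 52, 5), (1, 56, 3), (2, 54, 5), (2, 55, 4), (2, 56, 2),
                 (5, 53, 5), (5, 55, 5), (8, 50, 1), (8, 55, 3), (9, 55, 4)])

QUERY = """
WITH averages AS (              -- average rating per item, 0 for unrated items
    SELECT i.id AS itemId, AVG(COALESCE(r.rating, 0)) AS AverageRating
    FROM Item i
    LEFT JOIN Rates r ON r.item = i.id
    GROUP BY i.id
),
topItems AS (                   -- items whose average equals the overall maximum
    SELECT itemId
    FROM averages
    WHERE AverageRating = (SELECT MAX(AverageRating) FROM averages)
)
SELECT u.name, t.itemId
FROM User u
CROSS JOIN topItems t           -- every top item paired with every user
LEFT JOIN Buys b ON b.user = u.id AND b.item = t.itemId
WHERE b.user IS NULL            -- exclusion join: keep pairs with no purchase
ORDER BY u.id, t.itemId
"""

rows = con.execute(QUERY).fetchall()
for name, item_id in rows:
    print(name, item_id)
```

Naming the averages once in a CTE removes the duplicated subquery, which is the main thing the nested version has to work around.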
Edit: some explanation follows. The first thing to get is the average rating of each item, using 0 for items with no ratings (hence the COALESCE() call):
(SELECT
i.id,
AVG(COALESCE(r.rating, 0)) AS AverageRating
FROM Item i
LEFT JOIN Rates r ON r.item = i.id
GROUP BY i.id)
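This lists each item id once, together with its average rating. I named this averages, and I actually use that query twice (the second time under the name iwa; I no longer remember what "iwa" was supposed to stand for...), once to get the actual highest rating: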
SELECT
MAX(AverageRating) AS MaxRating
FROM averages
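and name that MaxOuterRating, then INNER JOIN that result back to iwa on AverageRating = MaxRating to get only the items with the highest rating:
SELECT
iwa.id AS itemId
FROM
MaxOuterRating
INNER JOIN iwa ON iwa.AverageRating = MaxOuterRating.MaxRating
This result is wrapped in the topItems alias.
Now that we have only the top-rated items, CROSS JOIN them with User to get a table containing every top item for every user:
SELECT
...
FROM
topItems
CROSS JOIN
User u
This result is aliased as topItemsAllUsers.
Finally, LEFT JOIN to Buys on both the user id and the item id, then restrict the result to only those rows that have no associated Buys record (commonly called an exclusion join):
SELECT
topItemsAllUsers.*
FROM
topItemsAllUsers
LEFT JOIN Buys b ON topItemsAllUsers.userId = b.user AND topItemsAllUsers.itemId = b.item
WHERE b.user IS NULL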
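Et voilà. None of these operations is particularly hard, but they are nested so deeply that it is hard to see how to attack the problem. I don't doubt this can be improved a great deal, but it does return the expected result.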
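To make the stages described above easier to check, here is a small sketch that materialises each named step and inspects what it holds on the sample data. It uses views in place of the derived-table aliases and an in-memory SQLite database via Python's sqlite3 instead of MySQL; both are assumptions made only to keep the example short and runnable, and the tables carry just the columns the steps need.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption, for a self-contained demo).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE User  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Item  (id INTEGER PRIMARY KEY);
CREATE TABLE Buys  (user INTEGER, item INTEGER, PRIMARY KEY (user, item));
CREATE TABLE Rates (user INTEGER, item INTEGER, rating INTEGER, PRIMARY KEY (user, item));

-- Views play the role of the aliases used in the answer (hypothetical stand-ins).
CREATE VIEW averages AS
    SELECT i.id AS id, AVG(COALESCE(r.rating, 0)) AS AverageRating
    FROM Item i
    LEFT JOIN Rates r ON r.item = i.id
    GROUP BY i.id;

CREATE VIEW topItems AS
    SELECT iwa.id AS itemId
    FROM (SELECT MAX(AverageRating) AS MaxRating FROM averages) AS MaxOuterRating
    INNER JOIN averages AS iwa ON iwa.AverageRating = MaxOuterRating.MaxRating;

CREATE VIEW topItemsAllUsers AS
    SELECT u.id AS userId, u.name AS name, topItems.itemId AS itemId
    FROM topItems
    CROSS JOIN User u;
""")
names = ["Lorren", "Smith", "Stephen", "David", "Sophie",
         "Alex", "Henry", "Jasmine", "Anderson", "Bilal"]
con.executemany("INSERT INTO User VALUES (?, ?)", list(enumerate(names, start=1)))
con.executemany("INSERT INTO Item VALUES (?)", [(i,) for i in range(50, 60)])
con.executemany("INSERT INTO Buys VALUES (?, ?)",
                [(1, 52), (1, 56), (2, 56), (2, 54), (5, 53),
                 (5, 55), (5, 59), (6, 57), (10, 58), (8, 50)])
con.executemany("INSERT INTO Rates VALUES (?, ?, ?)",
                [(1, 52, 5), (1, 56, 3), (2, 54, 5), (2, 55, 4), (2, 56, 2),
                 (5, 53, 5), (5, 55, 5), (8, 50, 1), (8, 55, 3), (9, 55, 4)])

# Step 1: one average per item; unrated items come out as 0.
averages = dict(con.execute("SELECT id, AverageRating FROM averages"))
# Step 2: items whose average equals the maximum average.
top_items = sorted(r[0] for r in con.execute("SELECT itemId FROM topItems"))
# Step 3: every top item paired with every user (3 items x 10 users).
n_pairs = con.execute("SELECT COUNT(*) FROM topItemsAllUsers").fetchone()[0]
# Step 4: exclusion join drops the pairs that appear in Buys.
remaining = con.execute("""
    SELECT COUNT(*)
    FROM topItemsAllUsers t
    LEFT JOIN Buys b ON t.userId = b.user AND t.itemId = b.item
    WHERE b.user IS NULL
""").fetchone()[0]

print(averages)
print(top_items, n_pairs, remaining)
```

On this data, items 52, 53 and 54 tie at an average of 5, so the cross join yields 30 pairs; three of those were purchased (Lorren 52, Smith 54, Sophie 53), which leaves the 27 rows of the expected output.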
Answer 1: (score: 1)
So, for starters, a list of the items that each user has not bought looks like this, right?
SELECT u.*
, i.*
FROM User u
JOIN Item i
LEFT
JOIN Buys b
ON b.user = u.id
AND b.item = i.id
WHERE b.item IS NULL;
...in which case...
SELECT x.* FROM
(
SELECT u.id user_id
, u.name
, i.id item_id
, i.description
, i.category
, i.price
, r.rating
FROM User u
JOIN Item i
LEFT
JOIN Buys b
ON b.user = u.id AND b.item = i.id
JOIN (SELECT item, AVG(rating) rating FROM Rates GROUP BY item) r -- one row per rated item, carrying its average rating
ON r.item = i.id
WHERE b.item IS NULL
) x
JOIN
(
SELECT u.id, r.rating
FROM User u
JOIN Item i
LEFT
JOIN Buys b
ON b.user = u.id AND b.item = i.id
JOIN (SELECT item, AVG(rating) rating FROM Rates GROUP BY item) r -- compare per-item averages, not individual ratings
ON r.item = i.id
JOIN (SELECT AVG(rating) max_avg FROM Rates GROUP BY item ORDER BY AVG(rating) DESC LIMIT 1) n
ON n.max_avg = r.rating
WHERE b.item IS NULL
GROUP
BY u.id, r.rating -- r.rating is listed so the query also passes ONLY_FULL_GROUP_BY
) y
ON y.id = x.user_id
AND y.rating = x.rating
ORDER
BY user_id,item_id;
...which should produce the expected result.
Edited to incorporate Paul Griffin's observations, although in doing so I may have made the query more complicated than it needs to be.
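As a side note on the first query in this answer: the LEFT JOIN ... WHERE b.item IS NULL pattern is an exclusion join, and the same set of unbought (user, item) pairs can also be expressed with NOT EXISTS. The sketch below checks that the two forms agree on the sample data. It uses an in-memory SQLite database through Python's sqlite3 rather than MySQL, with id-only tables; both are assumptions made to keep the example self-contained.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption, for a self-contained demo).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE User (id INTEGER PRIMARY KEY);
CREATE TABLE Item (id INTEGER PRIMARY KEY);
CREATE TABLE Buys (user INTEGER, item INTEGER, PRIMARY KEY (user, item));
""")
con.executemany("INSERT INTO User VALUES (?)", [(u,) for u in range(1, 11)])
con.executemany("INSERT INTO Item VALUES (?)", [(i,) for i in range(50, 60)])
con.executemany("INSERT INTO Buys VALUES (?, ?)",
                [(1, 52), (1, 56), (2, 56), (2, 54), (5, 53),
                 (5, 55), (5, 59), (6, 57), (10, 58), (8, 50)])

# Exclusion join: keep only the pairs for which no Buys row matched.
left_join_sql = """
SELECT u.id, i.id
FROM User u
CROSS JOIN Item i
LEFT JOIN Buys b ON b.user = u.id AND b.item = i.id
WHERE b.item IS NULL
ORDER BY u.id, i.id
"""

# Same condition stated directly: no purchase exists for this pair.
not_exists_sql = """
SELECT u.id, i.id
FROM User u
CROSS JOIN Item i
WHERE NOT EXISTS (SELECT 1 FROM Buys b WHERE b.user = u.id AND b.item = i.id)
ORDER BY u.id, i.id
"""

unbought_join = con.execute(left_join_sql).fetchall()
unbought_exists = con.execute(not_exists_sql).fetchall()
print(len(unbought_join), unbought_join == unbought_exists)
```

With 10 users, 10 items and 10 purchases, both forms return the 90 pairs that have no matching row in Buys.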