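I have some queries that I think could be refactored and simplified in MySQL, but I'm not sure how. I'm currently doing this programmatically, and I'm sure it could be made faster.
Basically, I take an ID from the user, look it up in the db, and collect the IDs of rows whose tags are similar to those of the given ID. The original ID must be excluded, and the result must not contain duplicate IDs.
Is there a way to do this in pure SQL?
Here is my current code: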
    def getRelatedEvents(self, memberId, eventId):
        related_events = []
        # first we get all the tags related to this event
        # (eventId is bound as a parameter instead of being formatted into the SQL)
        for tag in self.db.query("SELECT tagName FROM event_tags WHERE eventId = %s", eventId):
            # we iterate through each tag and find the eventIds for it
            events = self.db.query("SELECT eventId FROM event_tags WHERE tagName LIKE %s AND eventId != %s LIMIT 3",
                                   '%' + tag['tagName'] + '%', eventId)
            # we group them in a list, excluding ones that are already in here
            for row in events:
                if row['eventId'] not in related_events:
                    related_events.append(row['eventId'])
        # we get the extra event info for each item in the related list and return
        return [self.getSpecificEvent(memberId, item) for item in related_events]
Answer 0 (score: 2)
You should be able to achieve this with a self-join, for example:
    SELECT DISTINCT e2.eventId
    FROM event_tags e1
    INNER JOIN event_tags e2
        ON e2.tagName LIKE CONCAT('%', e1.tagName, '%') AND e2.eventId != e1.eventId
    WHERE e1.eventId = %s
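To try the self-join above without a MySQL server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. The function name related_event_ids and the sample rows are made up for illustration, and SQLite's || operator is used in place of MySQL's CONCAT, so treat it as a demonstration of the idea rather than drop-in code for the original db wrapper.

```python
import sqlite3


def related_event_ids(conn, event_id):
    """Return the distinct ids of other events that share a similar tag."""
    # Self-join: e1 holds the tags of the given event, e2 holds candidate rows.
    sql = """
        SELECT DISTINCT e2.eventId
        FROM event_tags e1
        INNER JOIN event_tags e2
            ON e2.tagName LIKE '%' || e1.tagName || '%'
           AND e2.eventId != e1.eventId
        WHERE e1.eventId = ?
        ORDER BY e2.eventId
    """
    return [row[0] for row in conn.execute(sql, (event_id,))]


# Hypothetical sample data: event 1 is tagged "rock" and "jazz".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_tags (eventId INTEGER, tagName TEXT)")
conn.executemany(
    "INSERT INTO event_tags VALUES (?, ?)",
    [(1, "rock"), (1, "jazz"), (2, "rock"),
     (3, "hard rock"), (3, "jazz"), (4, "folk")],
)

related = related_event_ids(conn, 1)
print(related)  # [2, 3]
```

Event 3 matches both tags of event 1 but appears only once because of DISTINCT, and event 1 itself is filtered out by the join condition.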
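I notice that your second query has a LIMIT 3 clause. First, note that LIMIT without an ORDER BY clause does not produce predictable results. Here is a solution based on the window function ROW_NUMBER() (available in MySQL 8) that returns no more than 3 eventId values per matched tag: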
    SELECT DISTINCT eventId FROM (
        SELECT e2.eventId, ROW_NUMBER() OVER(PARTITION BY e1.tagName ORDER BY e2.eventId) rn
        FROM event_tags e1
        INNER JOIN event_tags e2
            ON e2.tagName LIKE CONCAT('%', e1.tagName, '%') AND e2.eventId != e1.eventId
        WHERE e1.eventId = %s
    ) t
    WHERE rn <= 3
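The ROW_NUMBER() approach above can be checked the same way. The sketch below again uses sqlite3 as a stand-in for MySQL 8 (it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added), partitions by the source tag so the cap applies per tag, and gives the derived table an alias. The function name related_event_ids_capped, the per_tag parameter, and the sample rows are hypothetical.

```python
import sqlite3


def related_event_ids_capped(conn, event_id, per_tag=3):
    """Return distinct related event ids, keeping at most per_tag rows per tag."""
    sql = """
        SELECT DISTINCT eventId FROM (
            SELECT e2.eventId,
                   ROW_NUMBER() OVER (
                       PARTITION BY e1.tagName ORDER BY e2.eventId
                   ) AS rn
            FROM event_tags e1
            INNER JOIN event_tags e2
                ON e2.tagName LIKE '%' || e1.tagName || '%'
               AND e2.eventId != e1.eventId
            WHERE e1.eventId = ?
        ) AS t
        WHERE rn <= ?
        ORDER BY eventId
    """
    return [row[0] for row in conn.execute(sql, (event_id, per_tag))]


# Hypothetical sample data: five other events share "rock", one shares "jazz".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_tags (eventId INTEGER, tagName TEXT)")
conn.executemany(
    "INSERT INTO event_tags VALUES (?, ?)",
    [(1, "rock"), (1, "jazz"),
     (2, "rock"), (3, "rock"), (4, "rock"), (5, "rock"), (6, "rock"),
     (9, "jazz")],
)

capped = related_event_ids_capped(conn, 1)
print(capped)  # [2, 3, 4, 9]
```

Because each partition is ordered by eventId, the three rows kept per tag are deterministic, which is the property the plain LIMIT 3 lacked.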