I am using PHP and MySQL. Can anyone tell me an efficient way to filter out duplicate results based on priority?
Example:
I have a table:
ID | Priority 1 | Priority 2 | Priority 3 | E-Mail
--------------------------------------------------------------
1  | Apple      | One        | Low        | abc@abc.com
2  | Banana     | Two        | Medium     | def@abc.com
3  | Banana     | Two        | High       | def@abc.com
4  | Banana     | Two        | High       | def@abc.com
5  | Peach      | Three      | Low        | ghi@abc.com
6  | Peach      | Four       | High       | ghi@abc.com
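In the example above, I am looking for a way to get only rows 1, 3 (or 4) and 6.
That is, rows 2, 3, 4 and rows 5, 6 are duplicate records because they share the same e-mail address. I want to choose which record to keep based on the priorities.
If the duplicate records have the same Priority 1, I move on to Priority 2. If that is also the same, I move on to Priority 3. If that is the same too, it does not matter which one I pick.
However, if there is a difference, I pick the record with the higher priority.
In the example above, the priority orders (highest first) are: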
Peach -> Banana -> Apple
Four -> Three -> Two -> One
High -> Medium -> Low
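The rule described above amounts to comparing rows lexicographically on three ranks. Here is a minimal sketch of that rule in plain Python, only to pin down the expected result; the names `P1_RANK`, `P2_RANK`, `P3_RANK`, `rank` and `pick_best_per_email` are made up for this illustration, and the rows are copied from the example table.

```python
# Hypothetical rank tables mirroring the orderings above (lower number = higher priority).
P1_RANK = {"Peach": 0, "Banana": 1, "Apple": 2}
P2_RANK = {"Four": 0, "Three": 1, "Two": 2, "One": 3}
P3_RANK = {"High": 0, "Medium": 1, "Low": 2}

# (ID, Priority 1, Priority 2, Priority 3, E-Mail), copied from the example table.
ROWS = [
    (1, "Apple", "One", "Low", "abc@abc.com"),
    (2, "Banana", "Two", "Medium", "def@abc.com"),
    (3, "Banana", "Two", "High", "def@abc.com"),
    (4, "Banana", "Two", "High", "def@abc.com"),
    (5, "Peach", "Three", "Low", "ghi@abc.com"),
    (6, "Peach", "Four", "High", "ghi@abc.com"),
]


def rank(row):
    """Return the (Priority 1, Priority 2, Priority 3) ranks of a row as a tuple."""
    _id, p1, p2, p3, _email = row
    return (P1_RANK[p1], P2_RANK[p2], P3_RANK[p3])


def pick_best_per_email(rows):
    """Keep one row per e-mail: the one whose rank tuple compares lowest."""
    best = {}
    for row in rows:
        email = row[4]
        # Tuples compare element by element, which is exactly the
        # "Priority 1, then Priority 2, then Priority 3" rule.
        if email not in best or rank(row) < rank(best[email]):
            best[email] = row
    return sorted(best.values())


kept_ids = [row[0] for row in pick_best_per_email(ROWS)]
print(kept_ids)
```

Rows 3 and 4 tie on all three priorities, so this sketch simply keeps whichever it sees first.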
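I will then insert the results into a different database.
So far I have one query that fetches the non-duplicates. I am thinking of having a second query to handle the duplicates. The first query handles about 20,000 records; the second would handle about 5,000 records.
However, I am not sure of an efficient way to achieve this.
I would greatly appreciate any help.
Thanks.
Edit: typo: I want rows 1, 3/4 and 6 (not 1, 2 and 6).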
Answer 0 (score: 0)
This query should give you the result you need:
    SELECT
      MIN(ID),
      EMail,
      MIN(Priority1),
      MIN(Priority2),
      MIN(Priority3)
    FROM
      yourtable
    WHERE
      -- step 3: keep rows that also have the best Priority3 for their e-mail
      (EMail, Priority1, Priority2, FIELD(Priority3, 'High', 'Medium', 'Low')) IN (
        SELECT
          EMail,
          MIN(Priority1),
          MIN(Priority2),
          MIN(FIELD(Priority3, 'High', 'Medium', 'Low')) MinP3
        FROM
          yourtable
        WHERE
          -- step 2: among the step-1 rows, keep those with the best Priority2
          (EMail, Priority1, FIELD(Priority2, 'Four', 'Three', 'Two', 'One')) IN (
            SELECT
              EMail,
              MIN(Priority1),
              MIN(FIELD(Priority2, 'Four', 'Three', 'Two', 'One')) MinP2
            FROM
              yourtable
            WHERE
              -- step 1: keep rows with the best Priority1 for their e-mail
              (EMail, FIELD(Priority1, 'Peach', 'Banana', 'Apple')) IN (
                SELECT
                  EMail, MIN(FIELD(Priority1, 'Peach', 'Banana', 'Apple')) MinP1
                FROM
                  yourtable
                GROUP BY
                  EMail)
            GROUP BY
              EMail)
        GROUP BY
          EMail)
    GROUP BY
      EMail
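(I am returning row 3 instead of row 2, but if I understand your question correctly, that should be right.) See the fiddle here. I suspect it won't be very fast. I am still wondering whether there is a way to make it faster.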
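For anyone unfamiliar with FIELD(): it returns the 1-based position of its first argument in the list that follows, or 0 when the value is not in the list, which is why MIN(FIELD(...)) picks the highest priority. The Python function below is only a rough stand-in I wrote to illustrate that behaviour (it ignores MySQL collation rules); `field` is not a real library call.

```python
def field(value, *candidates):
    """Rough stand-in for MySQL FIELD(): 1-based position of value, or 0 if absent."""
    for position, candidate in enumerate(candidates, start=1):
        if value == candidate:
            return position
    return 0


# Priority3 values of rows 2, 3 and 4 (the def@abc.com duplicates).
p3_ranks = [field(p, "High", "Medium", "Low") for p in ("Medium", "High", "High")]
best_p3 = min(p3_ranks)  # smallest position = highest priority

# A value missing from the list ranks 0, so MIN() would treat it as the best one.
unknown_rank = field("Urgent", "High", "Medium", "Low")

print(p3_ranks, best_p3, unknown_rank)
```

The last case is worth keeping in mind: if a priority column can contain values that are not in the FIELD() list, those rows would win under MIN().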
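Edit
You could try the following query. It uses different logic, and it also uses a Priorities table with some indexed columns, which should be much faster than the FIELD function, although the many joins may slow the query down somewhat.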
    CREATE TABLE Priorities (
      Num INT,
      Des VARCHAR(10),
      Priority INT,
      PRIMARY KEY (Num, Des)
    );

    INSERT INTO Priorities VALUES
      (1, 'Peach', 1),
      (1, 'Banana', 2),
      (1, 'Apple', 3),
      (2, 'Four', 1),
      (2, 'Three', 2),
      (2, 'Two', 3),
      (2, 'One', 4),
      (3, 'High', 1),
      (3, 'Medium', 2),
      (3, 'Low', 3);

    SELECT MIN(ID), yourtable.Email, MIN(Priority1) Priority1, MIN(Priority2) Priority2, MIN(Priority3) Priority3
    FROM
      yourtable
      INNER JOIN Priorities p1 ON yourtable.Priority1=p1.Des AND p1.Num=1
      INNER JOIN Priorities p2 ON yourtable.Priority2=p2.Des AND p2.Num=2
      INNER JOIN Priorities p3 ON yourtable.Priority3=p3.Des AND p3.Num=3
      INNER JOIN (
        SELECT s1.EMail, MIN(MinP1) M1, MIN(MinP2) M2, MIN(MinP3) M3
        FROM (
          SELECT EMail, MIN(p1.Priority) MinP1
          FROM yourtable INNER JOIN Priorities p1
            ON yourtable.Priority1 = p1.Des AND p1.Num = 1
          GROUP BY EMail) s1
        INNER JOIN (
          SELECT EMail, p1.Priority Pr1, MIN(p2.Priority) MinP2
          FROM yourtable INNER JOIN Priorities p1
            ON yourtable.Priority1 = p1.Des AND p1.Num = 1
          INNER JOIN Priorities p2
            ON yourtable.Priority2 = p2.Des AND p2.Num = 2
          GROUP BY EMail, p1.Priority) s2
          ON s1.EMail=s2.EMail AND s1.MinP1=s2.Pr1
        INNER JOIN (
          SELECT EMail, p1.Priority Pr1, p2.Priority Pr2, MIN(p3.Priority) MinP3
          FROM yourtable INNER JOIN Priorities p1
            ON yourtable.Priority1 = p1.Des AND p1.Num = 1
          INNER JOIN Priorities p2
            ON yourtable.Priority2 = p2.Des AND p2.Num = 2
          INNER JOIN Priorities p3
            ON yourtable.Priority3 = p3.Des AND p3.Num = 3
          GROUP BY EMail, p1.Priority, p2.Priority) s3
          ON s1.Email=s3.Email AND s1.MinP1=s3.Pr1 AND s2.MinP2=s3.Pr2
        GROUP BY
          s1.EMail) s
      ON yourtable.EMail=s.Email
        AND p1.Priority=s.M1
        AND p2.Priority=s.M2
        AND p3.Priority=s.M3
    GROUP BY
      yourtable.EMail
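See the fiddle here. If it is still too slow, we could try the first query with a support table, as in the second one. Or we could split the query into two parts.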
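One possible way to split the work in two, sketched under some assumptions: let the database only join to the Priorities table and sort, then keep the first row per e-mail in application code before inserting into the other database. The sketch uses Python with an in-memory SQLite database purely as a stand-in for PHP and MySQL so that it is self-contained; `RANKED_SQL` and the loop are my own illustration, not something I have timed against the queries above. The SQL itself uses only plain joins and ORDER BY.

```python
import sqlite3

# In-memory SQLite is an assumption made so the sketch runs anywhere;
# in practice this would be the existing MySQL connection.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourtable (
  ID INTEGER PRIMARY KEY, Priority1 TEXT, Priority2 TEXT, Priority3 TEXT, EMail TEXT
);
INSERT INTO yourtable VALUES
  (1, 'Apple',  'One',   'Low',    'abc@abc.com'),
  (2, 'Banana', 'Two',   'Medium', 'def@abc.com'),
  (3, 'Banana', 'Two',   'High',   'def@abc.com'),
  (4, 'Banana', 'Two',   'High',   'def@abc.com'),
  (5, 'Peach',  'Three', 'Low',    'ghi@abc.com'),
  (6, 'Peach',  'Four',  'High',   'ghi@abc.com');
CREATE TABLE Priorities (
  Num INT, Des VARCHAR(10), Priority INT, PRIMARY KEY (Num, Des)
);
INSERT INTO Priorities VALUES
  (1, 'Peach', 1), (1, 'Banana', 2), (1, 'Apple', 3),
  (2, 'Four', 1), (2, 'Three', 2), (2, 'Two', 3), (2, 'One', 4),
  (3, 'High', 1), (3, 'Medium', 2), (3, 'Low', 3);
""")

# Part 1: the database ranks every row and sorts so that, within each e-mail,
# the highest-priority row comes first (ID breaks exact ties).
RANKED_SQL = """
SELECT t.ID, t.EMail, t.Priority1, t.Priority2, t.Priority3
FROM yourtable t
JOIN Priorities p1 ON p1.Num = 1 AND p1.Des = t.Priority1
JOIN Priorities p2 ON p2.Num = 2 AND p2.Des = t.Priority2
JOIN Priorities p3 ON p3.Num = 3 AND p3.Des = t.Priority3
ORDER BY t.EMail, p1.Priority, p2.Priority, p3.Priority, t.ID
"""

# Part 2: the application keeps only the first row seen for each e-mail.
winners = []
last_email = None
for row in conn.execute(RANKED_SQL):
    if row[1] != last_email:
        winners.append(row)
        last_email = row[1]

winner_ids = [row[0] for row in winners]
print(winner_ids)
```

Because the rows arrive already sorted, the second part is a single pass with constant memory per e-mail, so it should not matter much whether 5,000 or 25,000 rows go through it. Whether the sort is faster than the nested subqueries on your data is something only a test on the real table can tell.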