How can I find records with duplicate values in any column, using ActiveRecord or SQL?
SELECT leads.id, leads.name, leads.email, leads.created_at, array_agg(tn2.id) as ids
FROM "leads" join leads tn2
on leads.name = tn2.name
or leads.cpf_cnpj = tn2.cpf_cnpj
or leads.email = tn2.email
or leads.phone -> 'cellphone' = tn2.phone -> 'cellphone'
or leads.phone -> 'residence' = tn2.phone -> 'residence'
or leads.phone -> 'commercial' = tn2.phone -> 'commercial'
GROUP BY leads.id ORDER BY leads.created_at DESC
With array_agg I only want the ids of the duplicated records, but it gives me ids from all the records.
I am currently using PostgreSQL.
Answer 0 (score: 1)
How to find records with duplicate values in any column?
SELECT l.id, l.name, l.email, l.created_at
FROM   leads l
WHERE  EXISTS (
   -- any other row sharing at least one of the listed values
   SELECT 1
   FROM   leads
   WHERE  id <> l.id
   AND   (
          name     = l.name
       OR cpf_cnpj = l.cpf_cnpj
       OR email    = l.email
       OR phone->'cellphone'  = l.phone->'cellphone'
       OR phone->'residence'  = l.phone->'residence'
       OR phone->'commercial' = l.phone->'commercial'
         )
   );
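But it seems you want something different:
How to get, for each row, an array of the IDs of all other rows that share a value in at least one of several given columns, most recently created first?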
SELECT l.id, l.name, l.email, l.created_at
     , array_agg(l2.id ORDER BY l2.created_at DESC NULLS LAST) AS dupe_ids
FROM   leads l
JOIN   leads l2 ON l2.id <> l.id   -- exclude the row itself
               AND (
                      l2.name     = l.name
                   OR l2.cpf_cnpj = l.cpf_cnpj
                   OR l2.email    = l.email
                   OR l2.phone->'cellphone'  = l.phone->'cellphone'
                   OR l2.phone->'residence'  = l.phone->'residence'
                   OR l2.phone->'commercial' = l.phone->'commercial'
                   )
GROUP  BY l.id
ORDER  BY l.created_at DESC NULLS LAST;
This assumes id is the primary key, so that GROUP BY l.id also covers the other selected columns of l.
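To see what the aggregating query returns, here is a small sketch with made-up data. The table definition is an assumption (the question does not show it), in particular that phone is jsonb, and all sample values are hypothetical.

```sql
-- Hypothetical table layout; column types are assumed, not taken from the question.
-- A TEMP table shadows any real "leads" table for this session only.
CREATE TEMP TABLE leads (
    id         serial PRIMARY KEY,
    name       text,
    cpf_cnpj   text,
    email      text,
    phone      jsonb,
    created_at timestamptz
);

INSERT INTO leads (name, cpf_cnpj, email, phone, created_at) VALUES
  ('Ana',   '111', 'ana@example.com',   '{"cellphone": "555-0001"}', '2024-01-01')
, ('Ana',   '222', 'other@example.com', '{"cellphone": "555-0002"}', '2024-01-02')  -- same name as id 1
, ('Bruno', '333', 'ana@example.com',   '{"cellphone": "555-0003"}', '2024-01-03')  -- same email as id 1
, ('Carla', '444', 'carla@example.com', '{"cellphone": "555-0004"}', '2024-01-04'); -- shares nothing

SELECT l.id
     , array_agg(l2.id ORDER BY l2.created_at DESC NULLS LAST) AS dupe_ids
FROM   leads l
JOIN   leads l2 ON l2.id <> l.id
               AND (
                      l2.name     = l.name
                   OR l2.cpf_cnpj = l.cpf_cnpj
                   OR l2.email    = l.email
                   OR l2.phone->'cellphone' = l.phone->'cellphone'
                   )
GROUP  BY l.id
ORDER  BY l.created_at DESC NULLS LAST;

--  id | dupe_ids
-- ----+----------
--   3 | {1}
--   2 | {1}
--   1 | {3,2}
```

The row for Carla is absent because the inner join drops rows without any match. Missing values never count as duplicates here, since comparing NULL with = does not evaluate to true. If phone were of type json rather than jsonb, the = comparison would not be defined; comparing the text values with ->> would be one way around that.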