How to remove duplicates from a SQL query based on a single column

Time: 2013-04-16 23:54:26

Tags: mysql sql duplicates

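I need to modify a query. It currently returns search results (ads) based on the ad title and ad description: if any of the search terms is found in either the ad title or the ad description, the matching ads are returned.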

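I want to modify the query so that each ad title appears only once in the search results. In other words, if several ads matched by the search share the same ad title, only one ad with that title should be returned.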

$sql = "SELECT a.*, UNIX_TIMESTAMP(a.createdon) AS timestamp, ct.cityname,
                    COUNT(*) AS piccount, p.picfile,
                    scat.subcatname, cat.catid, cat.catname $xfieldsql
                FROM t_ads a
                    INNER JOIN t_cities ct ON a.cityid = ct.cityid
                    INNER JOIN t_subcats scat ON a.subcatid = scat.subcatid
                    INNER JOIN t_cats cat ON scat.catid = cat.catid
                    LEFT OUTER JOIN t_adxfields axf ON a.adid = axf.adid
                    LEFT OUTER JOIN t_adpics p ON a.adid = p.adid AND p.isevent = '0'
                    LEFT OUTER JOIN t_featured feat ON a.adid = feat.adid AND feat.adtype = 'A'
                WHERE $where
                    AND $visibility_condn
                    AND (feat.adid IS NULL OR feat.featuredtill < NOW())
                    $loc_condn
                GROUP BY a.adid
                ORDER BY a.createdon DESC
                LIMIT $offset, $ads_per_page";

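Edit: $where contains the search expression. If regex search is enabled it uses a regular expression, otherwise it uses LIKE. $searchsql contains the search terms entered by the user.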

if ($regex_search) {
    $where = "(a.adtitle RLIKE '[[:<:]]{$searchsql}[[:>:]]' OR a.addesc RLIKE '[[:<:]]{$searchsql}[[:>:]]')";
} else {
    $where = "(a.adtitle LIKE '$searchsql' OR a.addesc LIKE '$searchsql')";
}

1 answer:

Answer 0 (score: 3)

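The "correct" way to do this would be to fix the underlying problem by working out why the duplicates occur in the first place. It will have something to do with the JOINs, but I can't say more without seeing the data. However, if you want a quick(ish) and dirty way to remove the duplicates, you could try something like the following.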

Disclaimer: this is completely untested, so there is more than likely a bug or two in here, but hopefully no deal-breakers.

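-- Assumption: if $xfieldsql refers to columns through the alias axf, it must be adapted to axf2 here.
SELECT a2.*, UNIX_TIMESTAMP(a2.createdon) AS timestamp, ct2.cityname,
       COUNT(*) AS piccount, p2.picfile,
       scat2.subcatname, cat2.catid, cat2.catname $xfieldsql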
FROM
   (SELECT subq1.adtitle, MIN(subq1.adid) AS adid
    FROM
           (SELECT a.*, UNIX_TIMESTAMP(a.createdon) AS timestamp, ct.cityname,
                COUNT(*) AS piccount, p.picfile,
                scat.subcatname, cat.catid, cat.catname
            FROM t_ads a
                INNER JOIN t_cities ct ON a.cityid = ct.cityid
                INNER JOIN t_subcats scat ON a.subcatid = scat.subcatid
                INNER JOIN t_cats cat ON scat.catid = cat.catid
                LEFT OUTER JOIN t_adxfields axf ON a.adid = axf.adid
                LEFT OUTER JOIN t_adpics p ON a.adid = p.adid AND p.isevent = '0'
                LEFT OUTER JOIN t_featured feat ON a.adid = feat.adid AND feat.adtype = 'A'
            WHERE $where
                AND $visibility_condn
                AND (feat.adid IS NULL OR feat.featuredtill < NOW())
                $loc_condn
            GROUP BY a.adid) subq1
    GROUP BY subq1.adtitle) subq2
INNER JOIN t_ads a2 ON a2.adid = subq2.adid
INNER JOIN t_cities ct2 ON a2.cityid = ct2.cityid
INNER JOIN t_subcats scat2 ON a2.subcatid = scat2.subcatid
INNER JOIN t_cats cat2 ON scat2.catid = cat2.catid
LEFT OUTER JOIN t_adxfields axf2 ON a2.adid = axf2.adid
LEFT OUTER JOIN t_adpics p2 ON a2.adid = p2.adid AND p2.isevent = '0'
LEFT OUTER JOIN t_featured feat2 ON a2.adid = feat2.adid AND feat2.adtype = 'A'
GROUP BY a2.adid
ORDER BY a2.createdon DESC
LIMIT $offset, $ads_per_page

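This could be massively simplified and tidied up, e.g. by removing things from the subqueries, but I'm just giving the general idea to (hopefully) get you up and running...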

Explanation

subq2 simply groups by title and picks one adid from each group (MIN was chosen here, but MAX could be used instead).

subq1 is the original query with the ordering and limit removed, since the outer query applies those.

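The outer query joins to the de-duplicated IDs and then joins back to the ads table and the other tables (giving them different aliases) in order to select the same fields as the original query.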
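To make the pattern easier to see in isolation, here is a minimal sketch on a toy table. The table t_ads_demo, its three rows and the alias dedup are hypothetical and exist only for this illustration; only the column names adid, adtitle and createdon are borrowed from the question. It has been reduced to the bare "pick one id per title, then join back" idea and is not a drop-in replacement for the real query.

```sql
-- Hypothetical miniature of t_ads: only the columns needed for the illustration.
CREATE TABLE t_ads_demo (
    adid      INT PRIMARY KEY,
    adtitle   VARCHAR(100) NOT NULL,
    createdon DATETIME NOT NULL
);

-- Two ads share the title 'Red bike'; one ad has a unique title.
INSERT INTO t_ads_demo (adid, adtitle, createdon) VALUES
    (1, 'Red bike',   '2013-04-01 10:00:00'),
    (2, 'Red bike',   '2013-04-02 10:00:00'),
    (3, 'Blue couch', '2013-04-03 10:00:00');

-- Inner query: one adid per title (the role played by subq2 above).
-- Outer query: join back to fetch the full row for each surviving adid.
SELECT a.*
FROM (SELECT adtitle, MIN(adid) AS adid
      FROM t_ads_demo
      GROUP BY adtitle) dedup
INNER JOIN t_ads_demo a ON a.adid = dedup.adid
ORDER BY a.createdon DESC;
```

With this data the query returns the rows with adid 3 and adid 1, so the second 'Red bike' ad is dropped. Swapping MIN for MAX would keep adid 2 instead. In the real query, the search and visibility conditions would presumably belong in the inner query so that only matching ads take part in the de-duplication.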