MySQL & regex: match only whole words, or skip URLs?

Time: 2011-02-22 15:09:50

Tags: php mysql regex

I use the query below for searching:

SELECT
pg_id AS ID, 
pg_url AS URL,
pg_title AS Title,
pg_content_1 AS Content_1,
pg_content_2 AS Content_2,
parent_id AS Parent_id,

EXTRACT(DAY FROM pg_created) AS Date,
EXTRACT(MONTH FROM pg_created) AS Month,
EXTRACT(YEAR FROM pg_created) AS Year

FROM root_pages

WHERE root_pages.pg_cat_id = '2'
AND root_pages.parent_id != root_pages.pg_id
AND root_pages.pg_hide != '1'
AND root_pages.pg_url != 'cms'
-- Parentheses are needed here because AND binds more tightly than OR.
AND (
    root_pages.pg_content_1 REGEXP '[[:<:]]".$search."[[:>:]]'
    OR root_pages.pg_content_2 REGEXP '[[:<:]]".$search."[[:>:]]'
)

ORDER BY root_pages.pg_created DESC

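It works fine, but I don't want it to look for the keyword inside URLs. For example:

If I search for the keyword "home", the query finds every match of "home" in URLs such as the ones below and returns them as results: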

http://epp.eurostat.ec.europa.eu/xx/eurostat/home/

http://ec.europa.eu/home-affairs/doc_centre/xx.pdf

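How can I fix the query so that it does not match keywords inside URLs, or so that it matches whole words only?

Thanks.

2 answers:

Answer 0 (score: 0)

Try using a strip_tags implementation for MySQL: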

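-- In the mysql client, wrap this statement in DELIMITER // ... // so the semicolons in the body are not treated as statement ends.
CREATE FUNCTION strip_tags(x LONGTEXT) RETURNS LONGTEXT
LANGUAGE SQL DETERMINISTIC NO SQL
BEGIN
    -- Signed INTs: with INT UNSIGNED, "sstart - 1" can raise an out-of-range error when sstart is 0.
    DECLARE sstart INT;
    DECLARE ends INT;
    SET sstart = LOCATE('<', x);
    -- WHILE (not REPEAT) so that text without any '<' is returned untouched.
    strip_loop: WHILE sstart > 0 DO
        SET ends = LOCATE('>', x, sstart);
        -- An unmatched '<' would otherwise loop forever; stop and keep the rest as-is.
        IF ends = 0 THEN
            LEAVE strip_loop;
        END IF;
        -- Cut out everything from '<' through the next '>'.
        SET x = CONCAT(SUBSTRING(x, 1, sstart - 1), SUBSTRING(x, ends + 1));
        SET sstart = LOCATE('<', x);
    END WHILE strip_loop;
    RETURN x;
END;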

(Found here.)

Your query would then look like this:

AND (
    STRIP_TAGS(root_pages.pg_content_1) REGEXP '[[:<:]]".$search."[[:>:]]'
    OR STRIP_TAGS(root_pages.pg_content_2) REGEXP '[[:<:]]".$search."[[:>:]]'
)
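
To see what stripping tags does and does not remove, here is a small check. It assumes the strip_tags function above has already been created, and the sample strings are made up for illustration. Only URLs that sit inside tag attributes disappear; a URL written out in the visible text is left in place and can still match the keyword.

```sql
-- Hypothetical sample content: a link whose href contains the keyword "home".
SELECT strip_tags('<a href="http://ec.europa.eu/home-affairs/">Policy documents</a>') AS stripped;
-- stripped: Policy documents

-- A URL in the visible text is not inside a tag, so it survives stripping.
SELECT strip_tags('<p>See http://ec.europa.eu/home-affairs/ for details</p>') AS stripped;
-- stripped: See http://ec.europa.eu/home-affairs/ for details
```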

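However, this implementation can be slow and unreliable, so I would suggest a different approach:

  1. Create a new column, e.g. search
  2. For every row in the table, insert the raw (= PHP strip_tags-ed) search data (possibly even a combination of the two columns you search on)
  3. Then use a query like SELECT col1, col2 FROM table WHERE search LIKE '%your_search_expression%', or even fulltext keys if you prefer. (Use LIKE rather than REGEXP, because it is faster.)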
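
The steps above might be sketched in SQL as follows. The column name pg_search and the index name ft_pg_search are placeholders, not part of the original schema. The FULLTEXT part assumes a storage engine that supports it (MyISAM, or InnoDB from MySQL 5.6 on). Step 2 is assumed to happen in application code, for example with PHP's strip_tags().

```sql
-- 1. Add a column that holds the tag-free search text (pg_search is a made-up name).
ALTER TABLE root_pages ADD COLUMN pg_search LONGTEXT;

-- 2. Fill pg_search from the application, e.g. PHP strip_tags() applied to
--    pg_content_1 and pg_content_2, whenever a row is inserted or updated.

-- 3a. Plain substring search on the new column.
SELECT pg_id, pg_title
FROM root_pages
WHERE pg_search LIKE '%home%';

-- 3b. Optional: full-text search instead of LIKE.
ALTER TABLE root_pages ADD FULLTEXT INDEX ft_pg_search (pg_search);

SELECT pg_id, pg_title
FROM root_pages
WHERE MATCH(pg_search) AGAINST ('home');
```

Note that LIKE '%home%' also matches inside longer words such as "homepage", whereas the full-text variant matches on whole words.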

Answer 1 (score: 0)

Edit

I found my own solution, and it seems to work fine because it skips columns whose text contains http://:

SELECT
pg_id AS ID, 
pg_url AS URL,
pg_title AS Title,
pg_content_1 AS Content_1,
pg_content_2 AS Content_2,
parent_id AS Parent_id,

EXTRACT(DAY FROM pg_created) AS Date,
EXTRACT(MONTH FROM pg_created) AS Month,
EXTRACT(YEAR FROM pg_created) AS Year

FROM root_pages

WHERE root_pages.pg_cat_id = '2'
AND root_pages.parent_id != root_pages.pg_id
AND root_pages.pg_hide != '1'
AND root_pages.pg_url != 'cms'
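-- Parentheses are needed here because AND binds more tightly than OR.
AND (
    root_pages.pg_content_1 LIKE '%".$search."%'
    OR root_pages.pg_content_2 LIKE '%".$search."%'
)

-- Exclude rows whose content contains a URL.
AND root_pages.pg_content_1 NOT LIKE '%http://%'
AND root_pages.pg_content_2 NOT LIKE '%http://%'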

ORDER BY root_pages.pg_created DESC
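
The query above discards every row whose content mentions http:// anywhere, even when the keyword also appears in ordinary text. A narrower option is to blank out URLs first and then match whole words on what remains. This sketch assumes MySQL 8.0.4 or later, where REGEXP_REPLACE is available and word boundaries are written as \b instead of [[:<:]] and [[:>:]]. The URL pattern is a rough approximation, and 'home' stands in for the properly escaped search term.

```sql
SELECT pg_id AS ID, pg_title AS Title
FROM root_pages
WHERE root_pages.pg_cat_id = '2'
AND (
    -- Replace anything that looks like a URL with a space, then require a whole-word match.
    REGEXP_REPLACE(root_pages.pg_content_1, 'https?://[^[:space:]<>"]+', ' ') REGEXP '\\bhome\\b'
    OR REGEXP_REPLACE(root_pages.pg_content_2, 'https?://[^[:space:]<>"]+', ' ') REGEXP '\\bhome\\b'
)
ORDER BY root_pages.pg_created DESC;
```

Running REGEXP_REPLACE on every row prevents index use, so on large tables the precomputed search column suggested in the other answer is likely the better fit.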