Mysql德语口音在全文搜索中不敏感搜索

时间:2010-04-27 14:28:19

标签: mysql encoding full-text-search collation diacritics

我们有一个示例酒店表:

CREATE TABLE `hotels` (
  `HotelNo` varchar(4) character set latin1 NOT NULL default '0000',
  `Hotel` varchar(80) character set latin1 NOT NULL default '',
  `City` varchar(100) character set latin1 default NULL,
  `CityFR` varchar(100) character set latin1 default NULL,
  `Region` varchar(50) character set latin1 default NULL,
  `RegionFR` varchar(100) character set latin1 default NULL,
  `Country` varchar(50) character set latin1 default NULL,
  `CountryFR` varchar(50) character set latin1 default NULL,
  `HotelText` text character set latin1,
  `HotelTextFR` text character set latin1,
  `tagsforsearch` text character set latin1,
  `tagsforsearchFR` text character set latin1,
  PRIMARY KEY  (`HotelNo`),
  FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;

在此表中,我们只有一家酒店,地区名称=“Graubünden”(请注意umlautü字符)

现在我想为短语实现相同的搜索匹配: 'graubunden'和 '格劳宾登'

使用内置的MySql很简单 常规搜索中的排序规则如下:

SELECT *  
FROM `hotels` 
WHERE `Region` LIKE CONVERT(_utf8 '%graubunden%' USING latin1) 
COLLATE latin1_german1_ci

这适用于'graubunden'和'graubünden'和 结果我得到了适当的结果,但问题是 当我们进行MySQL全文搜索时

这条SQL语句有什么问题?:

SELECT 
 *
FROM 
 hotels 
WHERE 
 MATCH (`HotelNo`,`Hotel`,`Address`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST( CONVERT('+graubunden' USING latin1)  COLLATE latin1_german1_ci IN BOOLEAN MODE)            
ORDER BY Country ASC, Region ASC, City ASC

这不会返回任何结果。 狗被埋葬的任何想法?

1 个答案:

答案 0 :(得分:3)

为列定义单个CHARACTER SETS时,将覆盖在表级别设置默认值的排序规则。

每个列都有默认的latin1归类(latin1_swedish_ci)。您可以通过运行SHOW CREATE TABLE来查看它。

FULLTEXT次查询中,索引列的COERCIBILITY0,即所有全文查询都会转换为索引中使用的排序规则,反之亦然。

您需要从列中删除CHARACTER SET个定义,或明确将所有列设置为latin1_german_ci

CREATE TABLE `hotels` (
  `HotelNo` varchar(4) NOT NULL default '0000',
  `Hotel` varchar(80) NOT NULL default '',
  `City` varchar(100) default NULL,
  `CityFR` varchar(100) default NULL,
  `Region` varchar(50) default NULL,
  `RegionFR` varchar(100) default NULL,
  `Country` varchar(50) default NULL,
  `CountryFR` varchar(50) default NULL,
  `HotelText` text,
  `HotelTextFR` text,
  `tagsforsearch` text,
  `tagsforsearchFR` text,
  PRIMARY KEY  (`HotelNo`),
  FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;

INSERT
INTO    hotels (hotelText, HotelTextFR, tagsforsearch, tagsforsearchFR)
VALUES  ('text', 'text', 'graubünden', 'tags');

SELECT  *
FROM    hotels
WHERE   MATCH (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST (CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY
        Country ASC, Region ASC, City ASC;