MySQL: best way to say "the string up to the first instance of any of the following keywords"?

Time: 2015-02-23 12:32:44

Tags: mysql regex locate

I can think of complicated and ugly ways to do this in MySQL, but I'm looking for a nice one. Say I have a bunch of school names, like

Meopham County Infant School
Speldhurst Nursery School
Rainbow Pre-School
The Annex School House
Fleet Learning Zone
Dartford Grammar School
Kiddliwinks
Hextable Kindergarten
The Rocking Horse Montessori Kinder
Little Angels Day Nursery

and I have a list of stop words:

["school", "primary", "nursery", "college", "junior", "church", "cofe", "community", "infant"]

I have a Ruby function, "short_name", which returns the school name up to, but not including, the first instance of any of the stop words, so that we get

"Bower Grove School" => "Bower Grove"
"Fulston Manor School" => "Fulston Manor"
"St Johns Church Hall Play" => "St Johns"
"St Botolph's Church of England Voluntary Aided Primary School" => "St Botolph's"
"Fawkham House School" => "Fawkham House"
"Silverdale Day Nursery" => "Silverdale Day"
"Vigo Village School" => "Vigo Village"
"Sevenoaks Primary School" => "Sevenoaks"
"High Weald Academy" => "High Weald Academy"
"The Ebbsfleet Academy" => "The Ebbsfleet Academy"

This all works fine. My question is: what is the simplest way to do the above string processing in MySQL?
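
For reference, the question does not show short_name itself, so here is a minimal sketch of what such a function might look like. The constant names and the word-boundary matching are assumptions, not the asker's actual code; it is only meant to pin down the behaviour that the MySQL expression needs to reproduce.

```ruby
# Hypothetical reconstruction: the real short_name is not shown in the question.
STOP_WORDS = %w[school primary nursery college junior church cofe community infant].freeze

# Case-insensitive match on any stop word as a whole word.
# The \b word boundaries are an assumption; drop them to match inside words too.
STOP_WORD_PATTERN = /\b(?:#{Regexp.union(STOP_WORDS).source})\b/i

def short_name(name)
  # partition returns [text before the match, the match, text after the match];
  # when nothing matches, the first element is the whole string.
  name.partition(STOP_WORD_PATTERN).first.strip
end
```

A name with no stop word, such as "High Weald Academy", comes back unchanged, which matches the examples above.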

For example, if I wanted to search by this short_name, I'd want to do something like

"select * from schools where <function(name)> = 'Bower Grove'"

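What is the simplest way to write <function>? I think some combination of substring() and locate() with a regex would be the way to go, but it looks like I can't use a regex with locate().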

I guess the regex would be

"school|primary|nursery|college|junior|church|cofe|community|infant"

Thanks, Max

1 answer:

Answer 0: (score: 2)

MySQL does support regular expressions. Unfortunately, it only supports them for matching.

Here is one method:

select least(substring_index(schoolname, ' School', 1),
             substring_index(schoolname, ' Primary', 1),
             . . .
            )

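This uses substring_index() to extract the first part of the string, before the delimiter. If the delimiter is not present, you get the whole string. The least() function then picks the shortest string: it returns the smallest value in string comparison, and because every candidate is a prefix of the same name, the smallest one is also the shortest.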

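This assumes that the keyword is preceded by a space. After all, you probably don't want to wipe out entirely a name that begins with a keyword, such as "School of Little Angels".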
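
Since the stop words already live in a Ruby array, the long least(...) expression could be generated rather than typed by hand. The following is only a sketch: short_name_sql is a hypothetical helper, and capitalizing each word is an assumption made because substring_index() matches its delimiter case-sensitively.

```ruby
STOP_WORDS = %w[school primary nursery college junior church cofe community infant].freeze

# Hypothetical helper: builds least(substring_index(col, ' School', 1), ...)
# for a trusted column name and a trusted list of stop words.
# Assumes at least two stop words, since least() needs two or more arguments,
# and that no stop word contains a quote character (nothing is escaped here).
def short_name_sql(column, stop_words = STOP_WORDS)
  parts = stop_words.map do |word|
    # Keep the leading space so a keyword only counts after a preceding word.
    # capitalize turns "cofe" into "Cofe", which will not match "CofE".
    "substring_index(#{column}, ' #{word.capitalize}', 1)"
  end
  "least(#{parts.join(', ')})"
end

# Example: the search from the question, using the asker's schools.name column.
query = "select * from schools where #{short_name_sql('name')} = 'Bower Grove'"
```

Names whose keywords are written in another case (for example "CofE" or all lowercase) would need their own delimiter entries, or the column and delimiters could both be lowercased before comparing.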