RegEx to match specific words unless it is the last word in the sentence (titleize)

Asked: 2012-07-22 10:47:41

Tags: php regex

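I am capitalizing every word and then lowercasing small words such as "a". The first and last words should stay capitalized. I tried using \s instead of \b, which caused some strange problems. I also tried [^$], but that does not seem to mean "not end of string".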

function titleize($string){
  return ucfirst(
     preg_replace("/\b(A|Of|An|At|The|With|In|To|And|But|Is|For)\b/uie",
     "strtolower('$1')", 
     ucwords($string))
  );
}

This is the only failing test I am trying to fix. The final "in" should stay capitalized.

titleize("gotta give up, gotta give in");
//Gotta Give Up, Gotta Give In

These tests pass:

titleize('if i told you this was killing me, would you stop?');
//If I Told You This Was Killing Me, Would You Stop?

titleize("we're at the top of the world (to the simple two)");
//We're at the Top of the World (to the Simple Two)

titleize("and keep reaching for those stars");
//And Keep Reaching for Those Stars

3 answers:

Answer 0 (score: 1)

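You apply ucwords() before sending the string to the regex replace, and then ucfirst() again after the regex returns (for listed words that appear at the start of the string). This can be shortened by relying on a convention: the first and last words of the string are never surrounded by whitespace on both sides. Using this convention, we can use a regex like '/(?<=\s)( ... )(?=\s)/'. This simplifies your function somewhat: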

function titleize2($str) {
 $NoUc = Array('A','Of','An','At','The','With','In','To','And','But','Is','For');
 $reg = '/(?<=\s)('      # set lowercase only if surrounded by whitespace
      . join('|', $NoUc) # add OR'ed list of words
      . ')(?=\s)/e';     # set regex-eval mode
 return preg_replace( $reg, 'strtolower("\\1")', ucwords($str) );
}

If you test it with:

...
$Strings = Array('gotta give up, gotta give in',
                 'if i told you this was killing me, would you stop?',
                 'we\'re at the top of the world (to the simple two)',
                 'and keep reaching for those stars');

foreach ($Strings as $s)
   print titleize2($s) . "\n";
...

...this returns the correct results.
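The /e modifier used above was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, so on current PHP versions the same lookaround idea needs a callback. The following is a minimal sketch of that, not the answerer's own code; the name titleize2_cb is hypothetical.

```php
<?php
// Sketch: same whitespace-lookaround idea as titleize2(), but using
// preg_replace_callback() instead of the removed /e modifier.
// titleize2_cb is a hypothetical name chosen for this example.
function titleize2_cb(string $str): string {
    $noUc = ['A', 'Of', 'An', 'At', 'The', 'With', 'In', 'To', 'And', 'But', 'Is', 'For'];
    // Lowercase a listed word only when whitespace sits on both sides of it.
    $reg = '/(?<=\s)(' . implode('|', $noUc) . ')(?=\s)/';
    return preg_replace_callback(
        $reg,
        function (array $m): string {
            return strtolower($m[1]);
        },
        ucwords($str)
    );
}

echo titleize2_cb('gotta give up, gotta give in'), "\n";
// Gotta Give Up, Gotta Give In
```

Because the first and last words never have whitespace on both sides, they are left as ucwords() produced them, which is why no extra ucfirst() call is needed here.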

Answer 1 (score: 0)

Try this regex:

/\b(A|Of|An|At|The|With|In|To|And|But|Is|For)(?!$)\b/uie

The negative lookahead (?!$) excludes matches that are immediately followed by the end of the string.
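To see why [^$] from the question behaves differently from (?!$), a small comparison may help. Inside a character class, $ is a literal dollar sign, so [^$] consumes one character that is not "$". The lookahead consumes nothing and only checks the position. The subjects below are made up for illustration.

```php
<?php
// [^$] must consume a character, and that character must not be a literal "$".
$a = preg_match('/\bin\b[^$]/', 'in$ the end');   // 0: the next character is "$"

// (?!$) is zero-width: it only asserts "this is not the end of the subject".
$b = preg_match('/\bin\b(?!$)/', 'in$ the end');  // 1: "in" is not at the end
$c = preg_match('/\bin\b(?!$)/', 'give in');      // 0: "in" is at the end

// Because [^$] consumes a character, a replacement swallows it as well.
$d = preg_replace('/\bin\b[^$]/', 'X', 'in the end'); // "Xthe end"

var_dump($a, $b, $c, $d);
```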

Answer 2 (score: 0)

Adding a negative lookahead for the end of the line, (?!$), should do what you want:

function titleize($string){
  return ucfirst(
     preg_replace("/\b(A|Of|An|At|The|With|In|To|And|But|Is|For)\b(?!$)/uie",
     "strtolower('$1')", 
     ucwords(inflector::humanize($string)))
  );
}
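
For PHP 7 and later, where the /e modifier no longer exists, the same lookahead can be combined with preg_replace_callback(). This is a sketch under assumptions: titleize_cb is a hypothetical name, and the inflector::humanize() call is left out because that helper is not defined in this thread.

```php
<?php
// Sketch only: titleize_cb is a hypothetical name. The inflector::humanize()
// step from the answer above is omitted because it is not defined here.
function titleize_cb(string $string): string {
    $small = 'A|Of|An|At|The|With|In|To|And|But|Is|For';
    return ucfirst(preg_replace_callback(
        // \b(?!$): a listed word whose end is not the end of the string.
        '/\b(' . $small . ')\b(?!$)/ui',
        function (array $m): string {
            return strtolower($m[1]);
        },
        ucwords($string)
    ));
}

echo titleize_cb('gotta give up, gotta give in'), "\n";
// Gotta Give Up, Gotta Give In
```

Note that the lookahead only checks for the literal end of the string. If the last word is followed by punctuation, as in "gotta give in?", the word is no longer at the end and would still be lowercased.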