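Uppercase a surname, excluding the lowercase prefix part of the surname

Date: 2017-01-20 15:22:22

Tags: php utf-8

I am trying to work out a way to uppercase a surname while leaving its lowercase prefix untouched.

Examples of names and their conversions: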

  • MacArthur -> MacARTHUR
  • McDavid -> McDAVID
  • LeBlanc -> LeBLANC
  • McIntyre -> McINTYRE
  • de Wit -> de WIT

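There are also surnames that need to be uppercased in full, so a simple function that looks for a prefix (e.g. strchr()) is not sufficient:

  • Macmaster -> MACMASTER
  • Macintosh -> MACINTOSH

The PHP function mb_strtoupper() is not suitable, because it uppercases the entire string. Likewise, strtoupper() is not suitable, and it also loses the accents on accented names.

There are a few answers around SO that partly answer this question, such as: Capitalization using PHP. However, their common shortcoming is the assumption that in every surname beginning with Mac, the Mac is followed by a capital letter.

The names are capitalized correctly in the database, so we can assume that a name spelled Macarthur is correct, and that MacArthur is correct for a different person.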

7 answers:

Answer 0 (score: 8)

Following the rule that everything from the last capital letter onward is uppercased:

preg_replace_callback('/\p{Lu}\p{Ll}+$/u', 
                      function ($m) { return mb_strtoupper($m[0]); },
                      $name)

\p{Lu} and \p{Ll} match Unicode uppercase and lowercase letters respectively, and mb_strtoupper is Unicode-aware. For a simple ASCII-only variant, this will do as well:

preg_replace_callback('/[A-Z][a-z]+$/', 
                      function ($m) { return strtoupper($m[0]); },
                      $name)
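
As a quick check of the Unicode-aware pattern above, here is a minimal sketch that wraps it in a function and runs it on a few of the names from the question. The function name uppercaseAfterLastCapital is hypothetical, and the explicit 'UTF-8' argument assumes the names are stored as UTF-8.

```php
<?php
// Hypothetical helper (the name is not from the answer): uppercase the
// trailing "capital letter + lowercase letters" run of a surname.
function uppercaseAfterLastCapital(string $name): string
{
    $result = preg_replace_callback(
        '/\p{Lu}\p{Ll}+$/u',
        function (array $m): string {
            return mb_strtoupper($m[0], 'UTF-8');
        },
        $name
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8);
    // fall back to the name as stored in that case.
    return $result ?? $name;
}

foreach (['MacArthur', 'McDavid', 'de Wit', 'Macmaster', 'Lindström'] as $name) {
    echo $name, ' -> ', uppercaseAfterLastCapital($name), "\n";
}
```

Note that a name with no capital letter at all (for example johnson) does not match the pattern and is returned unchanged, as is a name that ends in anything other than lowercase letters, such as a trailing space.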

Answer 1 (score: 2)

I believe this is a solution to the problem:

$names = array(
    'MacArthur',
    'Macarthur',
    'ÜtaTest',
    'de Wit'
);

// The "u" modifier makes \p{Lu} test whole UTF-8 characters rather than single bytes
$pattern = '~(?<prefix>(?:\p{Lu}.+|.+\s+))(?<suffix>\p{Lu}.*)~u';
foreach ($names as $key => $name) {
    if (preg_match($pattern, $name, $matches)) {
        $names[$key] = $matches['prefix'] . mb_strtoupper($matches['suffix']);
    } else {
        $names[$key] = mb_strtoupper($name);
    }
}

print_r($names);

It produces the following result for the input array above:

Array
(
    [0] => MacARTHUR
    [1] => MACARTHUR
    [2] => ÜtaTEST
    [3] => de WIT
)

A brief explanation of the regular expression:

(?<prefix>             # named capture group "prefix"
   (?:                 # non-capturing group
       \p{Lu}.+        # an uppercase letter followed by one or more characters
       |               # OR
       .+\s+           # one or more characters followed by whitespace
   )
)
(?<suffix>             # named capture group "suffix"
    \p{Lu}.*           # an uppercase letter followed by any remaining characters
)
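
To try the pattern on other names, it can be wrapped in a small function. The sketch below is an illustration rather than the answer's own code. The function name uppercaseSurnameSuffix is hypothetical; the pattern is run in UTF-8 mode (the u modifier, assuming UTF-8 input) and is additionally anchored with ^.

```php
<?php
// Hypothetical wrapper around the prefix/suffix pattern described above.
function uppercaseSurnameSuffix(string $name): string
{
    // Assumptions of this sketch: "u" for UTF-8 input, "^" to anchor at the start.
    $pattern = '~^(?<prefix>(?:\p{Lu}.+|.+\s+))(?<suffix>\p{Lu}.*)~u';

    if (preg_match($pattern, $name, $matches)) {
        // Keep the prefix as stored and uppercase from the last capital onward.
        return $matches['prefix'] . mb_strtoupper($matches['suffix'], 'UTF-8');
    }

    // No prefix found: uppercase the whole surname.
    return mb_strtoupper($name, 'UTF-8');
}

foreach (['MacArthur', 'Macarthur', 'ÜtaTest', 'de Wit', 'Lindström'] as $name) {
    echo $name, ' -> ', uppercaseSurnameSuffix($name), "\n";
}
```

The ^ anchor is there so that the match always starts at the first character; without it, a match that begins later in the string would cause the text before it to be dropped from the result.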

Answer 2 (score: 2)

Here's a basic algorithm that avoids cryptic regular expressions:

  1. Create a multibyte-safe character array for the literal surname (as it exists in the database).
  2. Create a second character array in multibyte-safe capitalized form.
  3. Intersect both arrays to determine the index of the final capitalized character.
  4. Concatenate the literal surname through the index with the capitalized form after the index.

In code form:

<?php
$names = [
    'MacArthur',
    'McDavid',
    'LeBlanc',
    'McIntyre',
    'de Wit',
    'Macmaster',
    'Macintosh',
    'MacMac',
    'die Über',
    'Van der Beek',
    'johnson',
    'Lindström',
    'Cehlárik',
];

// Uppercase after the last capital letter
function normalizeSurname($name) {
    // Split surname into a Unicode character array
    $chars = preg_split('//u', $name, -1, PREG_SPLIT_NO_EMPTY);

    // Capitalize surname and split into a character array
    $name_upper = mb_convert_case($name, MB_CASE_UPPER);
    $chars_upper = preg_split('//u', $name_upper, -1, PREG_SPLIT_NO_EMPTY);

    // Find the index of the last capitalized letter (0 if there is none)
    $capital_idxs = array_keys(array_intersect($chars, $chars_upper));
    $last_capital_idx = $capital_idxs ? end($capital_idxs) : 0;

    // Concatenate the literal surname up to the index, and capitalized surname thereafter
    return mb_substr($name, 0, $last_capital_idx) . mb_substr($name_upper, $last_capital_idx);
}

// Loop through the surnames and display in normalized form
foreach($names as $name) {
    echo sprintf("%s -> %s\n", 
        $name,
        normalizeSurname($name)
    );
}

You'll get output like:

MacArthur -> MacARTHUR
McDavid -> McDAVID
LeBlanc -> LeBLANC
McIntyre -> McINTYRE
de Wit -> de WIT
Macmaster -> MACMASTER
Macintosh -> MACINTOSH
MacMac -> MacMAC
die Über -> die ÜBER
Van der Beek -> Van der BEEK
johnson -> JOHNSON
Lindström -> LINDSTRÖM
Cehlárik -> CEHLÁRIK

This makes the assumption that an entirely lowercase surname should be capitalized. It would be easy to change that behavior.
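
As one way of changing that behavior, the sketch below returns an entirely lowercase surname unchanged. It is a variation on the function above, not the answer's own code: the name normalizeSurnameKeepLower is hypothetical, and it tests each character against \p{Lu} instead of using array_intersect().

```php
<?php
// Hypothetical variant: uppercase from the last capital letter onward,
// but leave a surname without any capital letter exactly as stored.
function normalizeSurnameKeepLower(string $name): string
{
    // Split the surname into an array of Unicode characters
    // (empty array if the input is not valid UTF-8).
    $chars = preg_split('//u', $name, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // Walk backwards to find the last character that is a capital letter.
    $last_capital_idx = null;
    for ($i = count($chars) - 1; $i >= 0; $i--) {
        if (preg_match('/^\p{Lu}$/u', $chars[$i])) {
            $last_capital_idx = $i;
            break;
        }
    }

    // No capital letter found: return the surname untouched.
    if ($last_capital_idx === null) {
        return $name;
    }

    // Literal surname up to the index, uppercased surname from the index on.
    return mb_substr($name, 0, $last_capital_idx, 'UTF-8')
        . mb_strtoupper(mb_substr($name, $last_capital_idx, null, 'UTF-8'), 'UTF-8');
}

foreach (['MacArthur', 'die Über', 'Macmaster', 'johnson'] as $name) {
    echo $name, ' -> ', normalizeSurnameKeepLower($name), "\n";
}
```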

Answer 3 (score: 1)

  $string = "McBain";
  // Match the final capitalized word, plus any trailing horizontal whitespace
  preg_match('/([A-Z][a-z]+\h*)$/', $string, $matches);
  if (!empty($matches[1])) {
      // Replace only the last occurrence of the matched text
      // (str_replace() would replace every occurrence)
      $pos = strrpos($string, $matches[1]);
      if ($pos !== false) {
          $upperString = substr_replace($string, strtoupper($matches[1]), $pos, strlen($matches[1]));
      }
  } else {
      // No match found: uppercase the whole string
      $upperString = strtoupper($string);
  }
  print $upperString;

Example output:

$string = "McBain ";
$upperString = "McBAIN";

$string = "Mac Hartin";
$upperString = "Mac HARTIN";

$string = "Macaroni ";
$upperString = "MACARONI";

$string = "jacaroni";
$upperString = "JACARONI";

$string = "MacMac";
$upperString = "MacMAC";

(I also added \h* to the regex to catch any trailing whitespace.)

Reference for find/replace of the last occurrence

Answer 4 (score: 0)

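<?php
$string = "MacArthur";
$count = 0;      // number of uppercase letters seen so far
$finished = "";
// str_split() and ctype_upper() work on bytes, so this handles ASCII names only
$chars = str_split($string);
foreach ($chars as $char) {
    if (ctype_upper($char)) {
        $count++;
    }
    // Uppercase characters while exactly two capital letters have been seen
    if ($count == 2) {
        $finished .= strtoupper($char);
    } else {
        $finished .= $char;
    }
}
echo $finished;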

Answer 5 (score: 0)

Here is code that uppercases everything from the last capital letter in the string onward.

preg_replace_callback('/[A-Z][^A-Z]+$/', function($match) {
  return strtoupper($match[0]);
}, $str);

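Try it with the test examples from the question: https://repl.it/NYcR/5

Answer 6 (score: 0)

Just to be different from the other answers, you could try something like this.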

$names = array(
    'MacArthur',
    'Macarthur',
    'ÜtaTest',
    'de Wit'
);
function fixSurnameA($item) {
    $lname = mb_strtolower($item);
    $nameArrayA = str_split($item, 1);
    $nameArrayB = str_split($lname, 1);
    // Bytes of the original that do not occur in the lowercased name mark the capitals
    $result = array_diff($nameArrayA, $nameArrayB);
    $keys = array_keys($result);
    $key = max($keys);
    // Leave the name alone when its only capital is the first letter
    if (count($keys) >= 2 or (count($keys) == 1 and $key > 0)) {
        $pre = substr($item, 0, $key);
        $suf = mb_strtoupper(substr($item, $key));
        echo $pre.$suf."\n";
    } else {
        echo $item."\n";
    }
}
// Same as fixSurnameA, except that a name whose only capital is its
// first letter is uppercased in full
function fixSurnameB($item) {
    $lname = mb_strtolower($item);
    $nameArrayA = str_split($item, 1);
    $nameArrayB = str_split($lname, 1);
    $result = array_diff($nameArrayA, $nameArrayB);
    $keys = array_keys($result);
    $key = max($keys);
    $pre = substr($item, 0, $key);
    $suf = mb_strtoupper(substr($item, $key));
    echo $pre.$suf."\n";
}

array_walk($names,'fixSurnameA');
/* MacARTHUR
   Macarthur
   ÜtaTEST
   de WIT 
*/
array_walk($names,'fixSurnameB');
/* MacARTHUR
   MACARTHUR
   ÜtaTEST
   de WIT 
*/

Tested on PHP SandBox