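Regex to get the first and last name, ignoring middle names

Time: 2015-05-09 13:41:17

Tags: javascript regex

I'm looking for a regular expression that gives me the first and last name from a string containing a full name.

I have searched, but couldn't find anything that fits my needs. For example: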

  • Abc Def Ghi Jkl ---> Abc Jkl
  • Aéc Def Gài Mkl ---> Aéc Mkl
  • Aéc-Def Gài Mkl ---> Aéc-Def Mkl
  • Aéc Def Gài-Mkl ---> Aéc Gài-Mkl
  • Afd ---> AFD

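How can I build a regex that returns what is on the right when given the string on the left?

5 answers:

Answer 0: (score: 2)

For your specific case, where you have different characters, you have to change the regex slightly to meet your needs. Here is one that does what you need: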

^([\w-éà]+)[^\w-éà].*?[^\w-éà]([\w-éà]+)$|^([\w-éà]+)$

Tested on regex101.com.

Explanation:

We have to split the regex into two parts to make it easier to understand:

^([\w-éà]+)[^\w-éà].*?[^\w-éà]([\w-éà]+)$

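This covers the general case of a multi-part name. Note that, because the pattern requires two separator characters, it only matches when there is at least one middle name; a plain two-part name such as "Abc Jkl" does not match.

The block [\w-éà] represents your character set.

The start anchor (^) tells the engine to look for a match at the beginning of the line. Next comes a group that captures characters from your set until something outside the set is found ([^\w-éà]). Then the lazy quantifier .*? consumes as little as possible up to the first occurrence of the next pattern, which is a separator followed by a single word that runs up to the end anchor ($).

The second part covers the case of a single word only: ^([\w-éà]+)$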

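In this example, when the first alternative matches (a multi-part name), group 1 holds the first name and group 2 holds the last name.

When there is only one name, group 3 holds that name.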
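
As a rough illustration of how the three groups can be read in JavaScript, here is a minimal sketch. The helper name `extractFirstLast` is made up for this example and is not part of the answer; the pattern is the one above, used without the `u` flag (in Unicode mode the `\w-é` inside the character class is rejected as an invalid range).

```javascript
// The pattern from the answer, unchanged. Note: no `u` flag.
const nameRe = /^([\w-éà]+)[^\w-éà].*?[^\w-éà]([\w-éà]+)$|^([\w-éà]+)$/;

// Hypothetical helper: returns [first, last], [single], or null.
function extractFirstLast(fullName) {
  const m = nameRe.exec(fullName);
  if (m === null) return null;
  // Group 3 is set only when the single-word alternative matched.
  if (m[3] !== undefined) return [m[3]];
  return [m[1], m[2]];
}

console.log(extractFirstLast('Abc Def Ghi Jkl'));
console.log(extractFirstLast('Aéc-Def Gài Mkl'));
console.log(extractFirstLast('Afd'));
// A two-part name contains only one separator, so this yields null:
console.log(extractFirstLast('Abc Jkl'));
```

If two-part names matter, making the middle section optional, for example `(?:.*?[^\w-éà])?`, is one possible adjustment.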

Answer 1: (score: 1)

I wouldn't recommend regular expressions for this; it seems easier to use String.prototype.split(), Array.prototype.shift() and Array.prototype.forEach():

function firstAndLast(el) {
  // getting the text of the element:
  var haystack = el.textContent,
    // splitting that text on white-space sequences,
    // forming an array:
    names = haystack.split(/\s+/),
    // getting the first element of that array:
    first = names.shift(),
    // initialising the 'last' variable to an empty string:
    last = '';
  // if the names array still has at least one entry after the
  // first name was shifted off (there is more than one name):
  if (names.length > 0) {
    // last is assigned the last element of the array of names:
    last = names.pop();
  }

  // return an array containing the first and last names:
  return [first, last];
}

// getting all the <li> elements in the document:
var listItems = document.querySelectorAll('li'),
  // creating an empty <span> element:
  span = document.createElement('span'),
  // an uninitialised variable for use within the loop:
  clone;

// iterating over each of the <li> elements, using
// Array.prototype.forEach(), and Function.prototype.call():
Array.prototype.forEach.call(listItems, function(li) {
  // cloning the created <span>:
  clone = span.cloneNode();
  // setting the clone's text to the joined-together
  // strings from the Array returned by the function:
  clone.textContent = firstAndLast(li).join(' ');
  // appending that cloned created-<span> to the
  // current <li> element over which we're iterating:
  li.appendChild(clone);
});

li span::before {
  content: ' found: ';
  color: #999;
}
li span {
  color: #f90;
  width: 5em;
}

<ol>
  <li>Abc Def Ghi Jkl</li>
  <li>Aéc Def Gài Mkl</li>
  <li>Aéc-Def Gài Mkl</li>
  <li>Aéc Def Gài-Mkl</li>
  <li>Afd</li>
</ol>
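
The same split-based idea can be sketched for plain strings, which makes it usable outside the browser. The function name `firstAndLastFromString` is hypothetical and not part of the answer; it assumes that name parts are separated by white-space only.

```javascript
// Hypothetical string-based variant of firstAndLast(); not from the answer.
function firstAndLastFromString(fullName) {
  // Trim first so leading/trailing white-space doesn't create empty entries.
  const names = fullName.trim().split(/\s+/);
  const first = names[0];
  // With a single name there is no separate last name.
  const last = names.length > 1 ? names[names.length - 1] : '';
  return last ? first + ' ' + last : first;
}

console.log(firstAndLastFromString('Abc Def Ghi Jkl'));
console.log(firstAndLastFromString('Aéc-Def Gài Mkl'));
console.log(firstAndLastFromString('Afd'));
```

Because it never inspects the characters inside a name, accented letters and hyphens need no special handling here.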


It is possible with regular expressions, just needlessly more complicated:

function firstAndLast(el) {
  var haystack = el.textContent,
    // matching a case-insensitive sequence of characters at the
    // start of the string (^), that are in the range a-z,
    // unicode accented characters, an apostrophe or
    // a hyphen (escaped with a back-slash because the '-'
    // character has a special meaning within regular
    // expressions, indicating a range, as above) followed
    // by a word-boundary (\b):
    first = haystack.match(/^[a-z\u00C0-\u017F'\-]+\b/i),

    // as above, but the word-boundary precedes the string
    // of characters, and it matches a sequence at the end
    // of the string ($):
    last = haystack.match(/\b[a-z\u00C0-\u017F'\-]+$/i);

  // if first exists (a non-matching regular expression
  // would return null) and it has a length:
  if (first && first.length) {
    // we assign the first element of the array returned by
    // String.prototype.match() to the 'first' variable:
    first = first[0];
  }
  if (last && last.length) {
    // as above:
    last = last[0];
  }

  // if the first and last variables are exactly equal,
  // we return only the first; otherwise we return both
  // first and last, in both cases within an array:
  return first === last ? [first] : [first, last];
}
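
One caveat with the patterns above: in JavaScript, \b is defined in terms of \w, which covers only ASCII letters, digits and the underscore, so a name that begins or ends with an accented letter can be cut short (the first pattern matches only "Jos" in "José"). The following is a minimal sketch of a variant without \b, assuming an engine that supports Unicode property escapes (ES2018); the function name `firstAndLastUnicode` is made up for illustration.

```javascript
// Hypothetical variant; requires Unicode property escapes (ES2018).
function firstAndLastUnicode(fullName) {
  const text = fullName.trim();
  // \p{L} matches any Unicode letter; ' and - are allowed inside a name.
  const first = text.match(/^[\p{L}'\-]+/u);
  const last = text.match(/[\p{L}'\-]+$/u);
  if (first === null || last === null) return [];
  // If the first match spans the whole string, there is only one name.
  if (first[0].length === text.length) return [first[0]];
  return [first[0], last[0]];
}

console.log(firstAndLastUnicode('José María Émile'));
console.log(firstAndLastUnicode('Aéc Def Gài-Mkl'));
console.log(firstAndLastUnicode('Afd'));
```

Comparing the length of the first match with the whole string, rather than comparing the two matches with each other, keeps a name such as "Abc Def Abc" from being mistaken for a single name.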


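Answer 2: (score: 0)

I would use ^ to match the start of the input, then parentheses (), the special \w character class and the + quantifier to capture the first name. After that come optional spaces or other characters, followed by another pair of parentheses that captures the last name right before the end of the input, which is matched by the special $ character. Here is an example: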

var huge = 'Abc Def Ghi Jkl';
var small = 'Afd';

var regex = /^(\w+).*?(\w*)$/;
var results = regex.exec(huge);

console.log(results[1]); // 'Abc'
console.log(results[2]); // 'Jkl'

var results = regex.exec(small);

console.log(results[1]); // 'Afd'
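
Since \w matches only ASCII letters, digits and the underscore in JavaScript, the pattern above would capture just "A" from "Aéc" and would split "Gài-Mkl" at the hyphen. A minimal sketch of one way around this is to swap \w for \S (any non-white-space character); this substitution and the names `wsRegex` and `firstAndLastByWhitespace` are assumptions made for illustration, not part of the answer.

```javascript
// Hypothetical variant of the pattern above: \S instead of \w.
const wsRegex = /^(\S+).*?(\S*)$/;

function firstAndLastByWhitespace(fullName) {
  // Trim so that trailing white-space doesn't produce an empty last name.
  const m = wsRegex.exec(fullName.trim());
  if (m === null) return [];
  // Group 2 is empty when the input contains a single name only.
  return m[2] === '' ? [m[1]] : [m[1], m[2]];
}

console.log(firstAndLastByWhitespace('Aéc Def Gài-Mkl'));
console.log(firstAndLastByWhitespace('Abc Jkl'));
console.log(firstAndLastByWhitespace('Afd'));
```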

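There are many ways to do what you want, so I suggest you read this page.

Answer 3: (score: 0)

If you pass only one full name to the regex, use /^[^ \n]+|[^ \n]+$/g to get the first and last name. If you pass a list of full names, one per line, use /^[^ \n]+|[^ \n]+$/gm instead; that is, just add m at the end of the regex. You can test it with this link: regex to get first and last name from a full name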
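
A short sketch of both usages with String.prototype.match(). With the gm flags the result is one flat array, so the boundary between one person and the next is lost; the hypothetical helper `firstAndLastPerLine` (not part of the answer) splits the text into lines first and applies the single-name pattern to each.

```javascript
const list = 'Abc Def Ghi Jkl\nAéc-Def Gài Mkl\nAfd';

// Multi-line pattern from the answer: one flat array for all names.
const flat = list.match(/^[^ \n]+|[^ \n]+$/gm);
console.log(flat);

// Hypothetical helper: one [first, last] (or [single]) array per line.
function firstAndLastPerLine(text) {
  return text
    .split('\n')
    .filter((line) => line.length > 0)
    // match() returns null when nothing matches, hence the fallback.
    .map((line) => line.match(/^[^ \n]+|[^ \n]+$/g) || []);
}

console.log(firstAndLastPerLine(list));
```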

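Answer 4: (score: 0)

Keep in mind that a well-structured regex should cover as many of the exceptional cases among the existing examples as possible; it should also be designed so that it can easily be extended in the future. In JS, you can try the following regex: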

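// Group 1 captures the first name (plus a trailing space, if any);
// group 5 captures the last name. Everything in between is dropped.
var re = /^(\w+(-\w+)? ?)((.* )(?!$))?(\w+(-\w+)?)$/;
var strLong = "Abc_Def-John with a Really really_LongName";
var newstrLong = strLong.replace(re, "$1$5");
console.log(newstrLong); // "Abc_Def-John really_LongName"

var strShort = "simplyJohn";
var newstrShort = strShort.replace(re, "$1$5");
console.log(newstrShort); // "simplyJohn"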