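Is there a better way to extract information from a string?

Asked: 2019-01-02 03:09:37

Tags: javascript arrays regex

Let's say I have an array of strings and I need specific information from them. Is there an easy way to do this?

Suppose the array looks like this: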

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

Suppose I want to extract the dates and save them into another array. I could write a function like this:

function extractDates(arr)
{
  let dateRegex = /(\d{1,2}\/){2}\d{4}/g, dates = "";
  let dateArr = [];

  for(let i = 0; i<arr.length; i++)
  {
    dates = /(\d{1,2}\/){2}\d{4}/g.exec(arr[i])
    dates.pop();
    dateArr.push(dates);
  }

  return dateArr.flat();
}

Although this works, it is clumsy and requires pop(), because otherwise it returns an array of arrays, i.e. ["12/16/1988", "16/"], and I then need to call flat() afterward.
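For context, the extra "16/" entry comes from the capturing group (\d{1,2}\/): exec() returns the full match at index 0, followed by one entry per capturing group. The following is a minimal sketch, assuming only the full match is wanted, that compares the capturing pattern with a non-capturing (?:...) variant. The names sample, withCapture and withoutCapture are illustrative and not part of the original code.

```javascript
const sample = "1 Ben Howard 12/16/1988 apple";

// Capturing group: index 0 holds the full match, index 1 holds the
// last repetition captured by the group.
const withCapture = /(\d{1,2}\/){2}\d{4}/.exec(sample);

// Non-capturing group (?:...): only the full match is returned.
const withoutCapture = /(?:\d{1,2}\/){2}\d{4}/.exec(sample);

console.log(Array.from(withCapture));    // ["12/16/1988", "16/"]
console.log(Array.from(withoutCapture)); // ["12/16/1988"]
```

With the non-capturing form there is nothing to pop(), although reading index 0 of the result directly works with either pattern.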

Another option would be to take a substring of each string at a given position, for which I still need to know a regex pattern.

function extractDates2(arr)
{
  let dates = [];

  for(let i = 0; i<arr.length; i++)
  {
    let begin = regexIndexOf(arr[i], /(\d{1,2}\/){2}\d{4}/g);
    let end = regexIndexOf(arr[i], /[0-9] /g, begin) + 1;
    dates.push(arr[i].substring(begin, end));
  }

  return dates;
}

Of course, it relies on the following regexIndexOf() function:

function regexIndexOf(str, regex, start = 0)
{
  let indexOf = str.substring(start).search(regex);
  indexOf = (indexOf >= 0) ? (indexOf + start) : -1;
  return indexOf;
}

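Again, this function works, but it seems like far too much effort for extracting something so simple. Is there an easier way to extract the data into an array?

4 Answers:

Answer 0 (score: 21)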

One approach is to use map() over the elements of the array, apply match() to each element, and finally call flat() to get the desired result:

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

const result = infoArr.map(o => o.match(/(\d{1,2}\/){2}\d{4}/g)).flat();

console.log(result);

Alternatively, you can use flatMap():

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

const result = infoArr.flatMap(o => o.match(/(\d{1,2}\/){2}\d{4}/g));

console.log(result);

Additionally, if some strings contain no date and you need to remove the resulting null values from the final array, you can apply filter(), as follows:

const result = infoArr.map(o => o.match(/(\d{1,2}\/){2}\d{4}/g))
                      .flat()
                      .filter(date => date !== null);

const result = infoArr.flatMap(o => o.match(/(\d{1,2}\/){2}\d{4}/g))
                      .filter(date => date !== null);
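As a variation on the filter() step, here is a minimal sketch that assumes dropping non-matching strings is all that is needed: returning an empty array from the flatMap() callback when match() yields null means no null ever reaches the result. The names rows, datePattern and datesOnly are illustrative.

```javascript
const rows = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith orange", // no date in this string
  "3 Andy Bloss 10/25/1956 apple"
];

const datePattern = /(\d{1,2}\/){2}\d{4}/g;

// match() returns null when nothing matches; falling back to []
// lets flatMap() flatten that entry away.
const datesOnly = rows.flatMap(row => row.match(datePattern) || []);

console.log(datesOnly); // ["12/16/1988", "10/25/1956"]
```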

Example with problematic data:

let infoArr = [
  "1 Ben Howard 12/16/1988 apple 10/22/1922",
  "2 James Smith orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/19075 peach",
  "5 Doug Jones 11/10-1975 peach"
];

const result = infoArr.flatMap(o => o.match(/(\d{1,2}\/){2}\d{4}/g))
                      .filter(date => date !== null); /* or filter(date => date) */

console.log(result);
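Note that with this data the pattern still matches "8/20/1907" inside "8/20/19075", because \d{4} is satisfied by the first four digits of the year. The sketch below assumes such partial matches should be rejected and adds \b word boundaries around the pattern. The names messyRows, loose, strict, looseDates and strictDates are illustrative.

```javascript
const messyRows = [
  "1 Ben Howard 12/16/1988 apple 10/22/1922",
  "2 James Smith orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/19075 peach",
  "5 Doug Jones 11/10-1975 peach"
];

// Pattern from the answer: accepts the first four digits of "19075".
const loose = /(\d{1,2}\/){2}\d{4}/g;

// Same pattern with word boundaries, so the year must be exactly four digits.
const strict = /\b(?:\d{1,2}\/){2}\d{4}\b/g;

const looseDates = messyRows.flatMap(row => row.match(loose) || []);
const strictDates = messyRows.flatMap(row => row.match(strict) || []);

console.log(looseDates);  // ["12/16/1988", "10/22/1922", "10/25/1956", "8/20/1907"]
console.log(strictDates); // ["12/16/1988", "10/22/1922", "10/25/1956"]
```

Neither pattern checks that the month and day are in a valid range; that would need a separate step.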

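Alternative without flat():

Since flat() and flatMap() were still marked "experimental" at the time of writing (they have since been standardized in ES2019), could change, and are not supported by some browsers or browser versions, you can use the following alternative, with the limitation that it returns only the first match in each string: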

const infoArr = [
  "1 Ben Howard 12/16/1988 apple 10/22/1922",
  "2 James Smith orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/19075 peach",
  "5 Doug Jones 11/10-1975 peach"
];

const getData = (input, regexp, filterNulls) =>
{
    let res = input.map(o =>
    {
        let matches = o.match(regexp);
        return matches && matches[0];
    });

    return filterNulls ? res.filter(Boolean) : res;
}

console.log(getData(infoArr, /(\d{1,2}\/){2}\d{4}/g, false));
console.log(getData(infoArr, /(\d{1,2}\/){2}\d{4}/g, true));

Answer 1 (score: 18)

One option is to join the strings with a delimiter that the pattern cannot match (for example, a comma), and then perform a global match on the joined string to get an array of dates.
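A minimal sketch of the join-then-match idea, assuming a comma as the delimiter (the date pattern consists only of digits and slashes, so a comma can never be part of a match and no match can span two strings). The `|| []` fallback is an addition for the case where no string contains a date.

```javascript
const infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

// Join everything into one string, then collect every date in a single
// global match. match() returns null if nothing matches, hence || [].
const dates = infoArr.join(",").match(/(\d{1,2}\/){2}\d{4}/g) || [];

console.log(dates);
// ["12/16/1988", "1/10/1999", "10/25/1956", "8/20/1975", "11/10/1975"]
```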

Answer 2 (score: 4)

  

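Quoting the question: "Although this works, it is clumsy and requires pop(), because otherwise it returns an array of arrays, i.e. ["12/16/1988", "16/"], and I then need to call flat() afterward."

The result of the regex exec() method always holds the full match in its 0 property (assuming it matched at all), so you can read that property and push it onto your array: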

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

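function extractDates(arr){
  // No g flag here: a global regex keeps lastIndex between exec() calls,
  // which would miss matches when the same regex is reused on other strings.
  const dateRegex = /(\d{1,2}\/){2}\d{4}/;
  const dateArr = [];
  for (const str of arr){
    const date = dateRegex.exec(str);
    // exec() returns null when nothing matches; index 0 is the full match.
    if (date) dateArr.push(date[0]);
  }
  return dateArr;
}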

console.log(extractDates(infoArr));

(Of course, you can do the same thing in a map callback.)

Answer 3 (score: 2)
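A minimal sketch of that map() variant, assuming strings without a date should yield null in the result so that positions stay aligned with the input. The name firstDates is illustrative.

```javascript
const infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

// Non-global regex, so exec() always searches from the start of each string.
const dateRegex = /(\d{1,2}\/){2}\d{4}/;

const firstDates = infoArr.map(str => {
  const match = dateRegex.exec(str);
  // Index 0 is the full match; null marks a string without a date.
  return match ? match[0] : null;
});

console.log(firstDates);
// ["12/16/1988", "1/10/1999", "10/25/1956", "8/20/1975", "11/10/1975"]
```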

You can use reduce() instead of a loop to pare down the code. Just take care to keep null out of the array when a string has no match.

let infoArr = [
  "1 Ben Howard 12/16/1988 apple",
  "2 James Smith 1/10/1999 orange",
  "3 Andy Bloss 10/25/1956 apple",
  "4 Carrie Walters 8/20/1975 peach",
  "5 Doug Jones 11/10/1975 peach"
];

let regex = /(\d{1,2}\/){2}\d{4}/g;
// match() returns null when nothing matches, so fall back to [] before concat().
let dates = infoArr.reduce((arr, s) => arr.concat(s.match(regex) || []), []);
console.log(dates);