Splitting a string around a number with a regular expression

Time: 2019-02-20 16:01:15

Tags: javascript node.js regex

I have the following list of strings:

Client Potential XSS2Medium
Client HTML5 Insecure Storage41Medium
Client Potential DOM Open Redirect12Low

I want to split each string into three strings, for example:

["Client Potential XSS", "2", "Medium"]

I am using the following regex:

/[a-zA-Z ]+|[0-9]+/g

But it obviously does not work for strings that contain other numbers. For example:

Client HTML5 Insecure Storage41Medium

results in:

["Client HTML", "5", " Insecure Storage", "41", "Medium"]

I can't find a regex that produces:

["Client HTML5 Insecure Storage", "41", "Medium"]

This regex works on regex101.com:

(.+[ \t][A-z]+)+([0-9]+)+([A-z]+)

But using it in my code:

data.substring(startIndex, endIndex)
        .split("\r\n") // Split the vulnerabilities
        .filter(item => !item.match(/(-+)Page \([0-9]+\) Break(-+)/g) // Remove page break
          && !item.match(/PAGE [0-9]+ OF [0-9]+/g) // Remove pagination
          && item !== '') // Remove blank strings
        .map(v => v.match(/(.+[ \t][A-z]+)+([0-9]+)+([A-z]+)/g));

does not work.

Any help would be greatly appreciated!

Edit: All strings end with High, Medium, or Low.

6 answers:

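Answer 0 (score: 2)

The problem is your g (global) flag. With it, match() returns only the full matches and drops the capture groups.

Remove that flag from the line .map(v => v.match(/(.+[ \t][A-z]+)+([0-9]+)+([A-z]+)/g)); so that it becomes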

.map(v => v.match(/(.+[ \t][A-z]+)+([0-9]+)+([A-z]+)/));


You can also simplify the regex, as @bhmahler has shown:

.map(v => v.match(/(.*?)(\d+)(low|medium|high)/i));
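
To make the effect of the flag concrete, here is a minimal sketch that runs the simplified regex on one of the sample strings with and without g. The variable names are only for illustration.

```javascript
const line = "Client HTML5 Insecure Storage41Medium";

// With "g", match() returns only the full matches; capture groups are dropped.
const withGlobal = line.match(/(.*?)(\d+)(low|medium|high)/gi);

// Without "g", match() returns the full match followed by each capture group.
const withoutGlobal = line.match(/(.*?)(\d+)(low|medium|high)/i);

// slice(1) drops the full match and keeps only the three groups.
const parts = withoutGlobal.slice(1);

console.log(withGlobal); // [ 'Client HTML5 Insecure Storage41Medium' ]
console.log(parts);      // [ 'Client HTML5 Insecure Storage', '41', 'Medium' ]
```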

Answer 1 (score: 1)

The following regex will give you what you need.

/(.*?)(\d+)(low|medium|high)/gi

Here is an example: https://regex101.com/r/AS9mvf/1

And here is an example using map:

var entries = [
  'Client Potential XSS2Medium',
  'Client HTML5 Insecure Storage41Medium',
  'Client Potential DOM Open Redirect12Low'
];

var matches = entries.map(v => {
  var result = /(.*?)(\d+)(low|medium|high)/gi.exec(v);
  return [
    result[1],
    result[2],
    result[3]
  ];
});

console.log(matches);
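
One caveat with this approach: exec() returns null for a line that does not fit the pattern, so result[1] would throw. A minimal sketch of a guarded variant follows, assuming such lines should simply be skipped. The helper name splitEntry and the non-matching sample line are made up for illustration, and the pattern is anchored with ^ and $ as an extra assumption.

```javascript
// Hypothetical helper: returns [name, count, severity], or null if the line does not fit.
function splitEntry(line) {
  const result = /^(.*?)(\d+)(low|medium|high)$/i.exec(line);
  return result ? [result[1], result[2], result[3]] : null;
}

const rows = [
  "Client HTML5 Insecure Storage41Medium",
  "PAGE 3 OF 10" // made-up line with no severity suffix
].map(splitEntry).filter(row => row !== null);

console.log(rows); // [ [ 'Client HTML5 Insecure Storage', '41', 'Medium' ] ]
```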

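Answer 2 (score: 0)

You could use a workaround: match the parts to leave alone without capturing them, capture the numbers you want, then replace and split: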

let strings = ['Client Potential XSS2Medium', 'Client HTML5 Insecure Storage41Medium', 'Client Potential DOM Open Redirect12Low', 'Client HTML5 Insecure Storage41Medium'];

let regex = /(?:HTML5|or_other_string)|(\d+)/g;

strings.forEach(function(string) {
    string = string.replace(regex, function(match, g1) {
        if (typeof(g1) != "undefined") {
            return "#@#" + g1 + "#@#";
        }
        return match;
    });
    string = string.split("#@#");
    console.log(string);
});

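See also the demo on regex101.com

Answer 3 (score: 0)

Here is a solution that uses String.replace() to wrap the number before High, Low, or Medium in a custom token, and then splits the resulting string on that token: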

const inputs = [
  "Client Potential XSS2High",
  "Client HTML5 Insecure Storage41Medium",
  "Client Potential DOM Open Redirect12Low"
];

let token = "-#-";
let regexp = /(\d+)(High|Low|Medium)$/;

let res = inputs.map(
    x => x.replace(regexp, `${token}$1${token}$2`).split(token)
);

console.log(res);

Another solution is to use the following regex: /^(.*?)(\d+)(High|Low|Medium)$/i

const inputs = [
  "Client Potential XSS2High",
  "Client HTML5 Insecure Storage41Medium",
  "Client Potential DOM Open Redirect12Low"
];

let regexp = /^(.*?)(\d+)(High|Low|Medium)$/i;

let res = inputs.map(
    x => x.match(regexp).slice(1)
);

console.log(res);

Answer 4 (score: 0)

let arr = ["Client Potential XSS2Medium",
"Client HTML5 Insecure Storage41Medium",
"Client Potential DOM Open Redirect12Low"];    

let re = /^.+[a-zA-Z](?=\d+)|\d+(?=[A-Z])|[^\d]+\w+$/g;


arr.forEach(str => console.log(str.match(re)))

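^.+[a-zA-Z](?=\d+) matches from the start of the string up to a letter (a-zA-Z) that is followed by one or more digits

\d+(?=[A-Z]) matches one or more digits that are followed by an uppercase letter

[^\d]+\w+$ matches one or more non-digit characters, then word characters up to the end of the string

Answer 5 (score: 0)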
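
As a rough illustration of how the three alternatives cooperate, the sketch below lists where each piece starts. It relies on String.prototype.matchAll, and the variable names are only for illustration. It also suggests that the third alternative yields just "Medium" only because the global scan resumes after the digits; run on its own from the start of the string, it matches a longer tail.

```javascript
const re = /^.+[a-zA-Z](?=\d+)|\d+(?=[A-Z])|[^\d]+\w+$/g;
const sample = "Client HTML5 Insecure Storage41Medium";

// matchAll yields one match object per piece, including its start index.
const pieces = [...sample.matchAll(re)].map(m => ({ index: m.index, text: m[0] }));

// The third alternative alone, searched from the start of the string.
const thirdAlone = sample.match(/[^\d]+\w+$/)[0];

console.log(pieces);
console.log(JSON.stringify(thirdAlone)); // " Insecure Storage41Medium"
```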

const text = `Client Potential XSS2Medium
Client HTML5 Insecure Storage41Medium
Client Potential DOM Open Redirect12Low`


const res = text.split("\n").map(el =>  el.replace(/\d+/g, a => ' ' + a + ' ') );

console.log(res)