Split a string between 2 delimiters and include them in the result

Time: 2020-05-23 18:03:18

Tags: javascript regex

Given the following string...

"Here is my very _special string_ with {different} types of _delimiters_ that might even {repeat a few times}."

...how can I split it into an array using the 2 delimiters ("_", and "{" with "}"), while keeping the delimiters in each element of the array?

The goal is:

[
  "Here is my very ", 
  "_special string_", 
  " with ", 
  "{different}", 
  " types of ", 
  "_delimiters_", 
  " that might even ", 
  "{repeat a few times}", 
  "."
]

My best attempt is:

let myText = "Here is my very _special string_ with {different} types of _delimiters_ that might even {repeat a few times}."

console.log(myText.split(/(?=_|{|})/g))

As you can see, it does not reproduce the desired array.
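
To make the failure concrete, the sketch below runs the same lookahead split and prints the pieces. Because the lookahead only marks positions in front of a delimiter, every closing _ or } ends up at the start of the following piece instead of at the end of the delimited one. The variable name attempt is only illustrative.

```javascript
const myText = "Here is my very _special string_ with {different} types of _delimiters_ that might even {repeat a few times}.";

// Same idea as the attempt above: split at every position that is
// immediately followed by "_", "{" or "}" (braces escaped for clarity).
const attempt = myText.split(/(?=_|\{|\})/);

console.log(attempt);
// The second and third pieces are "_special string" and "_ with ":
// the closing "_" has moved to the front of the next piece.
```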

4 answers:

Answer 0: (score: 5)

You can use

s.split(/(_[^_]*_|{[^{}]*})/).filter(Boolean)

See the regex demo. The whole pattern is wrapped in a capturing group, so every matched substring is kept in the array returned by String#split.

Regex details

  • (_[^_]*_|{[^{}]*}) - Capturing group 1:
    • _[^_]*_ - a _, then 0 or more characters other than _, then a _
    • | - or
    • {[^{}]*} - a {, then 0 or more characters other than { and }, then a }

See the JS demo

var s = "Here is my very _special string_ with {different} types of _delimiters_ that might even {repeat a few times}.";
console.log(s.split(/(_[^_]*_|{[^{}]*})/).filter(Boolean));
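
A note on filter(Boolean): for the sample string it changes nothing, but when a delimited part sits at the very start or end of the input, or two delimited parts touch, split emits empty strings around them. A small sketch with a made-up input (the string "_a_{b}" is an assumption for illustration, not taken from the question):

```javascript
// Same pattern as above, with the braces escaped for readability.
const pattern = /(_[^_]*_|\{[^{}]*\})/;

// Two delimited parts with nothing before, between, or after them.
const raw = "_a_{b}".split(pattern);
console.log(raw);      // → ["", "_a_", "", "{b}", ""]

// Empty strings are falsy, so filter(Boolean) drops them.
const cleaned = raw.filter(Boolean);
console.log(cleaned);  // → ["_a_", "{b}"]
```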

Answer 1: (score: 4)

You can use the following regex. It produces some undefined values in the result, which you can filter out at the end.

let myText = "Here is my very _special string_ with {different} types of _delimiters_ that might even {repeat a few times}."

console.log(myText.split(/(_.+?_)|({.+?})/g).filter(Boolean))
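
The undefined values come from the two separate capturing groups: for every match, split adds one entry per group, and the group that did not take part in the match contributes undefined. A minimal sketch on a shortened, made-up input (the string is an assumption for illustration):

```javascript
const sample = "a _b_ c {d} e";

// Two capturing groups: one for _..._ and one for {...}.
const raw = sample.split(/(_.+?_)|(\{.+?\})/);
console.log(raw);
// → ["a ", "_b_", undefined, " c ", undefined, "{d}", " e"]

// undefined is falsy, so filter(Boolean) removes it
// (it would also remove any empty strings).
const cleaned = raw.filter(Boolean);
console.log(cleaned);
// → ["a ", "_b_", " c ", "{d}", " e"]
```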

Answer 2: (score: 1)

You can split the string on the (zero-width) matches of the following regular expression, and then apply filter(Boolean).

(?<= )(?=[_{])|(?<=[_}])(?=\W)

Note that this regex contains no capturing groups.

Demo

JavaScript's regex engine performs the following operations.

(?<= )     preceding character is a space         (positive lookbehind)
(?=[_{])   following character is '_' or '{'      (positive lookahead) 
|          or 
(?<=[_}])  preceding character is '_' or '}'      (positive lookbehind)
(?=\W)     following character is a non-word char (positive lookahead)

(?=\W) could, for example, match a space or a punctuation character.

I am assuming that braces and underscores are not nested (for example, the string will not contain _a {b} c_ or {a _b_ c}).
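
As a rough sketch, this is how the zero-width split could be applied to the string from the question. It assumes an engine with lookbehind support (ES2018 or later), and the variable names are only illustrative.

```javascript
const myText = "Here is my very _special string_ with {different} types of _delimiters_ that might even {repeat a few times}.";

// Split at positions that are either
//   - after a space and before "_" or "{", or
//   - after "_" or "}" and before a non-word character.
// Nothing is consumed, so the delimiters stay inside the pieces.
const boundary = /(?<= )(?=[_{])|(?<=[_}])(?=\W)/;

const parts = myText.split(boundary).filter(Boolean);
console.log(parts);
```

Note that this relies on every opening delimiter following a space and every closing delimiter preceding a non-word character, which holds for the sample string but not for arbitrary input.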

Answer 3: (score: -1)

I don't like regular expressions, because I don't understand that stuff.

So my solution would be to replace a blank followed by _ with #_, and so on, and then split on #.
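
The answer gives no code, so the following is only one possible reading of it. Replacing " _" literally with "#_" would drop the space, so this sketch inserts the marker next to the space instead. It assumes that "#" never occurs in the text, that an opening _ always follows a space, and that a closing _ always precedes a space; the name marked is hypothetical.

```javascript
const myText = "Here is my very _special string_ with {different} types of _delimiters_ that might even {repeat a few times}.";

// Put a "#" marker in front of each opening delimiter and
// behind each closing delimiter, keeping the surrounding spaces.
const marked = myText
  .replace(/ _/g, " #_")   // opening underscore (preceded by a space)
  .replace(/_ /g, "_# ")   // closing underscore (followed by a space)
  .replace(/\{/g, "#{")    // opening brace
  .replace(/\}/g, "}#");   // closing brace

// Split on the marker; filter(Boolean) guards against empty pieces.
const parts = marked.split("#").filter(Boolean);
console.log(parts);
```

The replace calls still take small regular expressions here only to replace every occurrence; split/join on plain strings (for example myText.split(" _").join(" #_")) would do the same without any regex.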