I need a regular expression in JavaScript. I have a string:
'*window.some1.some\.2.(a.b + ")" ? cc\.c : d.n [a.b, cc\.c]).some\.3.(this.o.p ? ".mike." [ff\.]).some5'
I want to split this string on periods so that I get an array:
[
'*window',
'some1',
'some\.2', //ignore the . because it's escaped
'(a.b ? cc\.c : d.n [a.b, cc\.c])', //ignore everything inside ()
'some\.3',
'(this.o.p ? ".mike." [ff\.])',
'some5'
]
What regular expression will do this?
Answer 0 (score: 7)
var string = '*window.some1.some\\.2.(a.b + ")" ? cc\\.c : d.n [a.b, cc\\.c]).some\\.3.(this.o.p ? ".mike." [ff\\.]).some5';
var pattern = /(?:\((?:(['"])\)\1|[^)]+?)+\)+|\\\.|[^.]+?)+/g;
var result = string.match(pattern);
result = Array.apply(null, result); //Convert RegExp match to an Array
Fiddle: http://jsfiddle.net/66Zfh/3/
Explanation of the RegExp. It matches a run of consecutive characters satisfying:
/                 Start of RegExp literal
(?:               Create a group without reference (example: say, group A)
  \(              `(` character
  (?:             Create a group without reference (example: say, group B)
    (['"])        ONE `'` OR `"`, group 1, referable through `\1` (inside RE)
    \)            `)` character
    \1            The character as matched at group 1, either `'` or `"`
   |              OR
    [^)]+?        Any non-`)` character, at least once (see below)
  )+              End of group (B). Let this group occur at least once
  \)+             `)` character, at least once
 |                OR
  \\\.            `\.` (escaped backslash and dot, because they're special chars)
 |                OR
  [^.]+?          Any non-`.` character, at least once (see below)
)+                End of group (A). Let this group occur at least once
/g                End of RegExp, global flag
/*Summary: Match everything which is not satisfying the split-by-dot
           condition as specified by the OP*/
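There is a difference between + and +?. A single plus tries to match as many characters as possible, while +? matches only as many characters as are needed for the RegExp to match. Example: on 123, \d+? matches 1 and \d+ matches 123.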
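The greedy versus lazy difference can be checked directly. This is a minimal sketch; the variable names are made up for illustration:

```javascript
var digits = '123';

// Lazy quantifier: stops as soon as the pattern can succeed.
var lazy = digits.match(/\d+?/)[0];

// Greedy quantifier: consumes every consecutive digit.
var greedy = digits.match(/\d+/)[0];

console.log(lazy);   // 1
console.log(greedy); // 123
```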
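Because of the global flag, /g, the String.match method performs a global match. With the g flag, match returns an array made up of all matched substrings.
When the g flag is omitted, only the first match is selected. The array then contains the following elements: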
Index 0: <Whole match>
Index 1: <Group 1>
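A small sketch of how the g flag changes what match returns. The sample string and pattern here are arbitrary and are not taken from the question:

```javascript
var sample = 'a1b2';

// With g: every whole match, capture groups are not reported.
var withG = sample.match(/[a-z](\d)/g);

// Without g: only the first match, followed by its capture groups.
var withoutG = sample.match(/[a-z](\d)/);

console.log(withG);       // [ 'a1', 'b2' ]
console.log(withoutG[0]); // a1  (whole match)
console.log(withoutG[1]); // 1   (group 1)
```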
Answer 1 (score: 3)
The regex below:
result = subject.match(/(?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))/g);
can be used to get the desired result. The results are in group 1, since you want to omit the trailing dot (.).
Use it like this:
var myregexp = /(?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))/g;
var match = myregexp.exec(subject);
while (match != null) {
for (var i = 0; i < match.length; i++) {
// matched text: match[i]
}
match = myregexp.exec(subject);
}
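As a sketch of how the loop above could collect only group 1, the following runs the same pattern against the string from the question. The array name pieces is hypothetical:

```javascript
// Sample input from the question (backslashes doubled for the string literal).
var subject = '*window.some1.some\\.2.(a.b + ")" ? cc\\.c : d.n [a.b, cc\\.c]).some\\.3.(this.o.p ? ".mike." [ff\\.]).some5';
var myregexp = /(?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))/g;

var pieces = []; // hypothetical name for the collected results
var match;
while ((match = myregexp.exec(subject)) !== null) {
  // match[0] includes the trailing dot; match[1] is the piece without it.
  pieces.push(match[1]);
}
console.log(pieces);
```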
Explanation:
// (?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))
//
// Match the regular expression below «(?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))»
// Match the regular expression below and capture its match into backreference number 1 «(\(.*?[^'"]\)|.*?[^\\])»
// Match either the regular expression below (attempting the next alternative only if this one fails) «\(.*?[^'"]\)»
// Match the character “(” literally «\(»
// Match any single character that is not a line break character «.*?»
// Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match a single character NOT present in the list “'"” «[^'"]»
// Match the character “)” literally «\)»
// Or match regular expression number 2 below (the entire group fails if this one fails to match) «.*?[^\\]»
// Match any single character that is not a line break character «.*?»
// Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match any character that is NOT a “A \ character” «[^\\]»
// Match the regular expression below «(?:\.|$)»
// Match either the regular expression below (attempting the next alternative only if this one fails) «\.»
// Match the character “.” literally «\.»
// Or match regular expression number 2 below (the entire group fails if this one fails to match) «$»
// Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
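Answer 2 (score: 2)
Balanced-parenthesis matching with a regular expression is notoriously difficult, especially in JavaScript.
You would be better off writing your own parser. Here is a clever way to do it that still uses the power of regular expressions, based on this pattern:
/(?:(\\.)|([\(\[\{])|([\)\]\}])|(\.))/g
Call string.replace(pattern, function (...)) and, inside the function, keep count of the opening and closing brackets seen so far. This solution takes some work and requires an understanding of closures, and you should read the documentation for string.replace, but I think it is a good way to solve the problem!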
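The idea can be sketched roughly as follows. This is not the answer's own live code. The function name splitOnDots is hypothetical, and the extra alternative that skips quoted strings is an assumption added here, because the sample input contains a ")" inside quotes that would otherwise unbalance the bracket count:

```javascript
// Hypothetical helper: split on dots that are unescaped and outside brackets.
function splitOnDots(input) {
  // Groups: 1 escaped char, 2 quoted string (assumed extension),
  //         3 opening bracket, 4 closing bracket, 5 dot.
  var pattern = /(\\.)|("[^"]*"|'[^']*')|([(\[{])|([)\]}])|(\.)/g;
  var parts = [];
  var depth = 0; // number of brackets currently open
  var start = 0; // index where the current part begins

  input.replace(pattern, function (match, escaped, quoted, open, close, dot, offset) {
    if (open) {
      depth++;
    } else if (close) {
      if (depth > 0) depth--;
    } else if (dot && depth === 0) {
      // A real separator: emit the buffered part and start a new one.
      parts.push(input.slice(start, offset));
      start = offset + 1;
    }
    return match; // the replaced string itself is discarded
  });

  parts.push(input.slice(start)); // text after the last separator
  return parts;
}

var sampleInput = '*window.some1.some\\.2.(a.b + ")" ? cc\\.c : d.n [a.b, cc\\.c]).some\\.3.(this.o.p ? ".mike." [ff\\.]).some5';
console.log(splitOnDots(sampleInput));
```

Here replace is used only as a way to visit each token in order. The closure over depth, start, and parts is what carries state between callbacks.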
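Update:
After noticing how many questions are related to this one, I decided to take on the challenge above.
Here is the live code to use a Regex to split a string.
This code has the following features: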
\
This code works for your example.
Answer 3 (score: 0)
You don't need a complicated regex for this job.
var s = '*window.some1.some\.2.(a.b + ")" ? cc\.c : d.n [a.b, cc\.c]).some\.3.(this.o.p ? ".mike." [ff\.]).some5';
console.log(s.match(/(?:\([^\)]+\)|.*?\.)/g));
Output:
["*window.", "some1.", "some.", "2.", "(a.b + ")", "" ? cc.", "c : d.", "n [a.", "b, cc.", "c]).", "some.", "3.", "(this.o.p ? ".mike." [ff.])", "."]
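Answer 4 (score: 0)
So, I was playing with this, and now I see that @FailedDev is not a failure at all, since that was pretty nice. :)
Anyhow, here is my solution. I will post only the regex.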
((\(.*?((?<!")\)(?!")))|((\\\.)|([^.]))+)
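Sadly, this won't work in your case, because I am using negative lookbehind, which I don't think the JavaScript regex engine supports. It should work as intended in other engines, though, as can be confirmed here: http://gskinner.com/RegExr/. Replace with $1\n.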
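Lookbehind assertions were added to JavaScript in ES2018, so in an engine that implements them the pattern can be tried directly. The following is a sketch under that assumption; the variable names are made up, and on older engines the regex literal is a syntax error:

```javascript
// Requires an engine with ES2018 lookbehind support.
var input = '*window.some1.some\\.2.(a.b + ")" ? cc\\.c : d.n [a.b, cc\\.c]).some\\.3.(this.o.p ? ".mike." [ff\\.]).some5';
var lookbehindPattern = /((\(.*?((?<!")\)(?!")))|((\\\.)|([^.]))+)/g;

// With the g flag, match returns every whole match and ignores the groups.
var tokens = input.match(lookbehindPattern);
console.log(tokens);
```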