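I have a string coming into a func, which could be:
I am a "somevalue"
or
I am a "somevalue" of "anothervalue"
In each case I need to recognize the 'I am a' part and then return the value inside the quotes, or both values if there are two. There are several ways to do this, but I am looking for the most efficient one.
Interested to hear back from anyone with an opinion on this - thanks!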
Answer 0 (score: 2)
Since your format is constant, you can match and capture in the same regular expression.
var str1 = 'I am a "somevalue" of "anothervalue"',
    str2 = 'I am a "somevalue"',
    str3 = 'I am a "value with \\"escaped\\" quotes"',
    // Group 1 is the first quoted value; group 2 is the optional value after ' of '.
    regex = /^I am a "((?:\\"|[^"])*)"(?: of "((?:\\"|[^"])*)")?/;

function match(str) {
    var matches = str.match(regex);
    if (matches !== null) {
        // Drop index 0 (the full match) and keep only the captured groups.
        console.log(matches.slice(1));
    }
}

match(str1); // ["somevalue", "anothervalue"]
match(str2); // ["somevalue", undefined]
match(str3); // ["value with \"escaped\" quotes", undefined]
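The slice call removes the first element of the match array, which contains the entire matched string. If the optional second part is absent, you get undefined as the second captured value.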
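If the caller needs the values returned rather than logged, the same regex can be wrapped as follows. This is only a sketch: the names QUOTED_RE and extractValues are made up for illustration, and returning null for non-matching input is an assumption about what the caller wants.

```javascript
// Same pattern as in the answer; QUOTED_RE is a hypothetical name.
var QUOTED_RE = /^I am a "((?:\\"|[^"])*)"(?: of "((?:\\"|[^"])*)")?/;

// Hypothetical helper: returns an array with one or two values,
// or null when the input does not start with: I am a "..."
function extractValues(str) {
  var matches = str.match(QUOTED_RE);
  if (matches === null) {
    return null;
  }
  // Drop the full match, then drop the optional group if it did not participate.
  return matches.slice(1).filter(function (value) {
    return value !== undefined;
  });
}

console.log(extractValues('I am a "somevalue" of "anothervalue"'));
console.log(extractValues('I am a "somevalue"'));
console.log(extractValues('something else'));
```

Filtering out undefined means the length of the returned array tells the caller whether one or two values were present.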
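Depending on how quotes inside the quoted values are escaped, you may need to modify the regex slightly. I assumed \ would be the escape character.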
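As an illustration of such a modification, here is a sketch for input that escapes a quote by doubling it ("") instead of using a backslash. That escaping convention is purely an assumption for this example, and doubledQuoteRegex and matchDoubled are hypothetical names.

```javascript
// Assumption: a literal quote inside a value is written as "" rather than \".
var doubledQuoteRegex = /^I am a "((?:""|[^"])*)"(?: of "((?:""|[^"])*)")?/;

function matchDoubled(str) {
  var matches = str.match(doubledQuoteRegex);
  if (matches === null) {
    return null;
  }
  // Collapse each "" back to a single " in the captured values.
  return matches.slice(1).map(function (value) {
    return value === undefined ? undefined : value.replace(/""/g, '"');
  });
}

console.log(matchDoubled('I am a "say ""hi"" now"'));
```

Only the alternative inside the non-capturing group changes; the rest of the pattern stays the same.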
NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  I am a "                 'I am a "'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
--------------------------------------------------------------------------------
      \\                       '\'
--------------------------------------------------------------------------------
      "                        '"'
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      [^"]                     any character except: '"'
--------------------------------------------------------------------------------
    )*                       end of grouping
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
     of "                     ' of "'
--------------------------------------------------------------------------------
    (                        group and capture to \2:
--------------------------------------------------------------------------------
      (?:                      group, but do not capture (0 or more
                               times (matching the most amount
                               possible)):
--------------------------------------------------------------------------------
        \\                       '\'
--------------------------------------------------------------------------------
        "                        '"'
--------------------------------------------------------------------------------
       |                        OR
--------------------------------------------------------------------------------
        [^"]                     any character except: '"'
--------------------------------------------------------------------------------
      )*                       end of grouping
--------------------------------------------------------------------------------
    )                        end of \2
--------------------------------------------------------------------------------
    "                        '"'
--------------------------------------------------------------------------------
  )?                       end of grouping
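That said, this solution is only suitable for certain values of "high usage". If we are talking about millions of queries at a very high rate, you would be better off using a technique better suited to parsing text (and probably not in JavaScript/Node).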
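For comparison, a regex-free approach could look like the hand-written scanner below. It is a rough sketch, not the technique the answer has in mind: scanValues and readQuoted are hypothetical names, it stays in JavaScript only to match the rest of this thread, and whether it is actually faster than the regex would have to be measured on real input.

```javascript
var PREFIX = 'I am a "';
var JOINER = ' of "';

// Reads a quoted value starting just after its opening quote.
// Returns { value, next } or null if no closing quote is found.
function readQuoted(str, start) {
  var i = start;
  while (i < str.length) {
    var ch = str.charAt(i);
    if (ch === '\\' && str.charAt(i + 1) === '"') {
      i += 2; // skip an escaped quote (assumes \ is the escape character)
    } else if (ch === '"') {
      return { value: str.slice(start, i), next: i + 1 };
    } else {
      i += 1;
    }
  }
  return null;
}

// Returns one or two values, or null if the input does not fit the format.
function scanValues(str) {
  if (str.slice(0, PREFIX.length) !== PREFIX) {
    return null;
  }
  var first = readQuoted(str, PREFIX.length);
  if (first === null) {
    return null;
  }
  var values = [first.value];
  if (str.slice(first.next, first.next + JOINER.length) === JOINER) {
    var second = readQuoted(str, first.next + JOINER.length);
    if (second !== null) {
      values.push(second.value);
    }
  }
  return values;
}

console.log(scanValues('I am a "somevalue" of "anothervalue"'));
```

On well-formed input this is meant to give the same values as the regex. It can differ on malformed input, for example a value whose closing quote is missing, because the regex engine backtracks and the scanner does not.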