I'm having trouble parsing expressions like the following:
word1, word2[a,b,c], word3, ..., wordN
I would like to get the following groups:
g1: word1
g2: word2[a,b,c]
g3: word3
Note that the [.+] part is optional, and the regular expression must be able to match all of the following expressions:
word1,word2,word3
word1[a,b,c],word2,word3
word1[a,b,c],word2[e,f,g],word3
word1[a,b,c],word2[e,f,g],word3[i,j,l]
I have made a few attempts, but I can't find a way to separate the groups correctly.
Answer 0 (score: 1)
I tried this regular expression on https://regex101.com, pasting the expressions above into the "Test String" box:
/^([a-zA-Z0-9]+(?:\[.*\])?),([a-zA-Z0-9]+(?:\[.*\])?),([a-zA-Z0-9]+(?:\[.*\])?)$/gm
Each word is separated from the next by a comma and has the form:
([a-zA-Z0-9]+(?:\[.*\])?)
Explanation:
(
[a-zA-Z0-9]+   # one or more alphanumeric characters (\w also works, but it additionally matches _)
(?:\[.*\])?    # an optional sequence enclosed in [ ]; (?: ) is a non-capturing group
)
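As a rough Python sketch of the same idea (not the answer's exact pattern): the names ITEM, PATTERN and parse_three are made up for illustration, [^\]]* is swapped in for .* so that a bracketed part cannot run past its own closing bracket, and \s* is an added assumption so that the spaced form "word1, word2[a,b,c], word3" from the question also matches.

```python
import re

# One item: a word, optionally followed by a bracketed list (no nested brackets).
ITEM = r"(\w+(?:\[[^\]]*\])?)"
# Exactly three items separated by commas; \s* tolerates spaces around the commas.
PATTERN = re.compile(rf"{ITEM}\s*,\s*{ITEM}\s*,\s*{ITEM}")

def parse_three(line):
    """Return the three groups as a tuple, or None if the line does not match."""
    m = PATTERN.fullmatch(line.strip())
    return m.groups() if m else None

print(parse_three("word1,word2,word3"))
# ('word1', 'word2', 'word3')
print(parse_three("word1[a,b,c], word2[e,f,g], word3[i,j,l]"))
# ('word1[a,b,c]', 'word2[e,f,g]', 'word3[i,j,l]')
```

Like the original pattern, this only handles exactly three words; a line with a different number of items returns None.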
Answer 1 (score: 1)
This seems to work for now:
import re
rgx = re.compile(r"(\w+(\[.*?\])*).*?,?")
[key for key, val in rgx.findall("word1, word2[a,b,[c,,,]], word,3")]
# The regex first looks for word characters with \w+.
# Then (\[.*?\])* matches any bracketed parts that follow: from a `[` up to the first `]`.
# findall returns one tuple per match, e.g. ('word2[a,b,c]', '[a,b,c]').
# We iterate over the tuples and keep only the first value of each.
Output:
['word1', 'word2[a,b,[c,,,]', 'word', '3']
Another example:
[key for key, val in rgx.findall("word1[bbbb,cccc],word2[bbbb,cccc] ")]
Output:
['word1[bbbb,cccc]', 'word2[bbbb,cccc]']
PS: I'm still improving it.
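In the first example above, the nested item comes back as 'word2[a,b,[c,,,]' without its final ], because the lazy \[.*?\] stops at the first closing bracket. If nested brackets matter, one possible alternative is to drop the regex and track bracket depth by hand. The sketch below is an illustration only; split_top_level is a hypothetical helper name, not part of the answer.

```python
def split_top_level(text):
    """Split on commas that are not inside square brackets (nesting allowed)."""
    items, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth = max(depth - 1, 0)  # ignore stray closing brackets
        if ch == "," and depth == 0:
            # A top-level comma ends the current item.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    return [item for item in items if item]  # drop empty pieces

print(split_top_level("word1, word2[a,b,[c,,,]], word,3"))
# ['word1', 'word2[a,b,[c,,,]]', 'word', '3']
```

Unlike the regex, this keeps the full nested item intact, at the cost of not validating what the words themselves look like.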
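Answer 2 (score: 1)
You can use re.split to split only on the commas that are outside brackets. Such commas can be recognized by the fact that they are never followed by a closing bracket before an opening one (checked with a negative lookahead). This trick only works with non-nested brackets.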
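A minimal sketch of this approach follows. The pattern is an assumption written to match the description, and SEPARATOR and split_outside_brackets are hypothetical names; the \s* parts are added so that spaces around the commas are discarded.

```python
import re

# A comma (plus surrounding whitespace) that is NOT followed by a "]"
# before any "[" appears. With non-nested brackets, such a comma cannot
# be inside a bracket pair.
SEPARATOR = re.compile(r"\s*,\s*(?![^\[\]]*\])")

def split_outside_brackets(text):
    """Split text on commas that lie outside square brackets."""
    return SEPARATOR.split(text.strip())

print(split_outside_brackets("word1, word2[a,b,c], word3"))
# ['word1', 'word2[a,b,c]', 'word3']
print(split_outside_brackets("word1[a,b,c],word2[e,f,g],word3[i,j,l]"))
# ['word1[a,b,c]', 'word2[e,f,g]', 'word3[i,j,l]']
```

Because it splits rather than matches, this works for any number of words, but it does not check that each piece is a valid word.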