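I am trying to split a string on commas that are not inside brackets (i.e. the string contains comma-separated items, but it also contains commas inside brackets, and I don't want to split on those). Like this: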
A='[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'
which should result in:
['[1, "A"]', ' [2, "B"]', ' [3, "C"]', ' [4, "D"]', ' [5, "E"]', ' [6, "F"]', ' [7, "G"]', ' [8, "H"]', ' [9, "I"]', ' [10, "J"]', '[100, "JJ"]']
I tried using a negative lookbehind:
B=re.split(r'(?<![[][\d]),',A)
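However, this does not work when the number inside the brackets has more than one digit, e.g. in the case of [10, "J"]. Any help would be greatly appreciated!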
Answer 0 (score: 1)
It looks like "split on any comma that is preceded by a ]" would work fine. For good measure, I added \s* to consume the whitespace before the next item.
import re
A = '[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'
re.split(r"(?<=]),\s*", A)
gives
['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']
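To see why the lookbehind from the question breaks on multi-digit numbers, here is a minimal side-by-side sketch. The shortened input string and the names `attempt` and `fixed` are only for illustration, and the question's pattern is written as `(?<!\[\d),`, which should be equivalent to `(?<![[][\d]),`. The negative lookbehind only inspects the two characters before the comma, so `10,` is preceded by `10` rather than `[` plus a digit, and the comma is split on.

```python
import re

# Shortened version of the string from the question (illustration only)
A = '[1, "A"], [2, "B"], [10, "J"], [100, "JJ"]'

# The question's approach: skip commas preceded by "[" and exactly one digit
attempt = re.split(r"(?<!\[\d),", A)

# This answer's approach: split only on commas that directly follow "]"
fixed = re.split(r"(?<=\]),\s*", A)

print(attempt)  # the [10, "J"] and [100, "JJ"] items get cut in two
print(fixed)    # each bracketed item stays whole
```

The positive lookbehind does not depend on how many digits are inside the brackets, only on the closing `]`.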
Answer 1 (score: 1)
You can try this:
A='[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'
import re
data = re.split(r'(?<=\]),\s', A)
Output:
['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']
Answer 2 (score: 0)
If using split is not a requirement, findall also works with a very simple expression:
In [27]: re.findall(r'\[.+?\]', A)
Out[27]:
['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']
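One thing that may be worth checking if your real data can contain empty items: `.+?` needs at least one character, so an empty `[]` gets merged with the item after it. The sketch below uses a made-up input string `s` (not from the question) and compares the lazy pattern with a negated character class, which is one possible way around that.

```python
import re

# Hypothetical input with an empty item; not taken from the question
s = '[], [1, "A"], [10, "J"]'

# Lazy dot: requires at least one character between the brackets
lazy = re.findall(r'\[.+?\]', s)

# Negated class: zero or more characters that are not a closing bracket
strict = re.findall(r'\[[^\]]*\]', s)

print(lazy)
print(strict)
```

For the string in the question both patterns should give the same list, since none of its items are empty.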
Answer 3 (score: 0)
Answer 4 (score: 0)
With the newer regex module, you can use:
\[[^][]*\](*SKIP)(*FAIL)  # discard anything in square brackets
|                         # or
,\s*                      # match a comma and any following whitespace
In Python, this looks like:
import regex as re
A='[1, "A"], [2, "B"], [3, "C"], [4, "D"], [5, "E"], [6, "F"], [7, "G"], [8, "H"], [9, "I"], [10, "J"], [100, "JJ"]'
rx = re.compile(r'\[[^][]*\](*SKIP)(*FAIL)|,\s*')
print(rx.split(A))
# ['[1, "A"]', '[2, "B"]', '[3, "C"]', '[4, "D"]', '[5, "E"]', '[6, "F"]', '[7, "G"]', '[8, "H"]', '[9, "I"]', '[10, "J"]', '[100, "JJ"]']
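If installing the third-party regex module is not an option, the same "ignore commas inside brackets" idea can be approximated with a plain loop that tracks bracket depth. This is only a sketch: `split_top_level` is a hypothetical helper name, and it assumes brackets are balanced and that no brackets or commas of interest appear inside the quoted strings.

```python
def split_top_level(text, sep=","):
    """Split text on sep, but only where we are outside all [...] pairs."""
    parts = []
    depth = 0   # how many "[" are currently open
    start = 0   # index where the current item begins
    for i, ch in enumerate(text):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == sep and depth == 0:
            # Top-level separator: close the current item
            parts.append(text[start:i].strip())
            start = i + 1
    # Whatever is left after the last separator is the final item
    parts.append(text[start:].strip())
    return parts


A = '[1, "A"], [2, "B"], [10, "J"], [100, "JJ"]'
print(split_top_level(A))
```

Unlike the single-level patterns above, a depth counter also keeps nested items such as `[1, [2, 3]]` in one piece.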