I have the following code:
import re
vars='"NAME=Product","TYPE=","VERSION=1.2","VISIBILITY=","SRC=","FOLDER=TRUE","Text=foo, containing, commas"'
list = re.findall(r'\w+=".*?"', vars)
for i in range(1,len(list)):
    print(list[i])
The output is:
VISIBILITY=","
SRC=","
But I want to find (split out) all of the parameters, including those that contain commas. The output should look like this:
"NAME=Product"
"TYPE="
"VERSION=1.2"
"VISIBILITY="
"SRC="
"FOLDER=TRUE"
"Text=foo, containing, commas"
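What do I need to change in my regular expression?
Answer 0 (score: 4)
Your regex appears to expect quotes only around the value to the right of the = sign, but your input has quotes around each whole KEY=value expression.
The adjustment is simple: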
re.findall(r'"\w+=.*?"', vars)
Running that on your sample input:
>>> re.findall(r'"\w+=.*?"', vars)
['"NAME=Product"', '"TYPE="', '"VERSION=1.2"', '"VISIBILITY="', '"SRC="', '"FOLDER=TRUE"', '"Text=foo, containing, commas"']
>>> for match in re.findall(r'"\w+=.*?"', vars):
...     print(match)
...
"NAME=Product"
"TYPE="
"VERSION=1.2"
"VISIBILITY="
"SRC="
"FOLDER=TRUE"
"Text=foo, containing, commas"
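To see why moving the quote changes the result, here is a small sketch that runs both patterns on the sample input. The names `params`, `original` and `fixed` are illustrative only and are not from the question, and the snippet uses Python 3 style `print()`.

```python
import re

# Sample input from the question (renamed to avoid shadowing the built-in vars)
params = '"NAME=Product","TYPE=","VERSION=1.2","VISIBILITY=","SRC=","FOLDER=TRUE","Text=foo, containing, commas"'

# Original pattern: needs a quote directly after '=', which only
# happens where the value is empty, so it then runs on to the next quote
original = re.findall(r'\w+=".*?"', params)

# Adjusted pattern: the opening quote comes before the key
fixed = re.findall(r'"\w+=.*?"', params)

print(original)
print(fixed)
```

The first list contains only the three empty-valued entries with a trailing `","`, which matches the unwanted output in the question once the loop skips index 0.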
Answer 1 (score: 3)
I'm not sure you need a regex at all:
[i for i in vars.split('"') if i not in ',']
Out:
['NAME=Product',
'TYPE=',
'VERSION=1.2',
'VISIBILITY=',
'SRC=',
'FOLDER=TRUE',
'Text=foo, containing, commas']
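A variant of the same idea, offered only as a sketch: splitting on `"` alternates between separators and quoted contents, so taking every second piece gives the fields without a membership test. The names `params` and `fields` are illustrative, and this assumes the values never contain an escaped quote.

```python
# Sample input from the question (renamed to avoid shadowing the built-in vars)
params = '"NAME=Product","TYPE=","VERSION=1.2","VISIBILITY=","SRC=","FOLDER=TRUE","Text=foo, containing, commas"'

# Index 0 is the empty string before the first quote; odd indexes hold
# the quoted contents, even indexes hold the ',' separators
fields = params.split('"')[1::2]

print(fields)
```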
Answer 2 (score: 1)
I guess this is closer to what you actually want:
list = re.findall(r'"(\w+)=(.*?)"', vars)
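With two capture groups, `re.findall` returns a list of `(key, value)` tuples rather than whole matches. As a minimal sketch of what that allows, the tuples can be fed straight into `dict`. The names `params`, `pairs` and `settings` are illustrative and not from the answer.

```python
import re

# Sample input from the question (renamed to avoid shadowing the built-in vars)
params = '"NAME=Product","TYPE=","VERSION=1.2","VISIBILITY=","SRC=","FOLDER=TRUE","Text=foo, containing, commas"'

# Each match yields a (key, value) tuple because the pattern has two groups
pairs = re.findall(r'"(\w+)=(.*?)"', params)

# A list of 2-tuples converts directly into a dictionary
settings = dict(pairs)

print(pairs[0])
print(settings['Text'])
```

Empty values such as `TYPE=` come through as empty strings.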
Answer 3 (score: 1)
You can do this with the csv module:
import csv
vars='"NAME=Product","TYPE=","VERSION=1.2","VISIBILITY=","SRC=","FOLDER=TRUE","Text=foo, containing, commas"'
reader=csv.reader(vars,delimiter=",",quotechar='"')
print([''.join(tgt) for tgt in reader if ''.join(tgt)])
Prints:
['NAME=Product', 'TYPE=', 'VERSION=1.2', 'VISIBILITY=', 'SRC=', 'FOLDER=TRUE', 'Text=foo, containing, commas']
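`csv.reader` iterates over whatever it is given and treats each item as one line of input. A sketch of a more conventional call, under that reading, wraps the string in a one-element list so the whole string is parsed as a single row. The names `params` and `row` are illustrative.

```python
import csv

# Sample input from the question (renamed to avoid shadowing the built-in vars)
params = '"NAME=Product","TYPE=","VERSION=1.2","VISIBILITY=","SRC=","FOLDER=TRUE","Text=foo, containing, commas"'

# A one-element list makes the reader see the string as a single line;
# next() pulls out that one parsed row
row = next(csv.reader([params], delimiter=',', quotechar='"'))

print(row)
```

The commas inside the quoted `Text=` value are kept because the field is quoted.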
Answer 4 (score: 0)
import re
vari=('"NAME=Product",'
'"TYPE=","VERSION=1.2",'
'"VISIBILITY=","SRC=","FOLDER=TRUE",'
'"Text=foo, containing, commas"')
print('\n'.join(re.findall('"[^"=]+=[^"=]*"', vari)))