Regex only matches a few elements of the list

Asked: 2013-04-23 12:01:06

Tags: python regex python-2.6

Here is a Python code snippet:

import re

VARS='Variables: "OUTPUTFOLDER=installers","SETUP_ORDER=Product 4,Product 4  Library","SUB_CONTENTS=Product 4 Library","SUB_CONTENT_SIZES=9364256","SUB_CONTENT_GROUPS=Product 4 Library","SUB_CONTENT_DESCRIPTIONS=","SUB_CONTENT_GROUP_DESCRIPTIONS=","SUB_DISCS=Product 4,Product Disc",SUB_FILENAMES='
comp = re.findall(r'\w+=".*?"', VARS)

for var in comp:
    print var

This is the current output:

SUB_CONTENT_DESCRIPTIONS="," 
SUB_CONTENT_GROUP_DESCRIPTIONS=","

But I want the output to contain every element, so that it looks like this:

"OUTPUTFOLDER=installers"
"SETUP_ORDER=Product 4, Product 4 Library"
"SUB_CONTENTS=Product 4"
"SUB_CONTENT_SIZES=9364256"
...

What is wrong with my regex pattern?

3 answers:

Answer 0: (score: 1)

Use this regex.

comp = re.findall(r'"\w+=.*?"', VARS)

Result:

>>> 
"OUTPUTFOLDER=installers"
"SETUP_ORDER=Product 4,Product 4  Library"
"SUB_CONTENTS=Product 4 Library"
"SUB_CONTENT_SIZES=9364256"
"SUB_CONTENT_GROUPS=Product 4 Library"
"SUB_CONTENT_DESCRIPTIONS="
"SUB_CONTENT_GROUP_DESCRIPTIONS="
"SUB_DISCS=Product 4,Product Disc"

In my opinion, you could do this in a smarter way and store your variables in a dictionary.

d = dict(var.strip('"').split('=') for var in re.findall(r'"\w+=.*?"', VARS))

To view the dictionary:

for k, v in d.items():
    print k, '=', (v if v else '<NONE>')

Result (dictionary order is arbitrary):

>>> 
SETUP_ORDER = Product 4,Product 4  Library
SUB_CONTENT_DESCRIPTIONS = <NONE>
SUB_DISCS = Product 4,Product Disc
SUB_CONTENT_GROUPS = Product 4 Library
SUB_CONTENT_SIZES = 9364256
SUB_CONTENT_GROUP_DESCRIPTIONS = <NONE>
OUTPUTFOLDER = installers
SUB_CONTENTS = Product 4 Library
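
One caveat with the dictionary approach: a plain split('=') breaks as soon as a value itself contains an =, because dict() then receives more than two items for that entry. A minimal sketch of a more defensive variant follows, assuming a shortened sample string; the PATH entry and the names sample and settings are hypothetical, added only to illustrate the case. It limits the split to the first = and is written so that it runs on both Python 2.6 and Python 3.

```python
import re

# Shortened sample. The PATH entry is hypothetical: it is not in the
# question's data and only demonstrates a value that contains '='.
sample = ('Variables: "OUTPUTFOLDER=installers",'
          '"SUB_CONTENT_DESCRIPTIONS=",'
          '"PATH=a=b",SUB_FILENAMES=')

settings = {}
for entry in re.findall(r'"\w+=.*?"', sample):
    # Split on the first '=' only, so the value may contain '=' itself.
    name, value = entry.strip('"').split('=', 1)
    settings[name] = value

for name in sorted(settings):
    # Empty values are shown as <NONE>, as in the listing above.
    print(name + ' = ' + (settings[name] or '<NONE>'))
```

Sorting the keys is just a convenience for a stable listing; the dictionary itself does not depend on it.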

Answer 1: (score: 1)

Use this regex:

r'"\w+?=.*?"'

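See for yourself the difference between my regex and yours:

r'"\w+?=.*?"' # mine
r'\w+=".*?"' # yours

The key difference is the position of a single ": it now opens the whole entry, before the name, instead of following the =.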

Output:

>>> regex = re.compile(r'"\w+?=.*?"')
>>> regex.findall(string)
[u'"OUTPUTFOLDER=installers"', u'"SETUP_ORDER=Product 4,Product 4 Library"',  
 u'"SUB_CONTENTS=Product 4 Library"', u'"SUB_CONTENT_SIZES=9364256"',
 u'"SUB_CONTENT_GROUPS=Product 4 Library"', u'"SUB_DISCS=Product 4,Product Disc"']
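
To make the effect of moving the quote concrete, here is a small sketch that runs the patterns side by side on the VARS string from the question. The variable names original, greedy_name and lazy_name are my own. The original pattern requires a " immediately after the =, which only happens where a value is empty, so it matches just the two empty entries (and swallows the following ," along with them). It also suggests that the lazy \w+? is not essential here: \w can never match =, so the lazy and greedy forms should find the same entries.

```python
import re

# The string from the question, split across lines for readability.
VARS = (
    'Variables: "OUTPUTFOLDER=installers",'
    '"SETUP_ORDER=Product 4,Product 4  Library",'
    '"SUB_CONTENTS=Product 4 Library",'
    '"SUB_CONTENT_SIZES=9364256",'
    '"SUB_CONTENT_GROUPS=Product 4 Library",'
    '"SUB_CONTENT_DESCRIPTIONS=",'
    '"SUB_CONTENT_GROUP_DESCRIPTIONS=",'
    '"SUB_DISCS=Product 4,Product Disc",'
    'SUB_FILENAMES='
)

# Quote after '=': only NAME= directly followed by a quote can match.
original = re.findall(r'\w+=".*?"', VARS)

# Quote before the name: every quoted NAME=value entry can match.
greedy_name = re.findall(r'"\w+=.*?"', VARS)
lazy_name = re.findall(r'"\w+?=.*?"', VARS)

print(len(original))             # 2
print(len(greedy_name))          # 8
print(greedy_name == lazy_name)  # True
```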

Answer 2: (score: 1)

You can try this:

comp = re.findall(r'"(.*?)"', VARS)
print comp

Roughly speaking, this non-greedily captures whatever is inside each pair of double quotes.
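
As a possible extension of the same idea, two capture groups can split each entry into name and value in one pass, since re.findall returns a list of tuples when a pattern has more than one group. This is only a sketch on a shortened sample of the question's string; the names sample, inner and pairs are assumptions of mine.

```python
import re

# Shortened sample taken from the question's VARS string.
sample = ('Variables: "OUTPUTFOLDER=installers",'
          '"SUB_CONTENT_DESCRIPTIONS=",'
          '"SUB_DISCS=Product 4,Product Disc",SUB_FILENAMES=')

# One group: everything between each pair of double quotes.
inner = re.findall(r'"(.*?)"', sample)

# Two groups: findall returns (name, value) tuples.
pairs = re.findall(r'"(\w+)=(.*?)"', sample)

print(inner)
for name, value in pairs:
    print(name + ' -> ' + repr(value))
```

The single-group form relies on the quotes being balanced; the two-group form additionally requires a NAME= prefix, so stray quoted text without an = would be skipped.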