Regular expression to match the following pattern

Time: 2014-12-23 05:30:35

Tags: python regex

I want to match the following pattern.

RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TextTransApplied:RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);TagTransApplied:(8), ASYNC_JAVASCRIPT(19);

I have the following regular expression in Python.

import re
for ele in re.findall(r"[A-Z]+_[A-Z]+\(\d+\)", str(feed)):
    print(ele)

But this does not match JAVASCRIPT_HTML5_CACHE.

How can I specify multiple words separated by '_', where the words may also contain digits?

2 answers:

Answer 0 (score: 4)

You can use the following regular expression.

[A-Z]+(?:_[A-Z\d]+)+\(\d+\)

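The + repeats the preceding token one or more times, so (?:_[A-Z\d]+)+ matches one or more groups, each made of an underscore followed by further characters. Within each group, [A-Z\d]+ matches one or more uppercase letters or digits.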
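To make the pieces of the pattern easier to see, here is a minimal sketch that writes the same regex in verbose mode with one comment per part. The names TOKEN and sample are only illustrative and are not part of the original answer.

```python
import re

# Same pattern as above, split across lines with re.VERBOSE so that
# each part can carry its own comment.
TOKEN = re.compile(
    r"""
    [A-Z]+          # first word: one or more uppercase letters
    (?:_[A-Z\d]+)+  # one or more further words, each preceded by an underscore
    \(\d+\)         # a number wrapped in literal parentheses
    """,
    re.VERBOSE,
)

# Hypothetical shortened input, taken from the string in the question.
sample = "RENAME_CSS(3), (1), JAVASCRIPT_HTML5_CACHE(19)"
print(TOKEN.findall(sample))
```

The group is non-capturing, (?:...), on purpose: if it were a capturing group, re.findall would return only the text of that group instead of the whole match.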

DEMO

>>> import re
>>> s = "RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TextTransApplied:RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);TagTransApplied:(8), ASYNC_JAVASCRIPT(19);"
>>> for i in re.findall(r'[A-Z]+(?:_[A-Z\d]+)+\(\d+\)', s):
...     print(i)
RENAME_JAVASCRIPT(18)
RENAME_IMAGE(7)
MINIFY_JAVASCRIPT(26)
JAVASCRIPT_HTML5_CACHE(19)
EMBED_JAVASCRIPT(1)
RENAME_CSS(3)
IMAGE_COMPRESSION(7)
RESPONSIVE_IMAGES(6)
ASYNC_JAVASCRIPT(2)
RENAME_JAVASCRIPT(18)
RENAME_IMAGE(7)
MINIFY_JAVASCRIPT(26)
JAVASCRIPT_HTML5_CACHE(19)
EMBED_JAVASCRIPT(1)
RENAME_CSS(3)
IMAGE_COMPRESSION(7)
RESPONSIVE_IMAGES(6)
ASYNC_JAVASCRIPT(2)
ASYNC_JAVASCRIPT(61)
ASYNC_JAVASCRIPT(42)
ASYNC_JAVASCRIPT(19)
>>>
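If the name and the number are needed separately, one possible variation is to add named groups to the same pattern and iterate with re.finditer. This is only a sketch: the names PAIR, parse_counts, name and count are assumptions for illustration and are not part of the original answer.

```python
import re

# Same pattern as in the demo, with the name and the number wrapped in
# named groups so they can be read back individually.
PAIR = re.compile(r"(?P<name>[A-Z]+(?:_[A-Z\d]+)+)\((?P<count>\d+)\)")

def parse_counts(text):
    """Return a list of (name, count) tuples, in order of appearance."""
    return [(m.group("name"), int(m.group("count"))) for m in PAIR.finditer(text)]

# Hypothetical shortened input, taken from the tail of the string in the question.
sample = "TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);"
print(parse_counts(sample))
```

Entries such as (8) that have no name in front of them are skipped, as in the demo above.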

Answer 1 (score: 0)

Try this:

[A-Z]+_[A-Z]+\(\d+\)|[^,]+(?<=\s)J+[^)]+\)