我要匹配以下模式。
RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TextTransApplied:RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);TagTransApplied:(8), ASYNC_JAVASCRIPT(19);
我在python中有如下正则表达式。
for ele in re.findall("[A-Z]+_[A-Z]+\(\d+\)",str(feed)):
print ele
但是这与JAVASCRIPT_HTML5_CACHE不匹配。
如何指定由' _'分隔的多个单词?并且可以包含数字?
答案 0 :(得分:4)
您可以使用以下正则表达式。
[A-Z]+(?:_[A-Z\d]+)+\(\d+\)
+
重复前一个标记一次或多次。 [A-Z\d]+
匹配一个或多个大写字母或数字。
>>> import re
>>> s = "RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TextTransApplied:RENAME_JAVASCRIPT(18), RENAME_IMAGE(7), MINIFY_JAVASCRIPT(26), (1), JAVASCRIPT_HTML5_CACHE(19), EMBED_JAVASCRIPT(1), RENAME_CSS(3), (1), IMAGE_COMPRESSION(7), RESPONSIVE_IMAGES(6), ASYNC_JAVASCRIPT(2);TagTransAttempted:(8), ASYNC_JAVASCRIPT(61);TagTransFailed:ASYNC_JAVASCRIPT(42);TagTransApplied:(8), ASYNC_JAVASCRIPT(19);"
>>> for i in re.findall(r'[A-Z]+(?:_[A-Z\d]+)+\(\d+\)', s):
... print(i)
RENAME_JAVASCRIPT(18)
RENAME_IMAGE(7)
MINIFY_JAVASCRIPT(26)
JAVASCRIPT_HTML5_CACHE(19)
EMBED_JAVASCRIPT(1)
RENAME_CSS(3)
IMAGE_COMPRESSION(7)
RESPONSIVE_IMAGES(6)
ASYNC_JAVASCRIPT(2)
RENAME_JAVASCRIPT(18)
RENAME_IMAGE(7)
MINIFY_JAVASCRIPT(26)
JAVASCRIPT_HTML5_CACHE(19)
EMBED_JAVASCRIPT(1)
RENAME_CSS(3)
IMAGE_COMPRESSION(7)
RESPONSIVE_IMAGES(6)
ASYNC_JAVASCRIPT(2)
ASYNC_JAVASCRIPT(61)
ASYNC_JAVASCRIPT(42)
ASYNC_JAVASCRIPT(19)
>>>
答案 1 :(得分:0)
试试这个
[A-Z]+_[A-Z]+\(\d+\)|[^,]+(?<=\s)J+[^)]+\)