Question

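I have a Perl regular expression (shown here, though understanding the whole thing isn't necessary to answer this question) that contains the \G metacharacter. I'd like to translate it into Python, but Python doesn't appear to support \G. What can I do?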

Answer 1

Try these:

import re
re.sub()
re.findall()
re.finditer()

For example:

# Finds all words of length 3 or 4
s = "the quick brown fox jumped over the lazy dogs."
print(re.findall(r'\b\w{3,4}\b', s))

# prints ['the', 'fox', 'over', 'the', 'lazy', 'dogs']

Answer 2

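You can use re.match to match anchored patterns. re.match only matches at the beginning of the text (position 0), and the match method of a compiled pattern only matches at the position you specify.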

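import re

def match_sequence(pattern, text, pos=0):
  pat = re.compile(pattern)
  # pat.match() anchors at pos, so each match must start where the last one ended
  match = pat.match(text, pos)
  while match:
    yield match
    if match.end() == pos:
      break  # zero-length match: infinite loop otherwise
    pos = match.end()
    match = pat.match(text, pos)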

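This only matches the pattern at the given position, plus any further matches that begin exactly where the previous one ended (0 characters after it).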

>>> for match in match_sequence(r'[^\W\d]+|\d+',"he11o world!"):
...   print(match.group())
...
he
11
o
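
To see how this differs from re.finditer, here is a small sketch that contrasts unanchored scanning with the anchored loop above. The helper name contiguous_tokens is made up for illustration. finditer skips over the space and carries on, whereas anchored matching stops at the first position where the pattern fails, which is roughly the behaviour \G provides in Perl.

```python
import re

TOKEN = re.compile(r'[^\W\d]+|\d+')

def contiguous_tokens(text, pos=0):
    """Return (tokens, stop): tokens matched back to back starting at pos,
    and the index at which contiguous matching stopped."""
    tokens = []
    while True:
        m = TOKEN.match(text, pos)      # anchored at pos, like \G
        if m is None or m.end() == pos:  # no match, or zero-length match
            break
        tokens.append(m.group())
        pos = m.end()
    return tokens, pos

text = "he11o world!"
unanchored = [m.group() for m in TOKEN.finditer(text)]
anchored, stop = contiguous_tokens(text)
print(unanchored)      # ['he', '11', 'o', 'world']
print(anchored, stop)  # ['he', '11', 'o'] 5
```

Returning the stop index is a design choice of this sketch: it lets the caller tell whether the whole input was consumed or where tokenizing broke off.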

Answer 3

Python's regexes don't have the /g modifier, and so they don't have the \G regex token either. A pity, really.

Answer 4

I know I'm late, but here is an alternative to the \G approach:

import re

def replace(match):
    if match.group(0)[0] == '/': return match.group(0)
    else: return '<' + match.group(0) + '>'

source = '''http://a.com http://b.com
//http://etc.'''

pattern = re.compile(r'(?m)^//.*$|http://\S+')
result = re.sub(pattern, replace, source)
print(result)

Output (via Ideone):

<http://a.com> <http://b.com>
//http://etc.

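The idea is to use a regex that matches both kinds of string: a URL or a commented line. You then use a callback (delegate, closure, embedded code, etc.) to find out which one was matched and return the appropriate replacement string.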
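
A variation on the same idea, as a sketch rather than a definitive version: name each alternative and let the callback check match.lastgroup instead of inspecting the first character of the match. The group names comment and url and the function name wrap_urls are chosen here for illustration only.

```python
import re

# One alternative per kind of text; each alternative is a named group.
PATTERN = re.compile(r'(?P<comment>^//.*$)|(?P<url>http://\S+)', re.MULTILINE)

def wrap_urls(match):
    # lastgroup is the name of the alternative that actually matched
    if match.lastgroup == 'comment':
        return match.group(0)            # leave commented lines untouched
    return '<' + match.group(0) + '>'    # wrap bare URLs in angle brackets

source = 'http://a.com http://b.com\n//http://etc.'
result = PATTERN.sub(wrap_urls, source)
print(result)
```

Because the comment alternative consumes the whole line, the URL inside it is never offered to the url alternative.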

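In fact, this is my preferred approach even in flavors that do support \G. Even in Java, where I have to write a bunch of boilerplate code to implement the callback.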

(I'm not a Python person, so forgive me if the code is terribly un-pythonic.)

Answer 5

Don't try to put everything into one expression, as it becomes very hard to read, translate (as you have seen for yourself) and maintain.

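import re

# text_block is assumed to hold the input text; lines starting with '//' are dropped from the result
lines = [re.sub(r'http://[^\s]+', r'<\g<0>>', line) for line in text_block.splitlines() if not line.startswith('//')]
print('\n'.join(lines))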

Python code usually isn't at its best when translated literally from Perl; it has its own programming patterns.
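
If the commented lines should be kept in the output rather than filtered out, the same line-by-line approach might look like the sketch below. The function name bracket_urls and the sample text are assumptions made for illustration.

```python
import re

URL = re.compile(r'http://\S+')

def bracket_urls(text_block):
    """Wrap URLs in angle brackets, passing '//' comment lines through unchanged."""
    out = []
    for line in text_block.splitlines():
        if line.startswith('//'):
            out.append(line)                        # comment line: keep as-is
        else:
            out.append(URL.sub(r'<\g<0>>', line))   # \g<0> is the whole match
    return '\n'.join(out)

sample = 'http://a.com http://b.com\n//http://etc.'
print(bracket_urls(sample))
```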

Do Python regular expressions support something like Perl's \G?
