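How do I match a Python function definition (and nothing else) with a RegEx?

Asked: 2013-03-01 12:37:16

Tags: python regex

I am trying to use a RegEx in Python to parse out a function definition and nothing else, but I keep running into problems. Is a RegEx the right tool to use here?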

def foo():
  print bar
-- Matches --

a = 2
def foo():
  print bar
-- Doesn't match as there's code above the def --

def foo():
  print bar
a = 2
-- Doesn't match as there's code below the def --

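An example of a string I am trying to parse is "def isPalindrome(x):\n return x == x[::-1]". In practice, though, the string may contain lines above or below the def itself.

What RegEx would I have to use to achieve this?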

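2 answers:

Answer 0 (score: 6)

No, regular expressions are not the right tool for this job. This is similar to people desperately trying to parse HTML with regular expressions. These languages are not regular, so you cannot work around all the quirks you will run into.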
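To illustrate one such quirk, here is a minimal sketch, assuming Python 3. The `SOURCE` string and the `naive` pattern are made up for illustration. A line-anchored regex cannot tell a real def from one that merely appears inside a triple-quoted string, whereas a parser can:

```python
import ast
import re

# Hypothetical sample: one def lives inside a string literal, one is real.
SOURCE = '''
text = """
def not_real(x):
    return x
"""

def real(x):
    return x
'''

# A naive pattern: any line that starts with "def name(".
naive = re.compile(r'^def (\w+)\(', re.MULTILINE)
regex_names = naive.findall(SOURCE)

# The parser only reports definitions that are actually code.
ast_names = [node.name
             for node in ast.walk(ast.parse(SOURCE))
             if isinstance(node, ast.FunctionDef)]

print(regex_names)  # ['not_real', 'real']
print(ast_names)    # ['real']
```

The regex could be patched for this particular case, but every patch tends to expose another one (nested quotes, line continuations, decorators), which is the point made above.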

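Use the built-in parser module to build a parse tree, look for the definition nodes, and use those instead. Better still, use the ast module, which is much more convenient (the parser module was also deprecated in Python 3.9 and removed in 3.10). An example:

import ast

mdef = 'def foo(x): return 2*x'
a = ast.parse(mdef)
# Collect every function definition node in the tree, nested ones included.
definitions = [n for n in ast.walk(a) if isinstance(n, ast.FunctionDef)]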
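Building on that, the check the question asks for (a function definition and nothing else) could be sketched as follows. This assumes Python 3, and `is_only_function_def` is a hypothetical helper name, not part of the ast module:

```python
import ast

def is_only_function_def(source):
    """Return True if `source` is exactly one top-level def and nothing else."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Not valid Python at all, so certainly not a lone definition.
        return False
    return len(tree.body) == 1 and isinstance(tree.body[0], ast.FunctionDef)

print(is_only_function_def("def isPalindrome(x):\n return x == x[::-1]"))  # True
print(is_only_function_def("a = 2\ndef foo():\n  return 1"))               # False
print(is_only_function_def("def foo():\n  return 1\na = 2"))               # False
```

Note that comments and blank lines around the def do not produce AST nodes, so this sketch would still accept them. If `async def` should count as well, `ast.AsyncFunctionDef` would need to be added to the `isinstance` check.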

Answer 1 (score: 1)

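import re

# Raw strings keep the regex escapes (\w, \2, \3) away from Python's own
# string-escape processing.
reg = re.compile(r'((^ *)def \w+\(.*?\): *\r?\n'
                 r'(?: *\r?\n)*'
                 r'\2( +)[^ ].*\r?\n'
                 r'(?: *\r?\n)*'
                 r'(\2\3.*\r?\n(?: *\r?\n)*)*)',
                 re.MULTILINE)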
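As a rough sketch of how this pattern might be applied to the three cases in the question, assuming Python 3: the pattern is restated with raw strings so the block is self-contained, and `is_only_def` is a hypothetical helper. Because the pattern expects every line to end with a newline, the helper appends one when it is missing, then uses `fullmatch` so that any code above or below the def makes the match fail:

```python
import re

reg = re.compile(r'((^ *)def \w+\(.*?\): *\r?\n'
                 r'(?: *\r?\n)*'
                 r'\2( +)[^ ].*\r?\n'
                 r'(?: *\r?\n)*'
                 r'(\2\3.*\r?\n(?: *\r?\n)*)*)',
                 re.MULTILINE)

def is_only_def(text):
    """Return True if the whole text is one def block (layout check only)."""
    if not text.endswith('\n'):
        text += '\n'  # the pattern requires a newline after the last body line
    return reg.fullmatch(text) is not None

print(is_only_def("def isPalindrome(x):\n return x == x[::-1]"))  # True
print(is_only_def("a = 2\ndef foo():\n  return 1"))               # False
print(is_only_def("def foo():\n  return 1\na = 2"))               # False
```

This only checks indentation layout, not whether the text is valid Python, and it does not cover a def whose body sits on the same line as the header.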

Edit

import re
script = '''
def foo():
  print bar

a = 2
def foot():
  print bar

b = 10
"""
opopo =457
def foor(x):


  print bar
  print x + 10
  def g(u):
    print

  def h(rt,o):
    assert(rt==12)
a = 2
class AZERT(object):
   pass
"""


b = 10
def tabulae(x):


\tprint bar
\tprint x + 10
\tdef g(u):
\t\tprint

\tdef h(rt,o):
\t\tassert(rt==12)
a = 2


class Z:
    def inzide(x):


      print baracuda
      print x + 10
      def gululu(u):
        print

      def hortense(rt,o):
        assert(rt==12)



def oneline(x): return 2*x


def scroutchibi(h%,n():245sqfg srot b#

'''

reg = re.compile('((?:^[ \t]*)def \w+\(.*\): *(?=.*?[^ \t\n]).*\r?\n)'
                 '|'
                 '((^[ \t]*)def \w+\(.*\): *\r?\n'
                 '(?:[ \t]*\r?\n)*'
                 '\\3([ \t]+)[^ \t].*\r?\n'
                 '(?:[ \t]*\r?\n)*'
                 '(\\3\\4.*\r?\n(?: *\r?\n)*)*)',
                 re.MULTILINE)

regcom = re.compile('("""|\'\'\')(.+?)\\1',re.DOTALL)


avoided_spans = [ma.span(2) for ma in regcom.finditer(script)]

print 'eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee'
for ma in  reg.finditer(script):
    print ma.group(),
    print '--------------------'
    print repr(ma.group())
    print
    try:
        exec(ma.group().strip())
    except:
        print "   isn't a valid definition of a function"
    am,bm = ma.span()
    if any(a<=am<=bm<=b for a,b in avoided_spans):
        print '   is a commented definition function' 

    print 'eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee'

Result

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def foo():
  print bar

--------------------
'def foo():\n  print bar\n\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def foot():
  print bar

--------------------
'def foot():\n  print bar\n\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def foor(x):


  print bar
  print x + 10
  def g(u):
    print

  def h(rt,o):
    assert(rt==12)
--------------------
'def foor(x):\n\n\n  print bar\n  print x + 10\n  def g(u):\n    print\n\n  def h(rt,o):\n    assert(rt==12)\n'

   is a commented definition function
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def tabulae(x):


    print bar
    print x + 10
    def g(u):
        print

    def h(rt,o):
        assert(rt==12)
--------------------
'def tabulae(x):\n\n\n\tprint bar\n\tprint x + 10\n\tdef g(u):\n\t\tprint\n\n\tdef h(rt,o):\n\t\tassert(rt==12)\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
    def inzide(x):


      print baracuda
      print x + 10
      def gululu(u):
        print

      def hortense(rt,o):
        assert(rt==12)



--------------------
'    def inzide(x):\n\n\n      print baracuda\n      print x + 10\n      def gululu(u):\n        print\n\n      def hortense(rt,o):\n        assert(rt==12)\n\n\n\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def oneline(x): return 2*x
--------------------
'def oneline(x): return 2*x\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def scroutchibi(h%,n():245sqfg srot b#
--------------------
'def scroutchibi(h%,n():245sqfg srot b#\n'

   isn't a valid definition of a function
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
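For comparison, a similar extraction could be sketched with the ast module. This assumes Python 3.8 or later, where `ast.get_source_segment` is available. `SOURCE` is a small made-up sample in Python 3 syntax, not the script above, and `function_sources` is a hypothetical helper:

```python
import ast

# Hypothetical sample with a nested def, a method, and a one-line def.
SOURCE = '''\
b = 10
def outer(x):
    def inner(u):
        return u
    return inner(x) + 10

class Z:
    def method(self):
        return 1

def oneline(x): return 2 * x
'''

def function_sources(source):
    """Map each function name to its source text, nested defs included."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

sources = function_sources(SOURCE)
for name, text in sources.items():
    print(name)
    print(repr(text))
```

Unlike the regex version, a def that only appears inside a triple-quoted string is never reported, and invalid input raises `SyntaxError` at parse time instead of being matched.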