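Looking for a way to get the word after another word, Python

Time: 2012-06-09 19:33:46

Tags: python string

How can I create a function that gets the word after "class", provided it is not quoted (with single quotes, double quotes, or any triple quotes) and is spelled correctly (for example, it must not match classd())?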

"class hi()"  > hi

"class hi(dff)"  > hi

"class hi   (  dff  )  :"  > hi 

"  class        hi       (  dff  )  :"  > hi 

"class hi"  > hi

"classf hi"  > Nothing

"fclass hi"  > Nothing

"'class hi(dd)'"  > Nothing

'"class hi(dd)"'  > Nothing

"'''class hi(dd)'''"  > Nothing

'"""class hi(dd)"""'  > Nothing

'"""\n\n\n\nclass hi(dd)\n\n\n\n"""'  > Nothing    

"'class' hi()"  > Nothing

Building this with loops is too hard. It would be great if someone could help, thanks. This is very challenging.

4 answers:

Answer 0: (score: 4)

Something like this, maybe?

from io import StringIO
from tokenize import generate_tokens
from token import NAME

def classname(s):
    # tokenize the string; generate_tokens returns a lazy generator
    tokens = generate_tokens(StringIO(s).readline)
    for toknum, tokval, _, _, _ in tokens:
        if toknum == NAME and tokval == 'class':
            # return the text of the token right after the 'class' keyword
            return next(tokens)[1]

print(classname("class hi(29):"))
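
As written, the tokenizer approach returns whatever token follows `class`, even if it is not a name, and it lets tokenizer errors propagate. The following is a minimal sketch of a stricter variant, assuming Python 3; the function name `classname_checked` is hypothetical and not part of the original answer.

```python
from io import StringIO
import tokenize
from token import NAME

def classname_checked(s):
    """Return the identifier right after a `class` keyword token, else None."""
    tokens = tokenize.generate_tokens(StringIO(s).readline)
    try:
        prev_was_class = False
        for tok in tokens:
            if prev_was_class:
                # accept only a real identifier directly after the keyword
                return tok.string if tok.type == NAME else None
            prev_was_class = tok.type == NAME and tok.string == "class"
    except (tokenize.TokenError, SyntaxError):
        # incomplete or malformed input: treat it as "no class name found"
        return None
    return None

print(classname_checked("class hi(29):"))
```

Quoted text arrives as a single STRING token, so a `class` inside quotes never reaches the NAME comparison.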

Answer 1: (score: 3)

import re

def remove(reg, s, multiline=False):
    # DOTALL lets '.' span newlines, which triple-quoted strings need
    flags = re.M | re.DOTALL if multiline else re.M
    s, num = re.subn(reg, "", s, flags=flags)
    return s

def classname(s):
    # strip quoted substrings first: triple-quoted ones, then single-line ones
    s = remove(r'""".*?"""', s, multiline=True)
    s = remove(r"'''.*?'''", s, multiline=True)
    s = remove(r'".*?"', s)
    s = remove(r"'.*?'", s)

    res = re.search(r"(^|\s)class\s+(\w+)", s, flags=re.M)
    # print("*** {} -> {}".format(s, res.groups() if res else None))
    if res is None:
        return None
    else:
        return res.group(2)

I would like to use \b instead of (^|\s), but it doesn't seem to want to work?
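
One likely explanation, offered as a guess about what happened rather than a confirmed diagnosis: in an ordinary (non-raw) Python string literal, `"\b"` is the backspace character, so the regex engine never sees a word-boundary assertion. The sketch below contrasts the two spellings; the names `BACKSPACE_PATTERN`, `BOUNDARY_PATTERN` and `find_class` are hypothetical.

```python
import re

# Non-raw literal: "\b" is chr(8) (backspace), so this pattern looks for
# a literal backspace character followed by "class".
BACKSPACE_PATTERN = re.compile("\bclass\\s+(\\w+)")

# Raw literal: r"\b" reaches the regex engine as a word-boundary assertion.
BOUNDARY_PATTERN = re.compile(r"\bclass\s+(\w+)")

def find_class(pattern, s):
    """Return the first captured group of `pattern` in `s`, or None."""
    m = pattern.search(s)
    return m.group(1) if m else None

print(find_class(BACKSPACE_PATTERN, "class hi"))
print(find_class(BOUNDARY_PATTERN, "class hi"))
```

Note that `\b` is more permissive than `(^|\s)`: it also matches after punctuation, so `x.class hi` would be accepted by the raw-string pattern.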

I also put together the following test code:

tests = [
    ("class hi()", "hi"),
    ("class hi(dff)", "hi"),
    ("class hi   (  dff  )  :", "hi"),
    ("  class        hi       (  dff  )  :", "hi"),
    ("class hi", "hi"),
    ("classf hi", None),
    ("fclass hi", None),
    ("'class hi(dd)'", None),
    ('"class hi(dd)"', None),
    ("'''class hi(dd)'''", None),
    ('"""class hi(dd)"""', None),
    ('"""\n\n\n\nclass hi(dd)\n\n\n\n"""', None),   
    ("'class' hi()", None),
    ("a = ''; class hi(object): pass", "hi")
]

def run_tests(fn, tests=tests):
    for inp,outp in tests:
        res = fn(inp)
        if res == outp:
            print("passed")
        else:
            print("FAILED on {} (gave '{}', should be '{}')".format(inp, repr(res), repr(outp)))

Answer 2: (score: 2)

Use a regular expression:

import re
pattern = re.compile(r"\s*class\s+(\w+)")

For example:

>>> line_to_test = "  class        hi       (  dff  )  :" 
>>> match = pattern.match(line_to_test)
>>> match
<org.python.modules.sre.MatchObject object at 0x3>
>>> match.groups()
('hi',)
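
Because `pattern.match` only matches at the start of the string, a leading quote character is enough to prevent a match, which is why this short pattern handles the single-statement inputs from the question. A small sketch that wraps it for checking, with `classname_simple` as a hypothetical helper name:

```python
import re

PATTERN = re.compile(r"\s*class\s+(\w+)")

def classname_simple(line):
    """Return the name after a leading `class` keyword, or None."""
    m = PATTERN.match(line)  # match() anchors at the beginning of the string
    return m.group(1) if m else None

print(classname_simple("  class        hi       (  dff  )  :"))
print(classname_simple("'class' hi()"))
```

The trade-off is that a `class` statement that does not start the string, such as `a = ''; class hi(object): pass`, is not found.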

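Answer 3: (score: 0)

  1. Remove all substrings enclosed in quotes (that is, ', ", ''' and """).
  2. Use a regular expression to match the expression "class (class name here)".
  3. You may need to adjust the regular expression so that it correctly matches every valid Python identifier for the class name:

    import re
    m = re.match(r"class ([\w]+)", "class hi")
    # group(1) is the captured class name; group(0) would be the whole match "class hi"
    print(m.group(1))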
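
The three steps above can be put together roughly as follows. This is only a sketch under simplifying assumptions: it ignores escaped quotes and comments, and the names `QUOTED`, `CLASS_NAME` and `class_name_after_stripping` are hypothetical.

```python
import re

# Step 1: quoted substrings. Triple-quoted forms come first in the alternation
# so they are consumed whole; DOTALL lets them span newlines.
QUOTED = re.compile(
    r'""".*?"""|\'\'\'.*?\'\'\'|"[^"\n]*"|\'[^\'\n]*\'',
    re.DOTALL,
)

# Steps 2 and 3: `class` not preceded by a word character or a dot, then an
# identifier whose first character is a letter or underscore (not a digit).
CLASS_NAME = re.compile(r"(?<![\w.])class\s+([^\W\d]\w*)")

def class_name_after_stripping(source):
    """Remove quoted substrings, then return the name after `class`, or None."""
    stripped = QUOTED.sub("", source)
    m = CLASS_NAME.search(stripped)
    return m.group(1) if m else None

print(class_name_after_stripping("a = ''; class hi(object): pass"))
```

For anything beyond simple inputs, a real tokenizer is probably more reliable than regex-based quote stripping.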