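I am trying to use a regex to correctly extract the method definitions that were generated for COM interfaces. On top of that, some of the definitions are empty, which causes me even more problems.
Basically I have this: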
IXMLSerializerAlt._methods_ = [
COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',
( ['in'], BSTR, 'XML' ),
( ['in'], BSTR, 'TypeName' ),
( ['in'], BSTR, 'TypeNamespaceURI' ),
( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),
]
class EnvironmentManager(CoClass):
u'Singleton object that manages different environments (collections of configuration information).'
_reg_clsid_ = GUID('{8A626D49-5F5E-47D9-9463-0B802E9C4167}')
_idlflags_ = []
_typelib_path_ = typelib_path
_reg_typelib_ = ('{5E1F7BC3-67C5-4AEE-8EC6-C4B73AAC42ED}', 1, 0)
INumberFormat._methods_ = [
]
I want to extract the IXMLSerializerAlt and INumberFormat method definitions, but I can't work out a correct regex. For example, for IXMLSerializerAlt I want to extract this:
IXMLSerializerAlt._methods_ = [
COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',
( ['in'], BSTR, 'XML' ),
( ['in'], BSTR, 'TypeName' ),
( ['in'], BSTR, 'TypeNamespaceURI' ),
( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),
]
It seems to me that this regex should work:
^\w+\._methods_\s=\s\[$
(^.+$)*
^]$
I am using Kodos to check my regex, but I can't find a way to make it work.
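Answer 0: (score: 2)
You are missing the newline characters between the $ and the ^, and you are probably not using the re.MULTILINE flag, which allows those characters to anchor at the start and end of each line. The following (compiled with re.MULTILINE) will match: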
\w+\._methods_\s=\s\[$(?:\n^.+$)*\n^\]$
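A caveat worth checking when the input holds more than one block: the greedy `*` keeps consuming non-empty lines, including lines that consist only of `]`, so it can run on to the last such line. The sketch below compares the pattern above with a lazy `*?` variant. The sample text is condensed from the question, and its indentation is an assumption, since the question shows the lines without any.

```python
import re

# Condensed from the question; the indentation is assumed for illustration.
SOURCE = """IXMLSerializerAlt._methods_ = [
    COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',
              ( ['in'], BSTR, 'XML' ),
              ( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),
]
class EnvironmentManager(CoClass):
    _idlflags_ = []
INumberFormat._methods_ = [
]
"""

# Pattern from the answer: greedy repetition of whole lines.
GREEDY = re.compile(r'^\w+\._methods_\s=\s\[$(?:\n^.+$)*\n^\]$', re.MULTILINE)

# Variant: lazy repetition, so the match stops at the first line that is just "]".
LAZY = re.compile(r'^\w+\._methods_\s=\s\[$(?:\n^.+$)*?\n^\]$', re.MULTILINE)

greedy_blocks = GREEDY.findall(SOURCE)
lazy_blocks = LAZY.findall(SOURCE)

print(len(greedy_blocks))  # one match that spans both definitions
print(len(lazy_blocks))    # one match per definition
for block in lazy_blocks:
    print(block)
    print('---')
```

On this sample the greedy form returns a single match running from the first header to the final `]`, while the lazy form returns the two definitions separately, including the empty INumberFormat one.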
However, here is a slightly simplified regex that also matches your example:
>>> s = '''...\nIXMLSerializerAlt._methods_ = [\n COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',\n ( ['in'], BSTR, 'XML' ),\n ( ['in'], BSTR, 'TypeName' ),\n ( ['in'], BSTR, 'TypeNamespaceURI' ),\n ( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),\n]\n...'''
>>> import re
>>> re.findall(r'^\w+\._methods_\s=\s\[$.*?^\]$', s, re.DOTALL | re.MULTILINE)
["IXMLSerializerAlt._methods_ = [\n COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',\n ( ['in'], BSTR, 'XML' ),\n ( ['in'], BSTR, 'TypeName' ),\n ( ['in'], BSTR, 'TypeNamespaceURI' ),\n ( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),\n]"]
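The question also mentions empty definitions, which the session above does not exercise. As a quick check, the sketch below runs the same simplified pattern over a sample that contains both a populated and an empty `_methods_` list. The sample is condensed from the question with assumed indentation, and the names `TEXT` and `SIMPLE` are illustrative only.

```python
import re

# Condensed from the question; the indentation is assumed for illustration.
TEXT = """IXMLSerializerAlt._methods_ = [
    COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',
              ( ['in'], BSTR, 'XML' ),
              ( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),
]
class EnvironmentManager(CoClass):
    _idlflags_ = []
INumberFormat._methods_ = [
]
"""

# re.DOTALL lets "." cross newlines, so the lazy ".*?" can span several lines;
# re.MULTILINE lets "^" and "$" anchor at each line, so "^\]$" only accepts a
# line that consists of the closing bracket alone.
SIMPLE = re.compile(r'^\w+\._methods_\s=\s\[$.*?^\]$', re.DOTALL | re.MULTILINE)

matches = SIMPLE.findall(TEXT)
for text in matches:
    interface = text.split('.', 1)[0]
    print(interface, len(text.splitlines()))
```

Because the closing bracket must sit alone on its line, the inline `[]` in `_idlflags_ = []` and the brackets inside the COMMETHOD arguments do not end a match early.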
Answer 1: (score: 0)
import re
interface_definitions = '''
IXMLSerializerAlt._methods_ = [
COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',
( ['in'], BSTR, 'XML' ),
( ['in'], BSTR, 'TypeName' ),
( ['in'], BSTR, 'TypeNamespaceURI' ),
( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),
]
class EnvironmentManager(CoClass):
u'Singleton object that manages different environments (collections of configuration information).'
_reg_clsid_ = GUID('{8A626D49-5F5E-47D9-9463-0B802E9C4167}')
_idlflags_ = []
_typelib_path_ = typelib_path
_reg_typelib_ = ('{5E1F7BC3-67C5-4AEE-8EC6-C4B73AAC42ED}', 1, 0)
INumberFormat._methods_ = [
]
'''
RX_METHODS = re.compile(
r'(\w+)\._methods_\s=\s\[('
r'.*?'
r'(?:\[.*?\].*?)*'
r')\]',
re.DOTALL)
# Each match yields (interface name, text between the outer brackets).
for match in RX_METHODS.finditer(interface_definitions):
    print(match.groups())
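If the goal is to look definitions up by interface name, the two captured groups can be collected into a dictionary. This is a minimal sketch that reuses the pattern from this answer. The helper name `extract_methods` and the condensed sample text are hypothetical, not part of the original code.

```python
import re

# Same pattern as in the answer, written as a single raw string.
RX_METHODS = re.compile(
    r'(\w+)\._methods_\s=\s\[(.*?(?:\[.*?\].*?)*)\]',
    re.DOTALL)


def extract_methods(source):
    """Map each interface name to the stripped text of its _methods_ list.

    Hypothetical helper: an interface with no methods maps to ''.
    """
    return {name: body.strip() for name, body in RX_METHODS.findall(source)}


# Condensed from the question; the indentation is assumed for illustration.
sample = """IXMLSerializerAlt._methods_ = [
    COMMETHOD([helpstring(u'Loads an object from an XML string.')], HRESULT, 'LoadFromString',
              ( ['in'], BSTR, 'XML' ),
              ( ['retval', 'out'], POINTER(POINTER(IUnknown)), 'obj' )),
]
INumberFormat._methods_ = [
]
"""

methods = extract_methods(sample)
for name, body in methods.items():
    print(name, 'empty' if not body else 'has definitions')
```

Note that this pattern is not anchored to line boundaries. It relies on every `[` inside the list having a matching `]` before the closing one.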