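Hello, I am trying to parse a script that contains file paths like the ones shown below. I want to use a regular expression to parse the file and store the data in a single string, with the file names separated by '\n'. A sample file is given below.
SAMPLE FILE: ('#' marks a comment; I would like to keep those lines commented out)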
add file -tls "../path1/path2/path3/example_1.edf"
add file -tls "../path1/path2/path3/example_1.v"
add file -tls "../path1/path2/path3/exa_4mple_1.sv"
add file -tls "../path1/path2/path3/example_1.vh"
#add file -tls "../path1/path2/path3/exa_0mple_1.vhd"
SAMPLE OUTPUT: (this example excludes the '\n' character)
example_1.v
exa_4mple_1.sv
example_1.vh
#exa_0mple_1.vhd
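How do I construct the regular expression (re) so that it matches only the extensions above and excludes the others? I would also like to know whether it is possible to capture the '#' on commented-out lines and prepend it to the file name.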
-Desired function:
for match in re.finditer(r'/([A-Za-z0-9_]+\..+)"', contents):
    mylist.append(match.group(1))
-Working Code: ( tested on the '.v' file case )
re.finditer(r'/([A-Za-z0-9_]+\.v)"', contents)
Answer 0 (score: 1)
No regular expression is needed:
>>> import os
>>> L = [
... "/path1/path2/path3/example_1.edf",
... "/path1/path2/path3/example_1.v",
... "/path1/path2/path3/exa_4mple_1.sv",
... "/path1/path2/path3/example_1.vh" ]
>>> for mypath in L:
...     if mypath.split('.')[-1] in ('v', 'sv', 'vh'):
...         print(os.path.split(mypath)[1])
...
example_1.v
exa_4mple_1.sv
example_1.vh
Or as a list comprehension:
>>> [os.path.split(mypath)[1]
... for mypath in L
... if mypath.split('.')[-1] in ('v', 'sv', 'vh')]
['example_1.v', 'exa_4mple_1.sv', 'example_1.vh']
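If you want to stay regex-free but work directly on the raw script lines (and keep the '#'), something along these lines might do. This is only a sketch: the function name `extract_names`, the `KEEP` set, and the assumption that the path is whatever sits between the first and last double quote on a line are mine, not from the question. I included `.vhd` because the sample output keeps that line.

```python
import os

# Extensions to keep. '.vhd' is an assumption based on the sample output.
KEEP = {'.v', '.sv', '.vh', '.vhd'}

sample = '''
add file -tls "../path1/path2/path3/example_1.edf"
add file -tls "../path1/path2/path3/example_1.v"
add file -tls "../path1/path2/path3/exa_4mple_1.sv"
add file -tls "../path1/path2/path3/example_1.vh"
#add file -tls "../path1/path2/path3/exa_0mple_1.vhd"
'''

def extract_names(contents):
    # Collect kept file names, prefixing '#' for commented-out lines.
    names = []
    for line in contents.splitlines():
        line = line.strip()
        prefix = '#' if line.startswith('#') else ''
        # Assumption: the path is the text between the first and last quote.
        first, last = line.find('"'), line.rfind('"')
        if first == -1 or first == last:
            continue  # no quoted path on this line
        path = line[first + 1:last]
        if os.path.splitext(path)[1] in KEEP:
            names.append(prefix + os.path.basename(path))
    return '\n'.join(names)

print(extract_names(sample))
```

Using `os.path.splitext` instead of `split('.')` compares the extension with its leading dot, so a file with no extension at all is never mistaken for a match.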
Answer 1 (score: 1)
import re
contents = '''
add file -tls "../path1/path2/path3/example_1.edf"
add file -tls "../path1/path2/path3/example_1.v"
add file -tls "../path1/path2/path3/exa_4mple_1.sv"
add file -tls "../path1/path2/path3/example_1.vh"
#add file -tls "../path1/path2/path3/exa_0mple_1.vhd"
'''
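print(contents)
# Raw string so the regex escapes (\. \w \s) are not treated as string escapes.
pat = r'^(#?)add file.+?"\.\./(?:\w+/)*(\w+?\.\w*v\w*)"\s*$'
# Join the optional '#' with the captured file name for each matching line.
gen = (''.join(mat.groups())
       for mat in re.finditer(pat, contents, re.MULTILINE))
print('\n'.join(gen))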
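This pattern captures paths whose extension contains the letter 'v', which is what I understood from your question.
Based on your example, I also used the string add file as a criterion for a match.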
I used \w in the pattern, which covers the characters in your [A-Za-z0-9_] class.
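For what it's worth, here is a small check of how `\w` relates to the `[A-Za-z0-9_]` class from the question. The helper name `is_word` is made up for this illustration. In Python 3, `\w` on a `str` pattern also matches non-ASCII letters unless `re.ASCII` is passed, which may or may not matter for your file names.

```python
import re

def is_word(text, ascii_only=False):
    # True when every character of text is matched by \w.
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(r'\w+', text, flags) is not None

print(is_word('exa_4mple_1'))                   # True: letters, digits, underscore
print(is_word('exa-mple'))                      # False: '-' is not a word character
print(is_word('caf\u00e9'))                     # True: Unicode letter accepted by default
print(is_word('caf\u00e9', ascii_only=True))    # False: restricted to [A-Za-z0-9_]
```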
With this pattern, every path must begin with ../
If any of these characteristics does not fit your case, we will change what needs to be changed.
Note that I put \s* at the end of the pattern in case there is whitespace on the line after the path.
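If "contains a 'v'" turns out to be too loose (the `\w*v\w*` form would also accept an extension such as `.csv`), one option is to list the extensions explicitly. The following is a sketch under my own assumptions: the names `PATTERN` and `extract_with_regex` are hypothetical, the extension list `v|sv|vh|vhd` is read off the sample output, and the path no longer has to start with `../`.

```python
import re

# Group 1: optional '#'. Group 2: file name with one of the listed extensions.
PATTERN = re.compile(
    r'^(#?)add file.+?"(?:[^"/]*/)*(\w+\.(?:v|sv|vh|vhd))"\s*$',
    re.MULTILINE,
)

sample = '''
add file -tls "../path1/path2/path3/example_1.edf"
add file -tls "../path1/path2/path3/example_1.v"
add file -tls "../path1/path2/path3/exa_4mple_1.sv"
add file -tls "../path1/path2/path3/example_1.vh"
#add file -tls "../path1/path2/path3/exa_0mple_1.vhd"
'''

def extract_with_regex(contents):
    # Join '#' (if present) with the file name, one entry per line.
    return '\n'.join(m.group(1) + m.group(2)
                     for m in PATTERN.finditer(contents))

print(extract_with_regex(sample))
# example_1.v
# exa_4mple_1.sv
# example_1.vh
# #exa_0mple_1.vhd
```

Adding or removing an extension is then just a matter of editing the alternation inside `(?:...)`.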