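I'm walking a directory tree and reading files line by line to find certain substrings.
The problem is that I want to skip some folders during the search (for example bin or build).
I tried adding a check, but those folders are still included in the search:
def step(ext, dirname, names):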
    for name in names:
        if name.lower().endswith(".xml") or name.lower().endswith(".properties") or name.lower().endswith(".java"):
            path = os.path.join(dirname, name)
            if not "\bin\"" in path and not "\build" in path:
                with open(path, "r") as lines:
                    print "File: {}".format(path)
                    i = 0
                    for line in lines:
                        m = re.search(r"\{[^:]*:[^\}]*\}", line)
                        with open("search-result.txt", "a") as output:
                            if m is not None:
                                output.write("Path: {0}; \n Line number: {1}; \n String: {2}\n".format(path, i, m.group()))
                        i+=1
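Answer 0 (score: 2)
I suspect your condition fails because \b is interpreted as a backspace character rather than a backslash followed by a "b". Also, it looks like you're escaping the quotation mark at the end of bin; is that intentional?
Try escaping the backslashes: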
if not "\\bin\\" in path and not "\\build" in path:
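To make the escaping issue concrete, here is a minimal sketch (written for Python 3) comparing the three spellings. The path in sample_path is a made-up example, not one taken from the question.

```python
# "\b" is a single backspace character (ASCII 8), so "\bin" is only 3 characters long.
broken = "\bin"
escaped = "\\bin"  # backslash, b, i, n
raw = r"\bin"      # raw string: the same four characters as `escaped`

# Hypothetical Windows-style path, used only for illustration.
sample_path = "C:\\project\\bin\\Main.java"

print(len(broken), len(escaped), len(raw))  # 3 4 4
print(broken in sample_path)                # False: no backspace character in the path
print(escaped in sample_path)               # True
```

Note that a raw string cannot end in a single backslash, so r"\bin\" is a syntax error; "\\bin\\" is the form that works if the trailing separator is needed.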
Answer 1 (score: 1)
Kevin's answer will do the job, but I'd prefer to let os.path.split() do the work (so the function still works if someone uses it on a different platform):
import os.path

def path_contains(path, folder):
    while path:
        path, fld = os.path.split(path)
        # Nothing in folder anymore? Check beginning and quit
        if not fld:
            if path == folder:
                return True
            return False
        if fld == folder:
            return True
    return False
>>> path_contains(r'C:\Windows\system32\calc.exe', 'system32')
True
>>> path_contains(r'C:\Windows\system32\calc.exe', 'windows')
False
>>> path_contains(r'C:\Windows\system32\calc.exe', 'Windows')
True
>>> path_contains(r'C:\Windows\system32\calc.exe', 'C:\\')
True
>>> path_contains(r'C:\Windows\system32\calc.exe', 'C:')
False
>>> path_contains(r'C:\Windows\system32\calc.exe', 'test')
False
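Applied to the original question, the same component-wise idea could look roughly like the sketch below (Python 3). The helper name has_excluded_component and the example paths are hypothetical; it is a simplified variant that splits on os.sep rather than the loop above. One side benefit of comparing whole components is that a folder such as builder is not mistaken for build, which a plain substring test would do.

```python
import os

def has_excluded_component(path, excluded=("bin", "build")):
    """Return True if any whole path component is listed in `excluded`."""
    # normpath collapses redundant separators; splitting on os.sep
    # then yields the individual components for the current platform.
    parts = os.path.normpath(path).split(os.sep)
    return any(part in excluded for part in parts)

# Hypothetical paths, built with os.path.join so they suit the current platform.
inside_bin = os.path.join("project", "bin", "Main.java")
lookalike = os.path.join("project", "builder", "Main.java")

print(has_excluded_component(inside_bin))  # True
print(has_excluded_component(lookalike))   # False
```

In the question's step function, the condition could then read: if not has_excluded_component(path).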