Question

For example, given these files:

20190108JPYUSDabced.csv
20190107JPYUSDabced.csv
20190106JPYUSDabced.csv

When I search for the first two files from the terminal:

bash: ls /Users/Downloads/201901{08,07}JPYUSDabced.csv
it gives me the first two files (excluding 20190106JPYUSDabced.csv).

When I do the same thing in Python:

import glob
glob.glob('/Users/Downloads/201901{08,07}JPYUSDabced.csv')
it gives me []

Answer 1

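According to the documentation for the glob module, glob uses fnmatch.fnmatch under the hood. The {08,07} syntax is brace expansion, which bash performs itself before ls ever runs; it is not part of glob's pattern language. The only patterns the fnmatch documentation describes are: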

Pattern   |    Meaning
--------- | -----------------------------
*         | matches everything 
?         | matches any single character 
[seq]     | matches any character in seq 
[!seq]    | matches any character not in seq 

For a literal match, wrap the meta-characters in brackets. For example, '[?]' matches the character '?'.
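
A quick way to see the difference is to call fnmatch.fnmatch directly on the file names from the question. This is only a minimal sketch: the `names` list simply repeats the three example files, and nothing touches the file system.

```python
import fnmatch

names = [
    "20190108JPYUSDabced.csv",
    "20190107JPYUSDabced.csv",
    "20190106JPYUSDabced.csv",
]

# Braces have no special meaning to fnmatch, so this pattern would only match
# a file whose name literally contains "{08,07}".
brace_matches = [
    n for n in names if fnmatch.fnmatch(n, "201901{08,07}JPYUSDabced.csv")
]

# A bracket sequence matches exactly one character from the set.
bracket_matches = [
    n for n in names if fnmatch.fnmatch(n, "2019010[87]JPYUSDabced.csv")
]

print(brace_matches)    # []
print(bracket_matches)  # ['20190108JPYUSDabced.csv', '20190107JPYUSDabced.csv']
```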

Try using a character sequence in brackets instead:

glob.glob('/Users/Downloads/2019010[87]JPYUSDabced.csv')
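
A bracket sequence only covers single characters. If the alternatives are longer (say, whole two-digit days that share no common digit), one possible workaround is to run one glob per alternative and merge the results. The sketch below assumes that approach; `glob_any` is a hypothetical helper, not part of the glob module, and the demo uses a temporary directory instead of /Users/Downloads.

```python
import glob
import os
import tempfile


def glob_any(directory, prefix, alternatives, suffix):
    """Hypothetical helper: run one glob per alternative and merge the results."""
    matches = []
    for alt in alternatives:
        # glob.escape protects special characters in the directory path only;
        # prefix, alt and suffix are still treated as glob pattern text.
        pattern = os.path.join(glob.escape(directory), prefix + alt + suffix)
        matches.extend(glob.glob(pattern))
    return sorted(matches)


# Demo in a throwaway directory so the sketch does not depend on real files.
with tempfile.TemporaryDirectory() as tmp:
    for day in ("08", "07", "06"):
        open(os.path.join(tmp, f"201901{day}JPYUSDabced.csv"), "w").close()
    paths = glob_any(tmp, "201901", ["08", "07"], "JPYUSDabced.csv")
    found = [os.path.basename(p) for p in paths]

print(found)  # ['20190107JPYUSDabced.csv', '20190108JPYUSDabced.csv']
```

This behaves roughly like the shell's brace expansion followed by globbing, except that the results are sorted rather than returned in the order the alternatives were listed.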

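Using os.walk

If you want to search for a specific range of dates, you may want to combine os.walk with a regular expression from the re module, which lets you express the more complex patterns you are looking for.

Caveat: os.walk recursively traverses every directory below the starting location, which may not be what you want.

Either way, you will have to adapt the regex to your situation, but here is an example:

The regex matches either the date 20181208 or the date 20190107, and the file name must also contain the identifier JPYUSDabced.csv.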

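import os
import re

# Match either date followed by the fixed identifier; the dot is escaped so
# that it only matches a literal "."
regex = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")

files = []
for dirpath, dirnames, filenames in os.walk('/Users/Downloads'):
    for f in filenames:
        # fullmatch requires the whole file name to match, not just its start
        if regex.fullmatch(f):
            files.append(os.path.join(dirpath, f))
print(files)
# e.g. ['/Users/Downloads/20190107JPYUSDabced.csv', '/Users/Downloads/20181208JPYUSDabced.csv']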
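
If the recursion mentioned in the caveat is not wanted, one alternative is to list only the top-level directory with os.scandir and apply the same regex. This is a sketch under that assumption; `match_in_directory` is a hypothetical helper name, and the demo builds its own temporary directory with one nested file to show that subdirectories are skipped.

```python
import os
import re
import tempfile


def match_in_directory(directory, regex):
    """Hypothetical helper: paths of regular files directly inside
    `directory` whose names fully match `regex` (no recursion)."""
    with os.scandir(directory) as entries:
        return sorted(
            entry.path
            for entry in entries
            if entry.is_file() and regex.fullmatch(entry.name)
        )


regex = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "nested"))
    for rel in (
        "20190107JPYUSDabced.csv",
        "20190106JPYUSDabced.csv",
        os.path.join("nested", "20181208JPYUSDabced.csv"),
    ):
        open(os.path.join(tmp, rel), "w").close()
    found = [os.path.basename(p) for p in match_in_directory(tmp, regex)]

# The nested 20181208 file is not reported because only the top level is scanned.
print(found)  # ['20190107JPYUSDabced.csv']
```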

Why does Python's glob.glob not give me the files I want with the regex I passed in?
