Question

I tried the following code.

import re

regobj = re.compile(r"^.+\.(oth|xyz)$")

for test in ["text.txt", "other.oth", "abc.xyz"]:
    if regobj.match(test):
        print("Method 1:", test)

regobj = re.compile(r"^.+\.[^txt]$")

for test in ["text.txt", "other.oth", "abc.xyz"]:
    if regobj.match(test):
        print("Method 2:", test)

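I expected the second method to find every file that does not have the txt extension, but the way I tried it does not work. What am I doing wrong?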

Answer 1

A regular expression is overkill here. Use the str.endswith() method:

if not test.endswith('.txt'):
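As a minimal sketch, the check can be applied to the question's three file names like this. The `excluded` tuple is a hypothetical addition to show that `str.endswith` also accepts a tuple of suffixes.

```python
# File names taken from the question.
names = ["text.txt", "other.oth", "abc.xyz"]

# Keep every name that does not end in ".txt".
not_txt = [name for name in names if not name.endswith(".txt")]
print(not_txt)  # ['other.oth', 'abc.xyz']

# str.endswith also accepts a tuple of suffixes, so several
# extensions can be excluded in a single call.
excluded = (".txt", ".xyz")  # hypothetical exclusion list
not_excluded = [name for name in names if not name.endswith(excluded)]
print(not_excluded)  # ['other.oth']
```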

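Your regular expression uses a negated character class, which is a set of characters that must not match. Any single character that is not t or x satisfies that test. Because the class consumes exactly one character, ^.+\.[^txt]$ only matches names whose extension is a single character other than t or x, so none of your three test names match. You could instead match .txt explicitly and use not to exclude rather than include: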

regobj = re.compile(r"^.+\.txt$")

if not regobj.match(test):
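The following sketch runs both patterns side by side to show the difference. The name "note.a" is a hypothetical extra test case, added only because it is the kind of name the question's pattern does match.

```python
import re

# "note.a" is an added, hypothetical test name with a one-character extension.
names = ["text.txt", "other.oth", "abc.xyz", "note.a"]

# Pattern from the question: [^txt] consumes exactly one character.
single_char = re.compile(r"^.+\.[^txt]$")
question_hits = [n for n in names if single_char.match(n)]
print(question_hits)  # ['note.a']

# Match ".txt" explicitly and invert the result with `not`.
txt = re.compile(r"^.+\.txt$")
inverted_hits = [n for n in names if not txt.match(n)]
print(inverted_hits)  # ['other.oth', 'abc.xyz', 'note.a']
```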

If you can only use a regular expression, use a negative lookahead assertion:

regobj = re.compile(r"^[^.]+\.(?!txt$)[^.]+$")

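Here (?!...) matches only at a position that is not followed by the literal text txt running up to the end of the string. Then [^.]+ matches one or more characters that are not a . character, up to the end of the string.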
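A short check of the lookahead pattern, as a sketch. The names "notes.txtx" and "archive.tar.gz" are hypothetical additions: the first shows that only an extension of exactly txt is rejected, the second that the leading [^.]+ rules out names containing more than one dot.

```python
import re

# Pattern from this answer: a single dot, extension must not be exactly "txt".
not_txt = re.compile(r"^[^.]+\.(?!txt$)[^.]+$")

# The last two names are added, hypothetical test cases.
names = ["text.txt", "other.oth", "abc.xyz", "notes.txtx", "archive.tar.gz"]
matched = [name for name in names if not_txt.match(name)]
print(matched)  # ['other.oth', 'abc.xyz', 'notes.txtx']
```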

Answer 2

Change your second regular expression to:

regobj = re.compile(r"^.+\.(?!txt$)[^.]+$")

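[^txt] matches any single character that is not t or x. (?!txt$) asserts that the dot is not followed by txt and then the end of the string. The [^.]+ after the \. requires at least one character to be present after the dot. So this matches file names that have any extension other than .txt.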
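To illustrate the edge cases of this pattern, here is a small sketch. All names except "text.txt" are hypothetical test cases. Because the pattern requires a dot followed by at least one character, names with no extension are not matched, even though they do not end in .txt either.

```python
import re

# Pattern from this answer: the leading .+ allows dots earlier in the name.
any_dots = re.compile(r"^.+\.(?!txt$)[^.]+$")

# Hypothetical test names covering multi-dot and extension-less cases.
names = ["text.txt", "archive.tar.gz", "a.txt.bak", "README", "trailing."]
matched = [name for name in names if any_dots.match(name)]
print(matched)  # ['archive.tar.gz', 'a.txt.bak']
```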

Answer 3

As Martijn Pieters mentioned, a regex is overkill, considering there are other, more efficient ways:

fileName, fileExt = os.path.splitext(string)

Separating the extension with splitext is simple.

import os

fileDict = ["text.txt", "other.oth", "abc.xyz"]
matchExt = ".txt"

for eachFile in fileDict:
    fileName, fileExt = os.path.splitext(eachFile)
    if fileExt != matchExt:  # exact comparison; "not in" would be a substring test
        print("(not %s) %s %s" % (matchExt, fileExt, fileName))

You can easily add an else statement to handle the other extensions; I will leave that to you.
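One possible shape for that else branch, as a sketch rather than the answer's own code. The variable names and the extension-less "README" entry are assumptions added for illustration, and the extension is compared with == so that only an exact ".txt" counts as a match.

```python
import os

# "README" is an added, hypothetical name with no extension.
file_names = ["text.txt", "other.oth", "abc.xyz", "README"]
match_ext = ".txt"

matching, others = [], []
for file_name in file_names:
    # splitext returns (root, ext); ext includes the leading dot,
    # or is an empty string when there is no extension.
    root, ext = os.path.splitext(file_name)
    if ext == match_ext:
        matching.append(file_name)
    else:
        others.append(file_name)

print(matching)  # ['text.txt']
print(others)    # ['other.oth', 'abc.xyz', 'README']
```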

Finding files that do not have a given extension
