Question

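I have a simple poller class (code snippet below) that retrieves files from multiple folders based on a regular expression. I try to catch OSError exceptions and ignore them, since files can be moved, deleted, have their permissions changed, and so on. During some tests (in which I created and deleted a large number of files), I noticed that when sorting the generator, exceptions raised inside the generator function (_get) are re-raised (?), and I had to add an extra try/except block to work around this.

Any idea why this happens? All comments and improvements are appreciated!

Thanks, Timmah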

Edit: Thanks to @ShadowRanger for pointing out the os.path function passed as the sortkey param.

Answer 1

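Posting an answer for posterity: based on psychic intuition (and confirmation in the comments), self._sortkey is trying to stat the files being sorted. While read permission on a directory is enough to get the names of the files it contains, it does not guarantee that you can stat those files; if you lack the necessary permissions for them, the stat call fails with an OSError.

Because sorted executes the key function outside the scope of the generator, nothing inside the generator raises the exception, so the generator cannot catch it. You need to pre-filter/precompute the stat value for each file (dropping the files that can't be stat-ed), sort on that value, and then strip the (no longer relevant) stat data. For example: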

from operator import itemgetter

def with_key(filenames, key):
    '''Generates computed_key, filename pairs

    Silently filters out files where the key function raises OSError
    '''
    for f in filenames:
        try:
            yield key(f), f
        except OSError:
            pass

# ... skipping to the `sorted` call in get ...
# Replace the existing sorted call with:
# map(itemgetter(1), strips the key, yielding only the file name
files = map(itemgetter(1),
            sorted(
                   # Use with_key to filter and decorate filenames with sortkey
                   with_key(self._get(maxitems), self._sortkey),
                   # Use key=itemgetter(0) so only sortkey is considered for
                   # sorting (making sort stable, instead of performing fallback
                   # comparison between filenames when key is the same)
                   key=itemgetter(0), reverse=self._sortreverse))

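It's basically performing the Schwartzian Transform (aka "decorate-sort-undecorate") manually. Normally, the key argument to Python's sorted / list.sort hides this complexity from you, but in this case, because exceptions are possible and items must be dropped when one occurs (and because you want to minimize race conditions by using the EAFP pattern), you have to do the work yourself.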
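To make the point above concrete, here is a minimal, self-contained sketch. It assumes a hypothetical `flaky_key` function standing in for a stat-based key such as os.path.getmtime, and a hypothetical `names` generator standing in for `_get`; neither comes from the original poller class. It shows that a try/except inside the generator does not see an OSError raised by the key function, and that decorating first does filter the bad items out.

```python
from operator import itemgetter

def flaky_key(name):
    # Hypothetical stand-in for a stat-based key; fails for "vanished" files
    if name.startswith("gone"):
        raise OSError("cannot stat %r" % name)
    return len(name)

def names():
    # This try/except only covers code that runs inside the generator body
    try:
        for n in ["ccc", "gone.txt", "a", "bb"]:
            yield n
    except OSError:
        pass

def with_key(items, key):
    # Decorate: compute the key here, where OSError can be caught and skipped
    for item in items:
        try:
            yield key(item), item
        except OSError:
            pass

# sorted() calls flaky_key itself, so the OSError reaches the caller
try:
    sorted(names(), key=flaky_key)
    leaked = False
except OSError:
    leaked = True

# Decorate, sort on the precomputed key only, then undecorate
decorated = sorted(with_key(names(), flaky_key), key=itemgetter(0))
result = [name for _, name in decorated]
```

Here `leaked` ends up True, while `result` contains only the names whose key could be computed, ordered by that key.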

Alternate solution using Python 3.5 (or 2.6-2.7 and 3.2-3.4 using the third-party scandir package):

You could avoid this problem with somewhat lower complexity and possibly better performance if you like (and, on Windows, or on Windows-like file systems that cache file metadata in the directory entry, you would even include unreadable files in your output as long as the directory is readable). On Windows, os.scandir (or, before 3.5, scandir.scandir) gets you the stat information cached in the directory entry "for free" (you only pay the RTT cost once per few thousand entries in the directory, not once per file). On Linux, the first call to DirEntry.stat caches the stat data, so making that call in _get means you can catch and handle the OSError there, populating the cache so that during sorting self._sortkey can use the cached data with no risk of an OSError. So you could do:

try:
    from os import scandir
except ImportError:
    from scandir import scandir

# Prestat will ensure OSErrors raised in _get, not in caller using DirEntry
def _get(self, maxitems=0, prestat=True, follow_symlinks=True):
    def customfilter(f):
        if self._exclude is not None and self._exclude.search(f):
            return False
        return self._regex is None or self._regex.search(f)

    count = 0
    for p in self.paths:
        if not os.path.isdir(p):
            raise PollException("'%s' is not a valid path." % (p,), p)
        if maxitems and count >= maxitems:
            break
        try:
            # Use scandir over listdir, and since we get DirEntrys, we
            # don't need to explicitly use os.path.join to make full paths
            # and we can use genexpr for validation instead
            for dirent in (de for de in scandir(p)
                           if customfilter(de.name) and self._validate(de.path)):
                # On Windows, stat() is cheap noop (returns precomputed data)
                # except symlink w/follow_symlinks=True (where it stats and caches)
                # On Linux, this will force a stat now, and cache the result
                # so OSErrors will only be raised here, not during sorting
                if prestat:
                    dirent.stat(follow_symlinks=follow_symlinks)
                if maxitems and count >= maxitems:
                    break
                count += 1
                yield dirent
        except OSError:
            '''
            There will be instances where we won't have permission on the
            file/directory or when a file is moved/deleted before it was yielded.
            '''
            continue

def get(self, maxitems=0):
    # Prestat if we have a sortkey (assuming it may use stat data)
    files = self._get(maxitems, prestat=self._sortkey is not None)
    if self._sortkey is not None:
        # self._sortkey must now operate on a os.DirEntry
        # but no more need to wrap in try/except OSError
        files = sorted(files, key=self._sortkey, reverse=self._sortreverse)

    # To preserve observable public behaviors, return path, not DirEntry
    for dirent in files:
        yield dirent.path

This requires a slight change in usage: self._sortkey must operate on an os.DirEntry instance, not a file path. So instead of self._sortkey = kwargs.get('sortkey', os.path.getmtime), you might have self._sortkey = kwargs.get('sortkey', lambda de: de.stat().st_mtime).

But it avoids the complexity of the manual Schwartzian Transform: access violations can only occur inside the try / except in _get, as long as you don't change prestat, so no OSErrors occur while the key is being computed. It will also likely run faster, because it iterates the directory lazily instead of building a complete list before iterating (admittedly a small benefit unless the directory is huge), and because it avoids the stat system call entirely for most directory entries on Windows.
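As a rough, standalone illustration of the pre-stat idea, the sketch below scans a temporary directory with os.scandir, calls DirEntry.stat() once per entry inside a try/except so that any OSError surfaces during the scan, and then sorts on the cached stat data. The helper name `scan_sorted_by_mtime`, the file names, and the timestamps are all made up for this example, and it assumes Python 3.6+ (where os.scandir can be used as a context manager).

```python
import os
import tempfile

def scan_sorted_by_mtime(path):
    # Hypothetical helper: pre-stat every entry while scanning, then sort
    entries = []
    with os.scandir(path) as it:
        for de in it:
            try:
                de.stat()  # fills the DirEntry cache; OSError is raised here
            except OSError:
                continue   # file vanished or is inaccessible: skip it
            entries.append(de)
    # The key reuses the stat result cached on each DirEntry
    entries.sort(key=lambda de: de.stat().st_mtime)
    return [de.name for de in entries]

with tempfile.TemporaryDirectory() as tmp:
    base = 1000000000  # arbitrary fixed timestamp for reproducible ordering
    for offset, name in [(300, "c.log"), (100, "a.log"), (200, "b.log")]:
        full = os.path.join(tmp, name)
        with open(full, "w") as fh:
            fh.write("x")
        os.utime(full, (base + offset, base + offset))
    ordered = scan_sorted_by_mtime(tmp)
```

Because the stat call happens during the scan, the sort itself needs no exception handling, which mirrors the role of the prestat flag in the answer's _get.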
