Question

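I just wrote my first Python program, and it works! It is a program that renames subtitle files to match the video files so that media players will pick up the subtitles (for example, a file named "TheOffice.S3E01.str" will be renamed to "The.Office.S03E01.str" if a file named "The.Office.S03E01.avi" is present in the directory).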

I'd love for someone to comment on or critique my code and help me improve my coding style and make it more Python-y. You can find the code at http://subtitle-renamer.googlecode.com/hg/renombrador.py

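As I said, this is my first program in Python, so feel free to comment on:

Style (indentation, variable names, conventions, etc.)
Design
Python features I should be using but am not
My use of the libraries

Thanks!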

Answer 1

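1) The hash-bang line only works if it is the first line of the file.

2) Function and variable names are usually underscore_separated rather than camelCase.

3) Docstrings usually use triple quotes, even for one-line docstrings.

4) Your coding style is very functional, with lots of lambda, map, reduce, and so on, plus one instance of a four-line, triply nested list comprehension. I find this style hard to follow and would definitely unroll some of it.

5) As an example of the kind of problem (4) can lead to, there are a few places where the list comprehension structure makes you evaluate the same expression more than once: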

match = [regex.match(str) for regex in episodeRegExes if regex.match(str)]

and then

def getEpisodeTuple(iterable):
    episodeTuple = [episodeChunk(chunk)
        for chunk in iterable
        if episodeChunk(chunk)]
    if episodeTuple:
        assert len(episodeTuple) == 1
        return episodeTuple[0]
    else:
        return None

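Note that the list comprehension here evaluates episodeChunk(chunk) twice, and each call performs the regex match twice, so in your success case you match the regex four times.

In that last piece of code, (a) you've called a list a tuple, and (b) you build a list, then assert that it has only one element, and return that element. It would be simpler as: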

def getEpisodeTuple(iterable):
    for chunk in iterable:
        echunk = episodeChunk(chunk)
        if echunk:
            return echunk
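
To make the cost of the repeated call concrete, here is a minimal sketch. The `episode_chunk` function below is a hypothetical stand-in for the original `episodeChunk` (its regex and the call counter are assumptions, not the author's code); it only counts how often it is invoked.

```python
import re

# Hypothetical stand-in for episodeChunk: the pattern is an assumption.
EPISODE_RE = re.compile(r"[Ss](\d+)[Ee](\d+)")
calls = 0

def episode_chunk(chunk):
    """Return (season, episode) if chunk looks like S03E01, else None."""
    global calls
    calls += 1
    match = EPISODE_RE.fullmatch(chunk)
    if match:
        return int(match.group(1)), int(match.group(2))
    return None

def first_episode_comprehension(chunks):
    """Filter and map in one comprehension: two calls for every hit."""
    found = [episode_chunk(c) for c in chunks if episode_chunk(c)]
    return found[0] if found else None

def first_episode_loop(chunks):
    """Plain loop: one call per chunk, stopping at the first hit."""
    for chunk in chunks:
        result = episode_chunk(chunk)
        if result:
            return result
    return None

chunks = ["The", "Office", "S03E01", "avi"]

calls = 0
comprehension_result = first_episode_comprehension(chunks)
comprehension_calls = calls

calls = 0
loop_result = first_episode_loop(chunks)
loop_calls = calls

print(comprehension_result, comprehension_calls)
print(loop_result, loop_calls)
```

With these four chunks both versions return the same pair, but the comprehension makes five calls (one per chunk for the filter, plus one more for the hit) while the loop makes three and stops early.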

6) Learn more about the standard library. For example, this code:

def splitWithAny(string, delimiters):
    "Splits the string with any of the strings of delimiters"
    return reduce(
        lambda iterable, delim: reduce(
            lambda lst,chunk: lst + chunk.split(delim),
            iterable,
            []),
        delimiters,
        [string])

def splitName(fileName):
    "Splits the fileName into smaller and hopefully significant chunks"
    delimiters = [" ", ".", "_", "-"]
    return filter(None, splitWithAny(fileName, delimiters))

I think (I'm not certain, because of the reduce(lambda(reduce(lambda))) stuff...) this can be reduced to:

def splitName(fileName):
    return re.split("[ ._-]+", fileName)
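
One difference worth checking when making this replacement: the original `splitName` drops empty chunks via `filter(None, ...)`, whereas `re.split` returns an empty string when the name starts or ends with a delimiter. A minimal sketch that keeps the filtering, with `split_name` as a hypothetical snake_case rename:

```python
import re

# The hyphen is last in the character class so it is taken literally.
DELIMITERS = re.compile(r"[ ._-]+")

def split_name(file_name):
    """Split file_name on runs of space, dot, underscore, or hyphen."""
    # Drop the empty strings re.split yields for leading/trailing delimiters.
    return [chunk for chunk in DELIMITERS.split(file_name) if chunk]

print(split_name("The.Office.S03E01.avi"))
print(split_name("_The Office - S03E01_.avi"))
```

Both calls produce the same four chunks; without the filter, the second name would start with an empty string.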

Answer 2

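If you check the code against PEP 8, you'll find some style suggestions (automatic PEP 8 checkers are available).

Also, something like pyLint can be very helpful. The quality of my Python code has improved a lot through using these tools :) (I've set up hotkeys for them in my editor of choice and find them handy to run often).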

Answer 3

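A couple of comments:

Agree with Ned's comment about docstrings.
If you're using regex in one place, why not use it to handle the string splitting too?
This is confusing: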

return [
        (getEpisodeTuple(chunks),
        [chunk for chunk in chunks if not episodeChunk(chunk)])
        for chunks in [splitName(string) for string in stringiterable]]

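because you split the logic of a one-liner into several parts, but not clearly. If you think separating them is the right thing to do, then really separate them. Leaving it all in one expression does not stop the inner list from being generated. Remember that a loooong one-liner is just as hard to decipher as a loooooong method, if not harder.

See how it reads after converting it to: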

splitNames = [splitName(string) for string in stringiterable]
chunkChunks = lambda chunks: [chunk for chunk in chunks if not episodeChunk(chunk)]

return [(getEpisodeTuple(chunks), chunkChunks(chunks)) for chunks in splitNames]
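
A named `def` may read better than a lambda bound to a name (PEP 8 recommends a `def` in that case). The sketch below goes one step further and handles each name in a single pass. It rests on assumptions: `split_name` and `episode_chunk` are simplified, hypothetical stand-ins for the original helpers, and `parse_name` and `parse_names` are names invented for illustration.

```python
import re

EPISODE_RE = re.compile(r"[Ss](\d+)[Ee](\d+)")  # assumed episode pattern

def split_name(file_name):
    """Split on runs of space, dot, underscore, or hyphen; drop empties."""
    return [c for c in re.split(r"[ ._-]+", file_name) if c]

def episode_chunk(chunk):
    """Return (season, episode) for chunks like S03E01, else None."""
    match = EPISODE_RE.fullmatch(chunk)
    return (int(match.group(1)), int(match.group(2))) if match else None

def parse_name(file_name):
    """Return (episode, other_chunks) for one file name in a single pass."""
    episode = None
    others = []
    for chunk in split_name(file_name):
        parsed = episode_chunk(chunk)
        if parsed is None:
            others.append(chunk)
        elif episode is None:
            episode = parsed  # keep the first episode marker found
    return episode, others

def parse_names(file_names):
    """Apply parse_name to every name, preserving order."""
    return [parse_name(name) for name in file_names]

print(parse_names(["The.Office.S03E01.avi", "TheOffice.S3E01.str"]))
```

Each chunk is tested once, so the episode lookup and the filtering of the remaining chunks no longer repeat the same regex work.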

Answer 4

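#!/usr/bin/env python should be on the first line.
Use docstrings for multi-line comments and function descriptions.
Include a general description with the code, so we don't have to guess what it does.
Avoid one-line functions; be more descriptive with the structure.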
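
The conventions in this list can be sketched as a file skeleton. This is an illustration only: the function `is_subtitle` and its extension list are assumptions, not part of the original script.

```python
#!/usr/bin/env python
"""Rename subtitle files to match the video files in a directory.

A module-level docstring like this gives readers the general description.
"""

def is_subtitle(file_name):
    """Return True if file_name has a subtitle extension."""
    # The extension list is an assumption made for this example.
    return file_name.lower().endswith((".srt", ".sub"))

print(is_subtitle("The.Office.S03E01.srt"))
print(is_subtitle.__doc__)
```

The shebang sits on the very first line, names use underscore_separated style, and even the one-line docstring uses triple quotes, so tools such as `help()` can display it.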

My first Python program! Would anyone care to review it to help me improve?
