Question

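I have a program that parses a 100MB file, and then I apply some functions to the data. To check where the bottleneck is, I left those functions unimplemented...

So I just commented out my implementation and put pass in its place.

Why is Python using so much memory?

Parsing the file takes 15 minutes, and I can see Python using 3GB of memory, with CPU usage at 15% and memory usage at 70%.

Does this mean the program is I/O bound?

How can I fix the parsing? Or is there nothing to be done about the slow parsing?

File sample: age and salary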

50 1000
40 123
1233 123213

CODE：

def parse(pathToFile):
    myList = []
    with open(pathToFile) as f:
        for line in f:
            s = line.split()
            age, salary = [int(v) for v in s]
            Jemand = Mensch(age, salary)
            myList.append(Jemand)
    return myList

Answer 1

Your code could be sped up considerably:

with open(pathToFile) as f:
    for line in f:
        s = line.split()
        age, salary = [int(v) for v in s]
        Jemand = Mensch(age, salary)
        myList.append(Jemand)

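This is slow because of the explicit for loop, the repeated append calls, and the useless list comprehension that converts the fields to integers only to assign them to a fixed number of variables.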

It could become a quasi one-liner:

with open(pathToFile) as f:
    myList = [Mensch(*(int(x) for x in line.split())) for line in f]

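(This uses a list comprehension wrapped around a generator expression, with the converted values passed to the class as arguments through * unpacking.)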
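Here is a minimal, self-contained sketch of that one-liner so it can be tried without a 100MB file. The question does not show the Mensch class, so the definition below is an assumption (a plain class with two attributes), and parse_lines is a hypothetical helper that accepts any iterable of lines.

```python
import io


class Mensch:
    """Assumed stand-in for the class used in the question."""

    def __init__(self, age, salary):
        self.age = age
        self.salary = salary


def parse_lines(lines):
    # A single list comprehension builds the result; the inner generator
    # converts each whitespace-separated field to int, and * unpacks the
    # values as positional arguments to Mensch.
    return [Mensch(*(int(x) for x in line.split())) for line in lines]


# Sample data taken from the question; io.StringIO mimics an open file.
sample = io.StringIO("50 1000\n40 123\n1233 123213\n")
people = parse_lines(sample)
print([(p.age, p.salary) for p in people])
```

Note that a blank line in the input would make Mensch receive no arguments and raise a TypeError, so real data may need filtering first.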

Answer 2

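The poor performance you are observing may be caused by a bug in Python's garbage collector. To work around it, disable garbage collection while you build the list and turn it back on when you are done. For details, see this SO article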
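Here is a minimal sketch of that workaround, using the standard gc module. Mensch and parse_without_gc are illustrative names assumed here, not taken from the linked article. The try/finally makes sure collection is switched back on even if a line fails to parse.

```python
import gc


class Mensch:
    """Assumed stand-in for the class used in the question."""

    def __init__(self, age, salary):
        self.age = age
        self.salary = salary


def parse_without_gc(lines):
    # Remember the current state so we only re-enable what was enabled.
    was_enabled = gc.isenabled()
    # gc.disable() pauses only the cyclic collector; reference counting
    # still frees objects as usual.
    gc.disable()
    try:
        return [Mensch(*(int(x) for x in line.split())) for line in lines]
    finally:
        if was_enabled:
            gc.enable()


people = parse_without_gc(["50 1000", "40 123", "1233 123213"])
print(len(people), gc.isenabled())
```

Whether this helps depends on the Python version and the number of objects created, so it is worth timing both variants on the real file.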

Python - how to tell whether a process is I/O bound?
