Question

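I have to write a large amount of data to a big tab-delimited file with thousands of rows and columns. Which is the better approach:

Use with open(outfile, "w") as x: once at the start, then write each long line to the file as soon as it has been computed.
Compute each line and append it as soon as it is ready, calling with open(outfile, "a") as x: again for every line and closing the file after each one.

PS: Does with open have any disadvantage compared to plain open in terms of memory usage?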

Answer 1

Reopening the same file again and again will obviously take more time:

bruno@bigb:~/Work/playground$ python opentest.py
each : 
11.1244959831
once : 
0.124312162399
bruno@bigb:~/Work/playground$ cat opentest.py

def each(data):
    for whatever in data:
        with open("opentest-each.dat", "a") as f:
            f.write(whatever)

def once(data):
    with open("opentest-once.dat", "a") as f:
        for whatever in data:
            f.write(whatever)

def main():
    import timeit

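    # list(...) keeps data reusable across the 100 timed runs on Python 3,
    # where map() returns a one-shot iterator.
    t1 = timeit.Timer("each(data)", "from opentest import each; data=list(map(str, range(10000)))")
    print("each : ")
    print(t1.timeit(100))

    t2 = timeit.Timer("once(data)", "from opentest import once; data=list(map(str, range(10000)))")
    print("once : ")
    print(t2.timeit(100))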

if __name__ == "__main__":
    main()

Regarding memory usage, using with open(...) should not make any significant difference (if it makes any difference at all).
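
To illustrate why memory use should be about the same: a with block behaves roughly like try/finally around the very same file object, so nothing extra is buffered. The sketch below is only an illustration; write_with and write_plain are hypothetical helper names, not part of the original benchmark.

```python
import os
import tempfile


def write_with(path, rows):
    # Context-manager form: the file is closed when the block exits,
    # even if a write raises.
    with open(path, "w") as f:
        for row in rows:
            f.write(row)
    return f.closed


def write_plain(path, rows):
    # Plain open(): the same file object, closed by hand in finally.
    f = open(path, "w")
    try:
        for row in rows:
            f.write(row)
    finally:
        f.close()
    return f.closed


if __name__ == "__main__":
    tmpdir = tempfile.mkdtemp()
    rows = ["1\t2\n", "3\t4\n"]
    print(write_with(os.path.join(tmpdir, "with.tsv"), rows))
    print(write_plain(os.path.join(tmpdir, "plain.tsv"), rows))
```

Both functions open the file once and write through the same kind of object; the with form just makes the cleanup automatic.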

Now note that if your code runs as a command-line script, the best solution is to write to sys.stdout and use your shell to redirect stdout to a file.
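
A minimal sketch of that stdout approach, assuming each row can be computed independently. compute_row is a hypothetical stand-in for the real per-row computation, and the row and column counts are arbitrary.

```python
import sys


def compute_row(i, n_cols):
    # Hypothetical placeholder for the real (expensive) row computation.
    return [i * j for j in range(n_cols)]


def write_rows(out, n_rows, n_cols):
    # Write each row as soon as it is computed; `out` is any object
    # with a write() method, such as sys.stdout or an open file.
    for i in range(n_rows):
        out.write("\t".join(map(str, compute_row(i, n_cols))))
        out.write("\n")


if __name__ == "__main__":
    write_rows(sys.stdout, 3, 4)
```

Saved under a hypothetical name such as make_table.py, it would be run as python make_table.py > outfile.tsv, so the shell opens the output file exactly once. Passing the stream in as a parameter also means the same function can write to a regular file opened with with open(...).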

Writing large files: open once, or reopen for each write?