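I'm using Python to count the number of lines in a file. One approach iterates over the file object with Python's built-in iterator; the other reads a chunk into a buffer and counts the '\n' characters in it. I found that the first approach is much faster than the second.
In my test, readline() took 8.6 s and readbytes() took 60.7 s.
Why is that?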
import time


def readline():
    start_time = time.time()
    count = 0
    with open("bigfile.txt", encoding="utf-8") as fin:
        # Iterating over the file object yields one line per step.
        for line in fin:
            count += 1
    print(count)
    print(time.time() - start_time)


def readbytes():
    start_time = time.time()
    count = 0
    with open("bigfile.txt", encoding="utf-8") as fin:
        while True:
            # Text mode: read(1024) returns up to 1024 characters, not bytes.
            buffer = fin.read(1024)
            if not buffer:
                break
            # Check every character of the chunk in a Python-level loop.
            for ch in buffer:
                if ch == "\n":
                    count += 1
    print(count)
    print(time.time() - start_time)
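
For comparison, here is a minimal sketch of a buffer-based variant that does not loop over individual characters in Python. It rests on assumptions that are not part of my test above: the file is opened in binary mode, the chunk size is 1 MiB rather than 1024, and the helper name count_newlines is made up for illustration. The idea is that bytes.count() scans each chunk inside the interpreter's compiled code, so the Python-level loop runs once per chunk instead of once per character. I have not timed it against the two functions above.

```python
import os
import tempfile


def count_newlines(path, chunk_size=1024 * 1024):
    """Count newline bytes in a file by scanning fixed-size binary chunks.

    Hypothetical helper for illustration; chunk_size is an arbitrary choice.
    """
    count = 0
    with open(path, "rb") as fin:  # binary mode: no decoding step
        while True:
            buffer = fin.read(chunk_size)
            if not buffer:  # empty bytes object means end of file
                break
            # bytes.count() searches the whole chunk in one call.
            count += buffer.count(b"\n")
    return count


if __name__ == "__main__":
    # Small self-contained demo on a temporary file.
    with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
        tmp.write(b"first\nsecond\nthird\n")
    try:
        print(count_newlines(tmp.name))
    finally:
        os.remove(tmp.name)
```

Note that this counts '\n' bytes, so the result can differ from the iterator version: a last line without a trailing newline is not counted here, and a lone '\r' line ending is not treated as a line break.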