Question

I'm trying to open a file and count the number of lines in it.

The code I'm using for this is:

def line_Count(x):
    with open(x,'r') as iFile:  # Open passed in file
        lines = iFile.readlines() # Read each line in the file
        line_Count = len(lines) # Count the number of lines
        return line_Count       # Return the line count

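This works fine for small amounts of data (10,000 lines are counted in 0.073 seconds).

However, for a large file (1 million lines), it takes more than 15 minutes to finish.

Is there a faster way to do this?

The earlier examples I found are from 5 years ago, and some of the solutions in them have since been deprecated.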

Answer 1

Since you are dealing with large files, using xreadlines might give you a boost on Python 2 (the method does not exist in Python 3):

count = 0
for line in open(file_path).xreadlines(): count += 1

Or, since you are on Python 3, use a generator expression, which may use less memory:

count = sum(1 for i in open(file_path, 'rb'))

Or:

def blocks(files, size=65536):
    while True:
        b = files.read(size)
        if not b: break
        yield b

with open(file_path, "r",encoding="utf-8",errors='ignore') as f:
    print (sum(bl.count("\n") for bl in blocks(f)))
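
A variant of the block reader above, as a minimal sketch: opening the file in binary mode skips text decoding entirely, which may help on large files. The function name `count_newlines` and the default chunk size are my own choices for illustration, not something from the question.

```python
def count_newlines(path, chunk_size=1 << 16):
    """Count newline bytes in the file at `path`, reading fixed-size binary chunks."""
    count = 0
    with open(path, "rb") as f:
        # iter() with a sentinel stops as soon as read() returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            count += chunk.count(b"\n")
    return count
```

Like `wc -l`, this counts newline characters, so a last line with no trailing newline is not included in the total.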

Finally, you can "cheat" and use a subprocess:

import subprocess
int(subprocess.check_output(["wc", "-l", file_path]).split()[0])
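
If you go the `wc` route, it may be worth wrapping the call so it still works where `wc` is not installed. The following is only a sketch under that assumption: the helper name `count_lines_wc` is hypothetical, and the fallback simply counts newline bytes in Python.

```python
import shutil
import subprocess


def count_lines_wc(path):
    """Return the newline count reported by `wc -l`, or fall back to pure Python."""
    wc = shutil.which("wc")  # None when wc is not on PATH (e.g. stock Windows)
    if wc is None:
        with open(path, "rb") as f:
            return sum(chunk.count(b"\n") for chunk in iter(lambda: f.read(1 << 16), b""))
    # Passing a list avoids the shell, so spaces or quotes in the path are safe.
    out = subprocess.check_output([wc, "-l", "--", path])
    return int(out.split()[0])
```

Both branches count newline characters, so they should agree with each other on the same file.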

A faster way to count the number of lines in a file
