Question

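We have several huge files on disk (each larger than RAM). I want to read them line by line in Python and print the results to the terminal. I have gone through [1] and [2], but I am looking for a method that does not wait until the entire file has been read into memory.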

I would use these two commands:

cat fileName | python myScript1.py
python myScript2.py fileName

Answer 1

with open("myfile.txt", "r") as myfile:
    for line in myfile:
        # do something with the current line
        pass

or

import sys
for line in sys.stdin:
    # do something with the current line
    pass
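
Since the question uses both invocations, here is a rough sketch of how one script could serve either, assuming the per-line work is just counting lines that contain a keyword. The names `open_source` and `count_matching_lines` are made up for illustration, and an in-memory stream stands in for the huge file.

```python
import io
import sys


def open_source(argv):
    """Hypothetical helper: open the file named in argv[1], else use stdin."""
    if len(argv) > 1:
        return open(argv[1], "r")
    return sys.stdin


def count_matching_lines(stream, needle):
    """Hypothetical helper: count lines containing needle, one line at a time."""
    count = 0
    for line in stream:  # only the current line is held in memory
        if needle in line:
            count += 1
    return count


# Demonstration with an in-memory stream standing in for a huge file.
sample = io.StringIO("first\nimportant second\nthird important\n")
print(count_matching_lines(sample, "important"))
```

In a real script the last two lines would be replaced by something like `count_matching_lines(open_source(sys.argv), "important")`, so that both `cat fileName | python script.py` and `python script.py fileName` take the same path through the code.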

Answer 2

Iterate over the file object:

with open('huge.file') as hf:
    for line in hf:
        if 'important' in line:
            print(line, end='')  # line already ends with '\n'

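This takes O(1) memory with respect to the file size: only the current line (plus a small read buffer) is held at any time.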

To read from standard input, just iterate over sys.stdin instead of hf:

import sys
for line in sys.stdin:
    if 'important' in line:
        print(line, end='')  # line already ends with '\n'
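
One caveat on the O(1) claim: it holds as long as individual lines are small. If a file might contain extremely long lines (or no newlines at all), line iteration would still pull a whole line into memory. A possible workaround, sketched here under that assumption, is to read fixed-size chunks instead. `iter_chunks` and `count_newlines` are illustrative names, not part of any library.

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Hypothetical helper: yield pieces of at most chunk_size characters."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty string signals end of input
            break
        yield chunk


def count_newlines(stream, chunk_size=64 * 1024):
    """Hypothetical helper: count newlines without holding a full line."""
    return sum(chunk.count("\n") for chunk in iter_chunks(stream, chunk_size))


# Demonstration with an in-memory stream and a tiny chunk size.
sample = io.StringIO("a\nbb\nccc\n")
print(count_newlines(sample, chunk_size=4))
```

The trade-off is that chunk boundaries do not line up with line boundaries, so this suits work that does not need whole lines (counting, copying, hashing) better than keyword matching.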

Answer 3

if __name__ == '__main__':
    while True:
        try:
            a = input()  # raw_input() in Python 2
        except EOFError:
            break
        print(a)

This reads from stdin until EOF. To read a file using the second approach, you can use Tim's method,

i.e.

with open("myfile.txt", "r") as myfile:
    for line in myfile:
        print(line, end="")  # line already ends with "\n"
        # do something with the current line