What is the difference between 'in fp' and 'in fp.readlines()'?

Time: 2015-06-26 16:02:10

Tags: python

What is the difference between for line in fp and for line in fp.readlines()?

with open(filename, 'r') as fp:
    for line in fp.readlines():
        ...  # process line

# AND

with open(filename, 'r') as fp:
    for line in fp:
        ...  # process line

2 answers:

Answer 0 (score: 4)

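file.readlines() "[reads] and [returns] a list of lines from the stream." So what you get is a list of all the lines: the whole file is read into memory and then split into lines.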

The documentation already says as much:

  

Note that it's already possible to iterate on file objects using for line in file: ... without calling file.readlines().

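So unless you actually need all the lines as a list, don't use readlines. Instead, iterate over the file directly, because IOBase, the base type of all file handlers, implements the iterator protocol: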

  

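IOBase (and its subclasses) supports the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream. Lines are defined slightly differently depending on whether the stream is a binary stream (yielding bytes), or a text stream (yielding character strings). See readline() below.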

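The benefit of using the iterator protocol is that the file is not read into memory in its entirety. Instead, the file stream is consumed iteratively, giving you one line after another without holding the rest of the file's content in memory. So this works well even for very large files.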
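To make the lazy behaviour concrete, here is a minimal sketch. The helper count_matching_lines and the temporary sample file are hypothetical, made up only for this illustration. It scans a file line by line and also checks that the file object is its own iterator, so next() works on it directly.

```python
import os
import tempfile

def count_matching_lines(path, needle):
    """Count lines containing `needle` without building a list of all lines."""
    count = 0
    with open(path, 'r') as fp:
        for line in fp:  # the file object yields one line per iteration
            if needle in line:
                count += 1
    return count

# Hypothetical sample file: 1000 lines, every tenth one reads "error".
with tempfile.NamedTemporaryFile('w', suffix='.log', delete=False) as tmp:
    for i in range(1000):
        tmp.write('ok\n' if i % 10 else 'error\n')
    sample_path = tmp.name

error_count = count_matching_lines(sample_path, 'error')

with open(sample_path, 'r') as fp:
    is_own_iterator = iter(fp) is fp  # iterator protocol: iter() returns the file itself
    first_line = next(fp)             # so next() can pull a single line

os.remove(sample_path)

print(error_count)      # 100
print(is_own_iterator)  # True
```

Only the current line (plus the file's internal read buffer) needs to be alive at any point in the loop, which is why the same function should cope with files far larger than this sample.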

Answer 1 (score: 3)

fp is the file object itself; you can iterate over it to get the lines in the file.

Example -

>>> f = open('test.csv','r')
>>> f
<_io.TextIOWrapper name='test.csv' mode='r' encoding='cp1252'>

You can only iterate over it; you cannot access a particular line in the file directly without using seek() or a similar function.
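If you need one particular line but do not want the full list, one option (not part of the original answer) is itertools.islice, which skips ahead by consuming the iterator. The helper nth_line below is a hypothetical name used for this sketch, and the sample data simply mirrors the test.csv contents used in this answer.

```python
import itertools
import os
import tempfile

def nth_line(path, index):
    """Return the line at zero-based `index`, or None if the file is shorter."""
    with open(path, 'r') as fp:
        # islice reads and discards the first `index` lines, then yields one.
        return next(itertools.islice(fp, index, index + 1), None)

# Hypothetical sample file with the same rows as the example CSV.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write('order_number,sku,options\n500,GK-01,black\n499,GK-05,black\n'
              ',,silver\n,,orange\n,,black\n,,blue')
    sample_path = tmp.name

second_line = nth_line(sample_path, 1)
missing_line = nth_line(sample_path, 100)
os.remove(sample_path)

print(repr(second_line))  # '500,GK-01,black\n'
print(missing_line)       # None
```

This still has to read every line before the requested one, so it saves memory rather than time.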

fp.readlines() returns a list of all the lines in the file; when you iterate over it, you are iterating over that list of lines.

Example -

>>> f = open('test.csv','r')
>>> lines = f.readlines()
>>> lines
['order_number,sku,options\n', '500,GK-01,black\n', '499,GK-05,black\n', ',,silver\n', ',,orange\n', ',,black\n', ',,blue']

Here, you can get the second line in the file using lines[1], and so on.

Usually, if the requirement is just to iterate over the lines in the file, it's better to iterate over the file object directly, since creating a list of all the lines and then iterating over that list adds unnecessary memory overhead.
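A rough way to see that overhead is to compare peak memory for the two loops with the standard tracemalloc module. This is only a sketch under assumptions: the helper names and the 50,000-line sample file are made up for the illustration, and the exact byte counts will vary by Python version and platform.

```python
import os
import tempfile
import tracemalloc

def peak_memory(func, path):
    """Run func(path) and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

def count_with_readlines(path):
    count = 0
    with open(path, 'r') as fp:
        for line in fp.readlines():  # the whole list exists during the loop
            count += 1
    return count

def count_with_iteration(path):
    count = 0
    with open(path, 'r') as fp:
        for line in fp:  # lines are produced and discarded one by one
            count += 1
    return count

# Hypothetical sample file, large enough for the difference to show up.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    for i in range(50000):
        tmp.write('line %d of the hypothetical sample file\n' % i)
    sample_path = tmp.name

count_list, peak_list = peak_memory(count_with_readlines, sample_path)
count_iter, peak_iter = peak_memory(count_with_iteration, sample_path)
os.remove(sample_path)

print(count_list, count_iter)
print(peak_list, peak_iter)
```

Both loops see the same lines, but the readlines() version should report a noticeably higher peak, because every line object is kept alive in the list until the loop ends.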