Question

What is the difference between for line in fp and for line in fp.readlines()?

with open(filename, 'r') as fp:
    for line in fp.readlines():
        ...  # process line

# AND

with open(filename, 'r') as fp:
    for line in fp:
        ...  # process line

Answer 1

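file.readlines() "[reads] and [returns] a list of lines from the stream." So what you get is a list of all the lines: the entire file is read into memory and then split into lines.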

The documentation already points this out:

Note that it's already possible to iterate on file objects using for line in file: ... without calling file.readlines().

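So unless you actually need all the lines as a list, do not use readlines. Instead, iterate over the file directly, because IOBase, the base type of all file handlers, implements the iterator protocol: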

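IOBase (and its subclasses) supports the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream. Lines are defined slightly differently depending on whether the stream is a binary stream (yielding bytes), or a text stream (yielding character strings). See readline() below.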

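The benefit of using the iterator protocol is that the file is never read into memory in full. Instead, the stream is consumed iteratively and hands you one line after another, without keeping the rest of the file's contents in memory. So this works well even for very large files.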
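As a rough illustration of the two approaches described above, here is a minimal sketch. The function names count_lines_lazily and count_lines_eagerly are hypothetical and only serve the example; both return the same count, but only the second one builds the full list of lines first.

```python
import os
import tempfile


def count_lines_lazily(path):
    """Count lines by iterating over the file object itself.

    No list of all lines is ever built; lines are produced one by one.
    """
    count = 0
    with open(path, 'r') as fp:
        for _line in fp:
            count += 1
    return count


def count_lines_eagerly(path):
    """Count lines via readlines().

    The whole file is read and split into a list before len() is applied.
    """
    with open(path, 'r') as fp:
        return len(fp.readlines())


if __name__ == "__main__":
    # Write a small throwaway file so the sketch can run on its own.
    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
        tmp.write("first\nsecond\nthird\n")
        sample_path = tmp.name
    try:
        print(count_lines_lazily(sample_path))
        print(count_lines_eagerly(sample_path))
    finally:
        os.remove(sample_path)
```

For a file of a few lines the difference is negligible; it should only start to matter once the file is large relative to available memory.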

Answer 2

fp is the file object itself; you can iterate over it to get the lines in the file.

Example -

>>> f = open('test.csv','r')
>>> f
<_io.TextIOWrapper name='test.csv' mode='r' encoding='cp1252'>

You can only iterate over it; you cannot access a particular line in the file directly without using seek() or a similar function.

fp.readlines() returns a list of all the lines in the file; when you iterate over it, you are iterating over that list of lines.

Example -

>>> f = open('test.csv','r')
>>> lines = f.readlines()
>>> lines
['order_number,sku,options\n', '500,GK-01,black\n', '499,GK-05,black\n', ',,silver\n', ',,orange\n', ',,black\n', ',,blue']

Here, you can get the 2nd line of the file using lines[1], and so on.

Usually, if the requirement is just to iterate over the lines in the file, it's better to iterate over the file object directly, since creating a list of lines and then iterating over it adds unnecessary overhead.
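To make the access difference concrete, here is a small sketch that fetches the 2nd line both ways. It uses io.StringIO as a stand-in for an open text file and the first rows of the sample data shown earlier; the function names are hypothetical, and itertools.islice is just one possible way to skip ahead while iterating.

```python
import io
from itertools import islice

# Stand-in for the contents of test.csv (first three rows of the sample above).
SAMPLE = "order_number,sku,options\n500,GK-01,black\n499,GK-05,black\n"


def second_line_via_list(stream):
    """Read every line into a list, then index into it."""
    lines = stream.readlines()
    return lines[1]


def second_line_via_iteration(stream):
    """Walk the stream and stop as soon as the 2nd line has been produced."""
    # islice(stream, 1, 2) skips the item at index 0 and yields the item at index 1.
    return next(islice(stream, 1, 2), None)


if __name__ == "__main__":
    print(repr(second_line_via_list(io.StringIO(SAMPLE))))
    print(repr(second_line_via_iteration(io.StringIO(SAMPLE))))
```

The list version allows repeated random access afterwards; the iterating version stops reading once it has the line it wants.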
