What is the difference between for line in fp and for line in fp.readlines()?
with open(filename, 'r') as fp:
    for line in fp.readlines():
        pass  # process line
# AND
with open(filename, 'r') as fp:
    for line in fp:
        pass  # process line
Answer 0 (score: 4)
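file.readlines() "[reads] and [returns] a list of lines from the stream." So what you get is a list containing every line: the whole file is read into memory and then split into lines.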
The documentation already says as much:
Note that it's already possible to iterate on file objects using for line in file: ... without calling file.readlines().
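So unless you actually need all the lines as a list, don't use readlines. Iterate over the file directly instead: IOBase, the base type of all file handlers, implements the iterator protocol:
IOBase (and its subclasses) supports the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream. Lines are defined slightly differently depending on whether the stream is a binary stream (yielding bytes), or a text stream (yielding character strings). See readline() below.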
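As a small illustration of the bytes versus strings distinction in that quote, here is a sketch that uses in-memory streams (io.StringIO and io.BytesIO, both IOBase subclasses) as stand-ins for real files. The sample contents are made up for the example.

```python
import io

# Text stream: iteration yields str objects, one per line.
text_lines = list(io.StringIO("first\nsecond\n"))

# Binary stream: iteration yields bytes objects, one per line.
byte_lines = list(io.BytesIO(b"first\nsecond\n"))

print(text_lines)
print(byte_lines)
```

A file opened with mode 'r' behaves like the text stream here, and one opened with mode 'rb' behaves like the binary stream.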
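The benefit of using the iterator protocol is that the file is not read into memory in its entirety. Instead, the stream is consumed incrementally and hands you one line after another, without holding the rest of the file's contents in memory. This works well even for very large files.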
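To make the contrast concrete, here is a minimal sketch that counts the lines of a file both ways. The helper names count_lines_lazily and count_lines_eagerly, and the throwaway temp file, are assumptions made for this example, not part of the question.

```python
import os
import tempfile


def count_lines_lazily(path):
    """Count lines by iterating the file object, one line at a time."""
    count = 0
    with open(path, "r") as fp:
        for _ in fp:
            count += 1
    return count


def count_lines_eagerly(path):
    """Count lines via readlines(), which builds the full list first."""
    with open(path, "r") as fp:
        return len(fp.readlines())


# Build a small temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\nc\n")
    sample_path = tmp.name

lazy_count = count_lines_lazily(sample_path)
eager_count = count_lines_eagerly(sample_path)
os.remove(sample_path)

print(lazy_count, eager_count)
```

Both functions return the same count. The difference is that the eager version keeps every line in a list at once, which is what becomes costly for large files.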
Answer 1 (score: 3)
fp is the file object itself; you can iterate over it to get the lines in the file.
Example -
>>> f = open('test.csv','r')
>>> f
<_io.TextIOWrapper name='test.csv' mode='r' encoding='cp1252'>
You can only iterate over it; you cannot access a particular line in the file directly without using seek() or a similar function.
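If you only need one particular line and do not want to build the whole list, one option is itertools.islice, which skips ahead while still reading lazily. This is a sketch: nth_line is a hypothetical helper, and the in-memory stream stands in for an open file.

```python
import io
from itertools import islice


def nth_line(fp, index):
    """Return the line at zero-based index, or None if the file is shorter."""
    return next(islice(fp, index, index + 1), None)


# io.StringIO stands in for a file opened in text mode.
sample = io.StringIO("order_number,sku,options\n500,GK-01,black\n499,GK-05,black\n")
second = nth_line(sample, 1)

# Asking for a line past the end returns None instead of raising.
missing = nth_line(io.StringIO("only one line\n"), 5)

print(repr(second), missing)
```

The lines before the requested one are still read and discarded, so this saves memory rather than I/O.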
fp.readlines() returns a list of all the lines in the file; when you iterate over it, you are iterating over that list of lines.
Example -
>>> f = open('test.csv','r')
>>> lines = f.readlines()
>>> lines
['order_number,sku,options\n', '500,GK-01,black\n', '499,GK-05,black\n', ',,silver\n', ',,orange\n', ',,black\n', ',,blue']
Here, you can get the 2nd line in the file using lines[1], and so on.
Usually, if the requirement is just to iterate over the lines in the file, it's better to use the file object directly, since building a list of all the lines and then iterating over it adds unnecessary overhead.