Question

I am trying to read some files starting from line 3, but I can't get it to work.

I tried using readlines() plus the index number of the line, as shown below:

x = 2
f = open('urls.txt', "r+").readlines( )[x]
line = next(f)
print(line)

But I get this result:

Traceback (most recent call last):
  File "test.py", line 441, in <module>
    line = next(f)
TypeError: 'str' object is not an iterator

I want to be able to set any line as the starting point, so that from there each call to next() moves on to the following line.

IMPORTANT: this is a new feature and all of my code already uses next(f), so the solution must work with it.

Answer 1

Try this (using itertools.islice):

from itertools import islice

f = open('urls.txt', 'r+')
start_at = 3
file_iterator = islice(f, start_at - 1, None)

# to demonstrate
while True:
    try:
        print(next(file_iterator), end='')
    except StopIteration:
        print('End of file!')
        break

f.close()

urls.txt:

1
2
3
4
5

Output:

3
4
5
End of file!

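This solution is better than readlines because it does not load the whole file into memory; lines are read only as they are needed. islice still has to read past the skipped lines, but it does so inside the iterator rather than in a Python-level loop, so it should be at least as fast as the manual skipping in @MadPhysicist's answer.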

Also, consider using the with statement to make sure the file gets closed:

with open('urls.txt', 'r+') as f:
    # do whatever
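
Putting the two suggestions together, a minimal sketch might look like the following. The helper name lines_from and the temporary sample file are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile
from itertools import islice


def lines_from(f, start_at):
    """Return an iterator over f starting at the 1-based line start_at (hypothetical helper)."""
    return islice(f, start_at - 1, None)


# Build a small sample file so the sketch runs on its own (stand-in for urls.txt).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1\n2\n3\n4\n5\n")
    sample_path = tmp.name

collected = []
with open(sample_path) as f:      # the file is closed when this block exits
    file_iterator = lines_from(f, 3)
    first = next(file_iterator)   # next() works because islice returns an iterator
    collected.append(first)
    collected.extend(file_iterator)  # drain the remaining lines

os.remove(sample_path)            # clean up the sample file
```

Note that the iterator reads lazily from the file, so it has to be consumed inside the with block, while the file is still open.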

Answer 2

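The readlines method returns a list of strings, one per line. So when you use readlines()[2], you get the third line as a string. Calling next on that string makes no sense, which is why you get the error.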

The simplest approach is to slice the list: readlines()[x:] gives a list of everything from line index x onward. You can then use that list however you like.

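If you have your heart set on iterators, you can use the iter built-in to turn a list (or almost any iterable) into an iterator. Then you can call next on it as often as you need.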
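
As a rough illustration of the slice-then-iter approach, here is a small sketch. It uses io.StringIO as a stand-in for an open file, and the sample contents are made up for the demonstration.

```python
import io

# io.StringIO stands in for an open file (assumption, to keep the demo self-contained).
f = io.StringIO("url1\nurl2\nurl3\nurl4\nurl5\n")

x = 2
lines = f.readlines()[x:]   # list of strings from index x onward
line_iter = iter(lines)     # a list is iterable, but next() needs an iterator

first = next(line_iter)     # the third line of the file
second = next(line_iter)    # the fourth line
```

Each further call to next(line_iter) returns the following line until the list is exhausted, at which point StopIteration is raised.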

Answer 3

The following expression returns a string:

open('urls.txt', "r+").readlines()[x]

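open returns a file object. Its readlines method returns a list of strings. Indexing with [x], where x is 2, returns the third line of the file as a single string.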

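The first problem is that you open the file without closing it. The second is that your index selects a single line rather than a range of lines running to the end of the file. Here is an incremental improvement: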

with open('urls.txt', 'r+') as f:
    lines = f.readlines()[x:]

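Now lines is a list of all the lines you want. But you first read the whole file into memory and then discard the first two lines. Also, a list is an iterable, not an iterator, so to use next on it you need one extra step: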

lines = iter(lines)

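If you want to take advantage of the fact that the file is already a fairly efficient iterator, call next as many times as needed to discard the lines you don't want: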

with open('urls.txt', 'r+') as f:
    for _ in range(x):
        next(f)
    # now use the file
    print(next(f))

After the for loop, any read operation on the file starts from the third line, whether it is next(f), f.readline(), or something else.
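
To see this behaviour in isolation, here is a small sketch that uses io.StringIO in place of a real file (an assumption made so the example runs on its own). It also shows one difference between the two calls: at the end of the data, readline() returns an empty string, whereas next() raises StopIteration.

```python
import io

# io.StringIO stands in for an open text file (assumption for a self-contained demo).
f = io.StringIO("a\nb\nc\nd\n")

x = 2
for _ in range(x):
    next(f)                 # discard the first x lines

third = next(f)             # continues from the third line
fourth = f.readline()       # readline() picks up where next() left off
at_eof = f.readline()       # no more data: readline() returns ""

try:
    next(f)                 # no more data: next() raises StopIteration
    raised = False
except StopIteration:
    raised = True
```

If the surrounding code relies on StopIteration to detect the end of the file, swapping in readline() would need an explicit check for the empty string instead.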

There are a few other ways to skip the first lines. In all cases, including the example above, next(f) can be replaced with f.readline():

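for n, _ in enumerate(f):
    if n == x - 1:  # stop after consuming exactly x lines (assumes x >= 1)
        break

or

for _ in zip(range(x), f): pass  # range goes first so zip stops before reading an extra line

After running either of these loops, next(f) returns the line at index x (counting from zero).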
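
How many lines a skipping loop consumes depends on details such as the order of zip's arguments, because zip pulls from its first argument before its second. The sketch below checks this with io.StringIO as a stand-in for an open file; the helper make_file and the sample contents are assumptions made for the demonstration.

```python
import io


def make_file():
    # io.StringIO stands in for a freshly opened file (hypothetical helper).
    return io.StringIO("line0\nline1\nline2\nline3\nline4\n")


x = 2

# range first: zip stops as soon as range is exhausted, without touching f again.
f = make_file()
for _ in zip(range(x), f):
    pass
after_range_first = next(f)

# file first: zip reads one more line from f before noticing range is exhausted.
f = make_file()
for _ in zip(f, range(x)):
    pass
after_file_first = next(f)

# enumerate: breaking when n == x - 1 consumes exactly x lines (for x >= 1).
f = make_file()
for n, _ in enumerate(f):
    if n == x - 1:
        break
after_enumerate = next(f)
```

With the file as the first argument, one extra line is silently dropped, so the next read lands one line later than intended.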

Answer 4

The following code lets you use the iterator to print the first line:

In [1]: path = '<path to text file>'

In [2]: f = open(path, "r+")

In [3]: line = next(f)

In [4]: print(line)

This code lets you print lines starting from line index x:

In [1]: path = '<path to text file>'

In [2]: x = 2

In [3]: f = iter(open(path, "r+").readlines()[x:])

In [4]: line = next(f)

In [5]: print(line)

Edit: revised the solution based on @Tomothy32's observation.

Answer 5

Just call next(f). (There's no need to complicate this with itertools, or to read the whole file with readlines.)

lines_to_skip = 3

with open('urls.txt') as f:
    for _ in range(lines_to_skip):
        next(f)

    for line in f:
        print(line.strip())

Output:

% cat urls.txt
url1
url2
url3
url4
url5

% python3 test.py
url4
url5
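
One thing to watch for with this approach is a file that has fewer lines than lines_to_skip, in which case a bare next(f) raises StopIteration. A possible variation, sketched here with a hypothetical helper named skip_lines and io.StringIO standing in for real files, passes a default to next() so that short files are handled quietly:

```python
import io


def skip_lines(f, count):
    """Advance f by up to count lines; return how many were skipped (hypothetical helper)."""
    skipped = 0
    for _ in range(count):
        if next(f, None) is None:   # the default avoids StopIteration on short files
            break
        skipped += 1
    return skipped


# A file shorter than the number of lines to skip (sample data is made up).
short = io.StringIO("url1\nurl2\n")
n_short = skip_lines(short, 3)
rest_short = list(short)

# A file with enough lines: the remaining lines can still be read with next() or a for loop.
full = io.StringIO("url1\nurl2\nurl3\nurl4\nurl5\n")
n_full = skip_lines(full, 3)
rest_full = [line.strip() for line in full]
```

Whether silently accepting a short file is the right behaviour depends on the program; raising an error may be preferable if a missing header indicates bad input.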

How to read with next() starting from any line in Python?
