Question

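While iterating with for line in f:, my code saves the lines that contain particular data. Unfortunately, I have to read the whole file rather than only the data that matters most. Then, in a second pass, I have to go through the whole file again (between 5000 and 8000 lines) until I reach the right line, and I do that many times (once for each piece of data).

So my question is: is it possible to open a file, go to a specific line, read it, and then do the same again? I have seen various answers, but I cannot store the whole file in a str because my device does not have that much memory... That is why I want to search directly in the file.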

Answer 1

Use iterators and generators. File xreadlines (Python 2) is lazily evaluated, so the file is not loaded into memory before it is consumed:

def drop_and_get(skipping, it):
    # discard the first `skipping` items, then return the next one
    for _ in xrange(skipping):
        next(it)
    return next(it)

f = xrange(10000)  # let's say your file is this generator
drop_and_get(500, iter(f))  # returns 500

So you can do this:

with open(yourfile, "r") as f:
    your_line = drop_and_get(5000, f.xreadlines())
    print your_line

In fact, you can even skip xreadlines, because the file object is itself an iterator:

with open(yourfile, "r") as f:
    your_line = drop_and_get(5000, f)
    print your_line
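
For readers on Python 3, where xrange, xreadlines and the print statement no longer exist, the same idea can be sketched with itertools.islice. This is only a rough equivalent of the helper above, not the answerer's own code; the range object stands in for an open file, and any iterator of lines should behave the same way.

```python
from itertools import islice

def drop_and_get(skipping, it):
    """Skip `skipping` items of the iterator, then return the next one."""
    # islice consumes the skipped items lazily, one at a time.
    # next() raises StopIteration if the iterator has fewer than
    # skipping + 1 items left.
    return next(islice(it, skipping, None))

# Stand-in for a file object (assumption: any iterator works the same way).
result = drop_and_get(500, iter(range(10000)))
print(result)  # 500
```

With a real file, the call would look like drop_and_get(5000, f) inside a with open(...) as f: block, and only one line at a time is held in memory.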

Answer 2

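Daniel's solution is very good. A simpler alternative is to loop over the file handle and break when you reach the desired line. Then you can resume the loop to actually process the lines.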

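Note that there is no miracle here unless the size of the lines does not change (in which case you can remember the file position and seek to it later): you have to read all the file data from the start. You just don't have to store it in memory with readlines(). Never use readlines().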
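
The "remember the file position and seek later" idea can be sketched as follows, assuming the file is not modified between the indexing pass and the lookups. The helper names build_line_offsets and read_line_at are hypothetical, and the temporary file only exists to make the example self-contained. One pass records the byte offset of every line (a small list of integers, not the file contents); later lookups jump straight to a line.

```python
import os
import tempfile

def build_line_offsets(path):
    """Return a list where offsets[i] is the byte offset of line i (0-based)."""
    offsets = []
    with open(path, "rb") as f:  # binary mode so tell()/seek() are plain byte offsets
        while True:
            pos = f.tell()
            if not f.readline():  # empty bytes means end of file
                break
            offsets.append(pos)
    return offsets

def read_line_at(path, offsets, lineno):
    """Seek directly to line `lineno` (0-based) and return it without its newline."""
    with open(path, "rb") as f:
        f.seek(offsets[lineno])
        return f.readline().decode("utf-8").rstrip("\r\n")

# Demo on a throwaway file (assumption: stands in for the real data file).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join("line %d\n" % i for i in range(8000)))
    demo_path = tmp.name

offsets = build_line_offsets(demo_path)
line_5000 = read_line_at(demo_path, offsets, 5000)
os.remove(demo_path)
print(line_5000)  # line 5000
```

The index costs one full read of the file, which the answer says cannot be avoided; after that, each lookup is a single seek instead of a rescan.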

Here is my naive approach, which uses no generators or anything complicated, yet is efficient and simple:

# skip the first 5000 lines (f is the already-open file object)
for i, line in enumerate(f):
    if i == 4999:  # the 5000th line has just been consumed
        break

# process the rest of the file
for line in f:
    print(line.rstrip())

Answer 3

Below you can find my code:

with open(leases_file, 'r') as f:
    for line in f:
        pass  # save the line numbers
    for l in list_ip.values():  # do it for each saved line
        f.seek(0)  # go back to the beginning
        for i, line in enumerate(f):
            # look for the right line
            if i == (l - 1):  # l contains the line number
                break
        for line in f:
            pass  # read the data

I tried again this morning; maybe it is because of my 'f.seek(0)'? That is the only difference between my code and yours.
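
Since each f.seek(0) in the code above restarts the scan from the top, one possible alternative is to collect every wanted line in a single pass. This is only a sketch under assumptions: lines_at and wanted are hypothetical names, line numbers are taken to be 0-based, and an in-memory StringIO stands in for the leases file.

```python
import io

def lines_at(f, wanted):
    """Yield (line_number, line) for each 0-based number in `wanted`, in one pass."""
    wanted = set(wanted)
    if not wanted:
        return
    last = max(wanted)
    for i, line in enumerate(f):
        if i in wanted:
            yield i, line.rstrip("\n")
        if i >= last:  # nothing left to find, stop reading early
            break

# Stand-in for the real file (assumption: 8000 short lines).
demo = io.StringIO("".join("lease %d\n" % i for i in range(8000)))
found = dict(lines_at(demo, [10, 5000, 7999]))
print(found[5000])  # lease 5000
```

Only the requested lines are kept, so memory use stays proportional to the number of saved line numbers rather than to the file size.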

Going to a specific line in a file
