Question

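I wrote a script that reads data from two different files and proceeds accordingly. However, when I wrote the script, I was under the impression that the first file I read contained only two lines. Unfortunately, that has changed.

My code extracts the first two lines and passes the data to another function, which then carries out the computation through several other functions.

Right now I am doing something like this: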

file = open(myfile, 'r')
try:
    for line in file:
        # Lines starting with '|' hold the data; any other line is a name.
        if line[0] != '|':
            name = line.strip('\n')
        else:
            data = line.strip('|\n')
finally:
    file.close()

The file usually looks like this:

Samantha
|j&8ju820kahu9|

Now, unfortunately, I may get a file with multiple records, like this:

Andy
|o81kujd0-la88js|
Mathew
|a992kma82nf-x01j4|
Andrew
|01ks83nnz;a82jlad|

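Is there a way to extract two lines at a time from the file, process them, and then move on to the next two? That is: grab the first two lines, assign them to name and data, pass them to my function, print what is needed, then get the next two lines, and so on.

Please advise.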

Answer 1

Yes, because a file object is also an iterator:

with open(filename, 'r') as f:
    # Both arguments are the same iterator, so each step consumes two lines.
    for l1, l2 in zip(f, f):
        # ... do something with l1 and l2
        pass

This is the shortest, most Pythonic way.
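
To make the pairing behaviour concrete, here is a minimal Python 3 sketch that runs the same loop over an in-memory file. The sample text, the use of io.StringIO as a stand-in for the real file, and the read_pairs helper name are assumptions made for illustration only.

```python
import io

# Stand-in for the real file: two name/data records.
SAMPLE = "Andy\n|o81kujd0-la88js|\nMathew\n|a992kma82nf-x01j4|\n"


def read_pairs(f):
    """Return a list of (name, data) tuples from an iterator of lines."""
    pairs = []
    # zip pulls from the same iterator twice per step,
    # so l1 and l2 are always consecutive lines.
    for l1, l2 in zip(f, f):
        pairs.append((l1.strip("\n"), l2.strip("|\n")))
    return pairs


pairs = read_pairs(io.StringIO(SAMPLE))
print(pairs)
```

Note that with an odd number of lines, zip stops early and the unpaired last line is silently dropped.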

Answer 2

Your solution could be:

data = {}
with open(filename) as f:
    for name, value in zip(f, f):
        data[name] = value

For an explanation of how zip behaves with iterators, see the documentation.

Also, here is a recipe from the itertools documentation:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
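
As a rough usage sketch in Python 3, the recipe can be applied to an odd number of lines to see how the last group is padded. The recipe is repeated here so the example runs on its own, and the sample list is an assumption for illustration.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    # n references to ONE iterator: each output tuple takes n consecutive items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


lines = ["Andy", "|o81kujd0-la88js|", "Mathew"]  # odd number of lines
groups = list(grouper(lines, 2))
print(groups)
```

Unlike plain zip, the unpaired line is kept, with the fill value (None by default) standing in for the missing partner.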

Answer 3

Of course you can.

odd = False
with open(...) as f:
    while True:
        try:
            line_1 = next(f)
        except StopIteration:
            break  # Clean end of file.
        try:
            line_2 = next(f)
        except StopIteration:
            odd = True  # line_1 has no partner.
            break
        # ... do something with the pair of lines
if odd:
    complain("The file did not contain an even number of lines")
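
The same idea can be packaged as a generator so the pairing logic is separate from the processing. This is only a sketch in Python 3: the iter_pairs name is hypothetical, and raising ValueError for an unpaired line is one possible way to "complain".

```python
def iter_pairs(lines):
    """Yield (first, second) tuples; raise ValueError on an unpaired line."""
    it = iter(lines)
    for first in it:
        try:
            second = next(it)
        except StopIteration:
            # The input ended right after `first`, so it has no partner.
            raise ValueError("unpaired line: %r" % first) from None
        yield first, second


result = list(iter_pairs(["Andy", "|o81kujd0-la88js|", "Mathew", "|a992kma82nf-x01j4|"]))
print(result)
```

A file object can be passed directly, for example `for name, data in iter_pairs(f): ...` inside a `with open(...)` block.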

Answer 4

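You can use list slice notation, list[<begin>:<end>:<step>], to skip list elements while iterating. If your file is small, you can read it into memory all at once with readlines().

Consider something like the following. Don't use file as the name of the file handle; it shadows the built-in file (a built-in in Python 2).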

In [9]: a = my_file.readlines()
In [10]: for name_line, data_line in zip(a[::2], a[1::2]):
   ...:     name = name_line.strip('\n')
   ...:     data = data_line.strip("|\n")
   ...:     print(name)
   ...:     print(data)
   ...:
Andy
o81kujd0-la88js
Mathew
a992kma82nf-x01j4
Andrew
01ks83nnz;a82jlad

In [11]:

(Personally, I would do something like a regular expression match.)
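
The answer does not show what such a regular expression would look like. One possible sketch in Python 3, assuming the whole file fits in memory and every record is a name line followed by a line wrapped in '|', is given below. The pattern and the TEXT and RECORD names are assumptions, not part of the original answer.

```python
import re

# Stand-in for the result of my_file.read().
TEXT = (
    "Andy\n|o81kujd0-la88js|\n"
    "Mathew\n|a992kma82nf-x01j4|\n"
    "Andrew\n|01ks83nnz;a82jlad|\n"
)

# Group 1: a line that does not start with '|' (the name).
# Group 2: the contents of the next line, between the enclosing '|' characters.
RECORD = re.compile(r"^([^|\n][^\n]*)\n\|([^\n]*)\|$", re.MULTILINE)

records = RECORD.findall(TEXT)
for name, data in records:
    print(name, data)
```

Lines that do not fit the name/data shape are skipped rather than reported, so this suits input whose format is trusted.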

Answer 5

Try this:

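from itertools import islice
with open(filename, 'r') as infile:
    while True:
        # Take the next N lines; an empty list means end of file.
        current_slice = list(islice(infile, N))
        if not current_slice:
            break
        for line in current_slice:
            print(line)

Here N is the number of lines you want to process at a time. islice returns a lazy iterator over the next N lines of the file, and current_slice holds those lines as a list that can be used in a loop. With N = 2 this gives you two lines at a time. Instead of printing, you can perform your own operation and then continue with the next two lines.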

Another option is:

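from itertools import zip_longest  # izip_longest in Python 2

def grouper(iterable, n, fillvalue=None):
    # itertools recipe: n references to one iterator yield fixed-length chunks
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

with open(filename) as f:
    for lines in grouper(f, N, ''):
        for line in lines:
            pass  # process N lines here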
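
The two options treat a trailing partial group differently: grouper pads it with the fill value, while the islice approach returns a shorter chunk. A minimal Python 3 sketch of the islice variant wrapped in a reusable generator follows; the chunks name is hypothetical and the sample list stands in for a file.

```python
from itertools import islice


def chunks(iterable, n):
    """Yield lists of up to n items; the last list may be shorter."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return  # the underlying iterator is exhausted
        yield chunk


result = list(chunks(["a", "b", "c", "d", "e"], 2))
print(result)
```

Checking `len(chunk) < n` inside the loop is a simple way to detect a file whose line count is not a multiple of N.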

Reading a file in chunks
