Question

I am trying to read a file, collect some lines, batch-process them, and then post-process the result.

Example:

with open('foo') as input:
    line_list = []
    for line in input:
        line_list.append(line)
        if len(line_list) == 10:
            result = batch_process(line_list)
            # something to do with result here
            line_list = []

    if len(line_list) > 0:  # the total number of lines is very likely not a multiple of 10, e.g. 11
        result = batch_process(line_list)
        # something to do with result here

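I don't want to duplicate the batch call and the post-processing, so I'd like to know whether it is possible to dynamically append some content to input, for example: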

with open('foo') as input:
    line_list = []
    # input.append("THE END")
    for line in input:
        if line != 'THE END':
            line_list.append(line)
        if len(line_list) == 10 or line == 'THE END':
            result = batch_process(line_list)
            # something to do with result here
            line_list = []

That way I would not need to duplicate the code in the if branch. Or is there a better way to know that I have reached the last line?

Answer 1

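If your input is not too large and fits comfortably in memory, you can read everything into a list, slice the list into sublists of length 10, and loop over them.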

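k = 10  # batch size
with open('foo') as input:
    lines = input.readlines()  # reads the whole file into memory
# split into sublists of at most k lines; the last one may be shorter
chunks = [lines[i:i+k] for i in range(0, len(lines), k)]
for chunk in chunks:
    result = batch_process(chunk)
    # something to do with result here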

If you want to append a marker to the input lines, you would also have to read all the lines first.
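
If the file is too large to hold in memory, the same chunking can probably be done lazily, without a sentinel line and without repeating the batch call. This is only a sketch: `batched_lines` is a hypothetical helper name, the `batch_process` stub stands in for the asker's function, and `io.StringIO` stands in for the opened file so the example runs on its own.

```python
import io
from itertools import islice


def batched_lines(lines, size):
    """Yield successive lists of up to `size` items from any iterable."""
    iterator = iter(lines)
    while True:
        # islice pulls at most `size` items; the final batch may be shorter
        batch = list(islice(iterator, size))
        if not batch:
            return  # the iterable is exhausted
        yield batch


def batch_process(batch):
    # Placeholder (assumption): the real function comes from the question
    return len(batch)


# Stand-in for open('foo'): a file-like object with 11 lines
source = io.StringIO("".join(f"line {n}\n" for n in range(11)))
results = [batch_process(batch) for batch in batched_lines(source, 10)]
print(results)  # [10, 1]
```

With a real file you would pass the object returned by `open('foo')` instead of `source`. The short final batch comes out of the same loop, so the post-processing only needs to be written once.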
