Question

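The goal of my code is to iterate over each element in an array, convert the element to a string, and return the lines from another file that contain this string. My code is: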

    for element in myarray:
         elementstring=''.join(element)
         for line in myfile:
              if elementstring in line:
                  print line

When the code runs, it only works for the first element. Can someone explain why this happens?

Answer 1

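This happens because a file object is read sequentially: once you have iterated over its lines, you are at the end of the file and there are no lines left to read. On the second and later elements, the inner loop therefore has nothing to iterate over. You need to close the file and reopen it so that it is read again for each element.

Here is one way to do it: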

for element in myarray:
    elementstring=''.join(element)
    with open('path/to/myfile') as myfile:
        for line in myfile:
            if elementstring in line:
                print line

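Alternatively, if the file is small enough, you can reduce the runtime by caching its lines in memory, which avoids reading from disk several times, like this: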

myfile = [line.rstrip('\n') for line in open('path/to/myfile')]
for element in myarray:
    elementstring=''.join(element)
    for line in myfile:
        if elementstring in line:
            print line
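
The exhaustion described in this answer can be seen in isolation. The following is a minimal sketch, not the asker's actual setup: `io.StringIO` stands in for a real file and its contents are made up. It is written in Python 3 syntax.

```python
import io

# Hypothetical stand-in for a real file object; the contents are invented.
myfile = io.StringIO("alpha one\nbeta two\nalpha three\n")

# The first pass consumes every line and leaves the position at end of file.
first_pass = [line for line in myfile]

# The second pass starts at end of file, so it yields nothing.
second_pass = [line for line in myfile]
```

After this runs, `first_pass` holds three lines and `second_pass` is an empty list, which is why only the first element in the question produces any output.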

Answer 2

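You go through the file once... which moves the pointer to the end... you need to either reopen the file or call myfile.seek(0)... but your code has some other problems as well. It is hard to say more without seeing myarray.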
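
A rough sketch of the `seek(0)` variant, in Python 3 syntax. Since `myarray` is never shown in the question, the sample data below is purely an assumption (a list of lists of string fragments), and `io.StringIO` stands in for the real file.

```python
import io

# Assumed shape of myarray; the question never shows it.
myarray = [["al", "pha"], ["be", "ta"]]

# Hypothetical stand-in for the real file.
myfile = io.StringIO("alpha one\nbeta two\nalpha three\n")

matches = []
for element in myarray:
    elementstring = ''.join(element)
    myfile.seek(0)  # rewind to the start of the file before each pass
    for line in myfile:
        if elementstring in line:
            matches.append(line.rstrip('\n'))
```

Here the file is opened once and rewound for every element, so each element gets a full pass over the lines.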

Answer 3

with open(myfile) as f:
    lines = [x for x in f]  # store all lines in a list first
    for element in myarray:  # now iterate over myarray
        elementstring = ''.join(element)
        for line in lines:  # now iterate over each line from lines
            if elementstring in line:
                print line

Answer 4

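As others have said, a file is not a collection. A file is read sequentially, so you would need to use the seek function to return to the first line on each iteration.

In any case, this is not the best way to do what you want.

Reading from a file is usually slower than reading from RAM (even with caching), so it is better to make the file the main (outer) loop, so that it is read only once.

It is also better to precompute all the string values from the outer array.

Finally, there are many algorithms you might consider for searching a file (or a larger string) for a set of strings.

Here is an optimized version of the code: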

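# Build the search strings once, joined the same way as in the question
strs = [''.join(element) for element in myarray]
# Read the file only once; check every search string against each line
for line in open('path/to/myfile'):
    for elementstring in strs:
        if elementstring in line:
            print line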
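
As one possible illustration of searching for a set of strings at once, the sketch below combines the strings into a single regular expression. This is a swapped-in technique, not something the answer itself proposes; the sample `myarray` and `lines` are assumptions, and the code uses Python 3 syntax.

```python
import re

# Assumed sample data; the real myarray and file contents are not shown.
myarray = [["al", "pha"], ["be", "ta"]]
lines = ["alpha one", "gamma two", "beta three"]

strs = [''.join(element) for element in myarray]

# re.escape keeps any regex metacharacters in the strings literal.
pattern = re.compile('|'.join(re.escape(s) for s in strs))

# Keep each line that contains at least one of the strings.
matching = [line for line in lines if pattern.search(line)]
```

Note one behavioural difference: a line that contains several of the strings appears once in `matching`, whereas the nested loops above print it once per matching string.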

Python nested for loop to match strings from array elements against a file
