Question

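I'm having some trouble getting nested for loops to work the way I need. I've searched for an answer, and this seems to come up often, but since I'm still new to Python the explanations haven't helped much. I have a list of words and a file, and I want each word in the list to be checked against the file line by line, printing the line if it contains the word. At the moment, when I run the code, it only prints the lines containing the first word in the list and never moves on to the remaining words.

Could you offer some advice on how to make this work?

Note: I know 'Efficiency' is misspelled; that's an issue with the data source.

Edit: I need the lines to be grouped, so all lines containing ' Speed' are printed, then all lines containing ' Acceleration', and so on, until every line in the file that contains one of the words in SHEETS has been printed.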

SHEETS = [' Speed',' Acceleration',' Engine Power',' Instantaneous Fuel Effeciency',
          ' Average Fuel Effeciency',' Instantaneous MPG',' Average MPG',
          ' MAF air flow rate',' Accelerator pedal position E',
          ' Commanded throttle actuator']


with open('userdata.log','r',encoding = 'utf-8') as my_file:
    for label in SHEETS:
        for line in my_file:
            if label in line:
                print (line)

Output:

2014-09-20 14:08:41.165, Speed, 0, mph

2014-09-20 14:08:43.742, Speed, 0, mph

2014-09-20 14:08:47.872, Speed, 0, mph

2014-09-20 14:08:49.490, Speed, 0, mph

2014-09-20 14:08:51.007, Speed, 0, mph

2014-09-20 14:08:52.456, Speed, 0, mph

2014-09-20 14:08:53.888, Speed, 0, mph

2014-09-20 14:08:55.499, Speed, 0, mph

2014-09-20 14:08:57.288, Speed, 0, mph

2014-09-20 14:08:57.838, Speed, 0, mph

2014-09-20 14:08:58.355, Speed, 0, mph

2014-09-20 14:08:58.572, Speed, 0, mph

Answer 1

That's because the first time this loop runs:

for label in SHEETS:
    for line in my_file:

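it goes through the entire file and then stops (it doesn't "rewind" and start again from the top). So it takes the first word and searches the whole file for it. After that the file has already been read to the end, so the inner loop has nothing left to iterate over and none of your other words are ever found.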
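The exhaustion described above can be reproduced without a real file. This is a minimal sketch, assuming `io.StringIO` as a stand-in for the log file; the two sample lines are made up for illustration:

```python
import io

# Hypothetical in-memory stand-in for userdata.log.
fake_file = io.StringIO(
    "2014-09-20 14:08:41.165, Speed, 0, mph\n"
    "2014-09-20 14:08:41.200, Acceleration, 0, m/s2\n"
)

first_pass = [line for line in fake_file]   # reads every line
second_pass = [line for line in fake_file]  # position is already at the end

print(len(first_pass))   # 2
print(len(second_pass))  # 0
```

The second loop body never runs, which is what happens to every label after the first one in the question's code.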

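In your case, the simple solution is to flip the logic: for each line in the file, see whether it contains any of the words. That way you check every word against each line, rather than searching the whole file once per word, which is less efficient.

The end result is the same: you print any line that contains one of the words you're after. The implementation is simple, just swap the order of the loops: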

with open('userdata.log','r',encoding='utf-8') as my_file:
    for line in my_file:
        for label in SHEETS:
            if label in line:
                print(line)

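I need the lines to be grouped, so all lines containing ' Speed' are printed, then all lines containing ' Acceleration', and so on, until every line in the file that contains one of the words in SHEETS has been printed.

Ah, that's something else. For that you need a dictionary, which is Python's key/value storage container.

A dictionary is a place where you can store or group things and refer to them by a key.

In your case, you want to group together all the lines that match a word, so your keys will be the words and the "things" will be collections of lines. In the dictionary, each key has a list as its value (a list is one of several container types; another is the tuple).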

lines_by_word = {}  # This is how you create an empty dictionary
with open('userdata.log', 'r', encoding='utf-8') as my_file:
    for line in my_file:
        for label in SHEETS:
            if label in line:
                # Now we have a match; the next step is to collect it.
                # If this is the first time we have encountered this
                # word, we need to add it to the dictionary first.
                if label not in lines_by_word:
                    # An "in" test (a membership test) on a dictionary
                    # checks its keys. If the word isn't there yet,
                    # create an empty list for it.
                    lines_by_word[label] = []

                # Add the matching line to the list for that word
                lines_by_word[label].append(line)

# .items() is the Python 3 spelling (iteritems() only exists in Python 2)
for word, lines in lines_by_word.items():
    print('There are a total of {} lines for {}'.format(len(lines), word))
    for line in lines:
        print(line)
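The "create the list the first time a word is seen" step can also be left to the standard library. This is a sketch, assuming `collections.defaultdict` in place of the explicit membership test, with a shortened SHEETS and a made-up list of lines standing in for the file:

```python
from collections import defaultdict

SHEETS = [' Speed', ' Acceleration']  # shortened for the example

# Made-up sample lines standing in for the file.
sample_lines = [
    '2014-09-20 14:08:41.165, Speed, 0, mph\n',
    '2014-09-20 14:08:41.200, Acceleration, 0, m/s2\n',
    '2014-09-20 14:08:43.742, Speed, 0, mph\n',
]

grouped = defaultdict(list)  # a missing key starts out as an empty list
for line in sample_lines:
    for label in SHEETS:
        if label in line:
            grouped[label].append(line)

# Walk SHEETS rather than the dictionary so groups print in list order.
for label in SHEETS:
    for line in grouped[label]:
        print(line, end='')
```

Iterating over SHEETS for the output keeps the groups in the order the question asked for, regardless of which word appears first in the file.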

Answer 2

I think maybe you meant:

SHEETS = [' Speed',' Acceleration',' Engine Power',' Instantaneous Fuel Effeciency',
          ' Average Fuel Effeciency',' Instantaneous MPG',' Average MPG',
          ' MAF air flow rate',' Accelerator pedal position E',
          ' Commanded throttle actuator']


with open('userdata.log','r',encoding = 'utf-8') as my_file:
    for line in my_file:
        for label in SHEETS:
            if label in line:
                print (line)

Nested loops run from the outside in: for each line in the file, check whether any of the labels appears in that line.
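One thing to watch with this loop order: if a line ever contains two labels, it is printed twice. This is a sketch, assuming you want each matching line only once, using the built-in `any`; the overlapping labels and sample lines are made up for illustration:

```python
SHEETS = [' Average MPG', ' MPG']  # hypothetical overlapping labels

# Made-up sample lines standing in for the file.
sample_lines = [
    '2014-09-20 14:08:41.165, Average MPG, 21, mpg\n',
    '2014-09-20 14:08:41.200, Engine RPM, 900, rpm\n',
]

# any() stops at the first label found, so each line is kept at most once.
matched = [line for line in sample_lines
           if any(label in line for label in SHEETS)]

for line in matched:
    print(line, end='')
```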

Answer 3

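Not everything in Python supports repeated iteration. Broadly, there are two kinds of iterable: iterators, which you can iterate over only once, and reusable iterables, which you can iterate over as many times as you like. File objects fall into the first category.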
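The difference between the two categories can be seen with a plain list. This is a small sketch; the variable names are only for illustration:

```python
words = ['Speed', 'Acceleration']  # a list is a reusable iterable

print(list(words))  # ['Speed', 'Acceleration']
print(list(words))  # ['Speed', 'Acceleration'] again

one_shot = iter(words)   # an iterator over the same data
first = list(one_shot)
second = list(one_shot)  # already exhausted

print(first)   # ['Speed', 'Acceleration']
print(second)  # []

# An iterator returns itself from iter(); a list hands out a fresh iterator.
print(iter(one_shot) is one_shot)  # True
print(iter(words) is words)        # False
```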

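If getting the results in the particular order you expect matters, you can reset the file position to the beginning after each pass of the inner loop: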

with open('userdata.log','r',encoding = 'utf-8') as my_file:
    for label in SHEETS:
        for line in my_file:
            if label in line:
                print (line)
        my_file.seek(0)

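You might also consider swapping the order of the loops and collecting the lines into a list per label, then printing them afterwards. This will probably run faster because it does less I/O: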

labeled_lines = {label: [] for label in SHEETS}
with open('userdata.log','r',encoding = 'utf-8') as my_file:
    for line in my_file:
        for label in SHEETS:
            if label in line:
                labeled_lines[label].append(line)
                break
        else:
            # else on a loop means "if the loop didn't end with a break."
            # No label matched this line. ValueError stands in for
            # whatever exception suits your program.
            raise ValueError('no label found in line: {!r}'.format(line))
for label in SHEETS:
    for line in labeled_lines[label]:
        print(line)

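Finally, lines read from a file usually include the newline character at the end. (The only possible exception is the last line of the file.) Since print adds its own newline, you get a blank line after each line of output. You'll probably want to strip the newline to avoid this.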
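Stripping the newline could look like the following. This is a minimal sketch, assuming `io.StringIO` as a stand-in for the file and made-up contents:

```python
import io

# Hypothetical file contents; the last line has no trailing newline.
fake_file = io.StringIO("first line\nsecond line\nlast line without newline")

# rstrip('\n') removes a trailing newline if there is one and is a
# no-op otherwise, so the last line is handled too.
stripped = [line.rstrip('\n') for line in fake_file]

for line in stripped:
    print(line)  # print supplies the single newline

# Alternatively, keep the line as read and tell print not to add one:
# print(line, end='')
```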

Fixing nested for loops
