Python: extract lines from a file using another file as the key

Time: 2014-04-02 14:37:27

Tags: python extract lines

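I have a 'key' file that looks like this (MyKeyFile):

afdasdfa ghjdfghd wrtwertwt asdf (these are in a column, but I never figured out the formatting, sorry)

I call these keys, and they are identical to the first word of the lines I want to extract from a "source" file. So the source file (MySourceFile) looks something like this (again, bad formatting, but first column = key, following columns = data):

afdasdfa(several tab-separated columns) . . ghjdfghd(several tab-separated columns) . wrtwertwt . . ASDF

where '.' stands for a line that is currently of no interest.

I am an absolute beginner in Python, and this is how far I have got: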

with open('MyKeyFile','r') as infile, \
open('MyOutFile','w') as outfile:
    for line in infile:
        for runner in source:
            # pick up the first word of the line in source
            # if match, print the entire line to MyOutFile
            # here I need help
outfile.close()

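I realize there are probably better ways to do this. All feedback is appreciated, whether along the lines of how I approached it or something more sophisticated.

Thanks, JD

2 answers:

Answer 0: (score: 1)

I think this would be a cleaner way, assuming your "key" file is called "key_file.txt" and your main file is called "main_file.txt":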

keys = []
with open("key_file.txt", "r") as my_file:  # "r" is for reading files, "w" is for writing to them
    for line in my_file:
        key = line.strip()  # drop the trailing newline, otherwise find() below almost never matches
        if key:  # skip blank lines; an empty key would match every line
            keys.append(key)
# now you have a list of strings called keys.
# take each line from the main text file and check whether it contains the text of any key.

with open("main_file.txt", "r") as new_file:
    for line in new_file:
        for key in keys:
            if line.find(key) > -1:
                print("I FOUND A LINE THAT CONTAINS THE TEXT OF SOME KEY", line)

You can change the print call to do whatever you want with the lines that contain the text of some key, or remove it. Let me know if it works.
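
One caveat: find() matches a key anywhere in the line, while the question asks for lines whose first word is a key. Here is a minimal sketch of the difference. The helper names and the sample lines are made up for illustration and are not from the question:

```python
def contains_any_key(line, keys):
    """True if any key occurs anywhere in the line (what find() checks)."""
    return any(line.find(key) > -1 for key in keys)


def first_word_is_key(line, keys):
    """True only if the first whitespace-separated word equals a key."""
    words = line.split()
    return bool(words) and words[0] in keys


# Made-up sample data, loosely modelled on the question.
keys = ["afdasdfa", "asdf"]
lines = [
    "afdasdfa\t1\t2",  # key is the first word
    "other\tasdf\t3",  # key appears, but not as the first word
    "asdfgh\t4",       # key is only a prefix of the first word
    "unrelated\t5",    # no key at all
]

for line in lines:
    print(contains_any_key(line, keys), first_word_is_key(line, keys), repr(line))
```

If only the first column should count, the second check is probably closer to what was asked; the substring check is more forgiving but can pick up extra lines.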

Answer 1: (score: 0)

As I understand it (please correct me in the comments if I am wrong), you have 3 files:

  1. MySourceFile
  2. MyKeyFile
  3. MyOutFile
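And you want to:

  1. Read the keys from MyKeyFile
  2. Read the source from MySourceFile
  3. Iterate over the lines in source
  4. If the first word of a line is in keys: append that line to MyOutFile
  5. Close MyOutFile

So here is the code: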

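with open('MySourceFile', 'r') as sourcefile:
    source = sourcefile.read().splitlines()

with open('MyKeyFile', 'r') as keyfile:
    keys = set(keyfile.read().split())  # a set keeps the membership test below fast

with open('MyOutFile', 'w') as outfile:
    for line in source:
        words = line.split()
        # skip blank lines, then compare the first word against the keys
        if words and words[0] in keys:
            outfile.write(line + "\n")
# no explicit outfile.close() is needed: the with block closes the file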
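
If the source file is large, it may be preferable to read it line by line instead of loading it all with read(). A minimal sketch of that variant, wrapped in a function; the name extract_lines and its parameters are my own choice, not something from the question:

```python
def extract_lines(source_path, key_path, out_path):
    """Write every line of source_path whose first word is a key to out_path.

    Hypothetical helper for illustration. Returns the number of lines written.
    """
    # Keys are whitespace-separated words, so one key per line works too.
    with open(key_path, "r") as keyfile:
        keys = set(keyfile.read().split())

    written = 0
    with open(source_path, "r") as sourcefile, open(out_path, "w") as outfile:
        # Iterating over the file object reads one line at a time.
        for line in sourcefile:
            words = line.split()
            if words and words[0] in keys:
                outfile.write(line)  # line still carries its own newline
                written += 1
    return written
```

It could be called as extract_lines('MySourceFile', 'MyKeyFile', 'MyOutFile'). Both with blocks close their files on exit, so no explicit close() is required.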