Finding multiple keywords in a line using Python

Time: 2016-03-13 07:07:14

Tags: python-3.x

I have a line like this:

  

20:28:26.684597 24:d5:6e:76:9s:10(oui Unknown)> 45:83:R4:7U:787-9:I2   (oui Unknown),ethertype 802.1Q(0x8100),length 78:vlan 64,p 0,   ethertype IPv4,(tos 0x48,ttl 34,id 5643,offset 0,flags [none],   proto TCP(6),length 60)192.168.45.28.56982>   172.68.54.28.webcache:Flags [S],cksum 0xg654(correct),seq 576485934,win 65535,options [mss 1460,sackOK,TS val 2544789 ecr   0,wscale 0,eol],length 0

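From this line I need to extract the ID value from "id 5643" and a second value (56982) from 192.168.45.28.56982. The keyword "id" is constant, and so is 192.168.45.28.

I wrote the script below. Please suggest a way to shorten the code, because my script takes several steps: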

file = open('test.txt')
fi = file.readlines()

for line in fi:
    test = (line.split(","))
    for word2 in test:
        if "id" in word2:
            find2 = word2.split(" ")[-1]
            print("************", find2)
    for word in test:
        if "192.168.45.28" in word:
            find = word.split(".")
            print(find)
            for word1 in find:
                if ">" in word1:
                    find1 = word1.split(">")[0]
                    print(find1)

3 answers:

Answer 0: (score: 2)

You can use regular expressions.

See the Python regular expression docs.
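
A minimal sketch of the regular expression approach, not necessarily what this answer originally showed. The helper name `extract_fields` and the shortened sample line are assumptions made for illustration; only the two patterns come from the question's requirements.

```python
import re

# Patterns for the two constant markers named in the question.
RE_ID = re.compile(r"\bid (\d+)")
RE_PORT = re.compile(r"192\.168\.45\.28\.(\d+)")


def extract_fields(line):
    """Return (id, port) as strings; each is None if not found.

    Hypothetical helper, not part of any library.
    """
    id_match = RE_ID.search(line)
    port_match = RE_PORT.search(line)
    return (
        id_match.group(1) if id_match else None,
        port_match.group(1) if port_match else None,
    )


# Shortened version of the line from the question.
sample = ("ethertype IPv4,(tos 0x48,ttl 34,id 5643,offset 0,flags [none],"
          "proto TCP(6),length 60)192.168.45.28.56982> 172.68.54.28.webcache")

print(extract_fields(sample))
```

`re.search` scans the whole line for the first occurrence, so there is no need to split on commas or dots first.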

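Answer 1: (score: 2)

Same approach as the other answer, with a few differences: it does not add empty lists to the results, it compiles the regular expressions for efficiency, it does not read the whole file into memory at once, and it does not use id as a variable name (id is a built-in function, so it is best avoided). The output may contain duplicates (I can't assume you only want unique entries).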

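import re

# Compile once; raw strings keep the backslashes intact for the regex engine.
re_id = re.compile(r"id (\d+)")
re_ip = re.compile(r"192\.168\.45\.28\.(\d+)")

ids = []
ips = []

# Iterate over the file lazily instead of loading it all with readlines().
with open("test.txt", "r") as f:
    for line in f:
        id_res = re_id.findall(line)
        if id_res:
            ids.append(id_res[0])  # first "id <n>" value on the line
        ip_res = re_ip.findall(line)
        if ip_res:
            ips.append(ip_res[0])  # port that follows 192.168.45.28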
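
If only unique entries are wanted, the duplicates mentioned above could be dropped afterwards. This is a small sketch under that assumption; `unique_in_order` is a hypothetical helper name, and the sample values are made up for the demonstration.

```python
def unique_in_order(values):
    """Remove duplicates while keeping the first occurrence of each value."""
    # dict preserves insertion order (Python 3.7+), so the keys stay ordered.
    return list(dict.fromkeys(values))


# Made-up sample data standing in for the collected ids.
ids = ["5643", "5643", "12"]
print(unique_in_order(ids))
```

Using `set(ids)` would also remove duplicates, but it would not keep the order in which the values appeared in the file.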

Answer 2: (score: 0)

You can use regular expressions. More information here: https://docs.python.org/2/library/re.html

You could write it like this:

import re
file = open('test.txt')
fi = file.readlines()

for line in fi:
    # Raw strings stop Python from interpreting \d and \. as string escapes.
    match = re.match(r'.*id (\d+).*', line)
    if match:
        print("************ %s" % match.group(1))
    match = re.match(r'.*192\.168\.45\.28\.(\d+).*', line)
    if match:
        print(match.group(1))

Update

As jDo pointed out, it is better to use findall, compile the regular expressions, and avoid readlines, so you end up with something like this:

import re

re_id = re.compile(r"id (\d+)")
re_ip = re.compile(r"192\.168\.45\.28\.(\d+)")
with open("test.txt", "r") as f:
    for line in f:
        # findall returns a list of captured strings, not a match object,
        # so index into the list instead of calling .group().
        matches = re_id.findall(line)
        if matches:
            print("************ %s" % matches[0])
        matches = re_ip.findall(line)
        if matches:
            print(matches[0])
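
Since the question asks for shorter code, both values could also be captured with one pattern. This is only a sketch: it assumes "id <n>" always appears before the 192.168.45.28 address on a line, as it does in the sample, and `parse_lines` is a hypothetical helper name.

```python
import re

# Assumption: the id field precedes the IP address on every matching line.
RE_BOTH = re.compile(r"\bid (?P<id>\d+).*?192\.168\.45\.28\.(?P<port>\d+)")


def parse_lines(lines):
    """Yield (id, port) string pairs for lines that contain both fields."""
    for line in lines:
        m = RE_BOTH.search(line)
        if m:
            yield m.group("id"), m.group("port")


# Works on any iterable of lines, including an open file object.
sample_lines = [
    "ttl 34,id 5643,offset 0,proto TCP(6),length 60)192.168.45.28.56982> 172.68.54.28.webcache\n",
    "a line without either field\n",
]
for packet_id, port in parse_lines(sample_lines):
    print(packet_id, port)
```

Lines that contain only one of the two fields are skipped by this version, which differs from the answers above, where each field is reported independently.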