How to extract the IP and user ID from an unstructured string

Time: 2013-06-11 22:50:12

Tags: python

I have a string:

 Jun 11 02:47:04 webwork-tlv tcp: 2013-06-11 02:47:04 - ive - [84.11.11.11] hacker(Secure ID)[Manage System] - Host Checker policy 'Machine center' passed on host 84.11.11.11  for user 'hacker'

Some strings look like this:

 Jun 11 00:13:26 webwork-tlv tcp: 2013-06-11 00:13:25 - ive - [10.11.12.19] hacker(Secure ID)[Manage System] - Sensor tlv-entid-001 - timestamp=[Tue Jun 11 02:23:42 2013 ] severity=[4] policyStr=[IDP 20110132] category=[attack] protocol=[tcp] attackStr=[HTTP:XSS:HTML-SCRIPT-IN-URL-VA] rulebaseStr=[IDS] rulebaseType=[Main Rule Base] srcAddr=[10.11.12.19] srcPort=[3333] dstAddr=[66.11.12.13] dstPort=[80] action=[drop] policyVersion=[41] ruleNumber=[3]

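I want to extract the date at the beginning and the IP between the [ ], unless it is an internal IP (starting with 10 or 192), in which case it does not need to be extracted. I also want the ID hacker that precedes (Secure ID).

So the result should be ip: 84.11.11.11, id: hacker

Thanks in advance.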

2 answers:

Answer 0: (score: 2)

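>>> import re
>>> # Raw string, so the backslashes reach the regex engine unchanged
>>> regex = re.compile(r"(\[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\]) ([a-zA-Z0-9]+)")
>>> r = regex.search(string)  # string holds the log line from the question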

# List the groups found
>>> r.groups()
(u'[84.11.11.11]', u'hacker')
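Note that the first group above still includes the square brackets. A possible variation, sketched here for Python 3, moves the brackets outside the group and uses named groups so the result already has the ip/id shape the question asks for. The names `line`, `PATTERN`, and `result` are illustrative, and the trailing `\(` assumes the user ID is always followed directly by "(Secure ID)" as in the sample lines.

```python
import re

# Sample line copied from the question.
line = ("Jun 11 02:47:04 webwork-tlv tcp: 2013-06-11 02:47:04 - ive - "
        "[84.11.11.11] hacker(Secure ID)[Manage System] - Host Checker policy "
        "'Machine center' passed on host 84.11.11.11  for user 'hacker'")

# The brackets sit outside the named groups, so the captured IP has no [ ].
# Assumption: the user id is a run of word characters followed by "(".
PATTERN = re.compile(r"\[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\]\s+(?P<id>\w+)\(")

match = PATTERN.search(line)
result = match.groupdict() if match else None
print(result)
```

`groupdict()` returns the named groups as a dictionary, so no further reshaping is needed when the line matches.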

Answer 1: (score: 0)

A bit tedious, but:

s = "Jun 11 02:47:04 webwork-tlv tcp: 2013-06-11 02:47:04 - ive - [84.11.11.11] hacker(Secure ID)[Manage System] - Host Checker policy 'Machine center' passed on host 84.11.11.11  for user 'hacker'."
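# Text after the first '[' split at ']': parts[0] is the IP, parts[1] starts with " hacker(Secure ID)"
parts = s.split('[')[1].split(']')
# The id precedes "(Secure ID)"; strip() drops the leading space left by the split
{'ip': parts[0], 'id': parts[1].split('(Secure ID)')[0].strip()}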
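Neither answer covers the date or the internal-IP rule from the question. A rough sketch of how all three requirements might fit together in Python 3 follows. It rests on several assumptions: the date wanted is the ISO timestamp (the leading "Jun 11 ..." stamp has no year), every line follows the "timestamp - ive - [ip] user(" layout of the two samples, and "internal" means the standard private ranges reported by `ipaddress`' `is_private`, which is narrower than "anything starting with 192". `LINE_RE` and `parse_line` are hypothetical names.

```python
import ipaddress
import re
from datetime import datetime

# Assumed layout: "<ISO timestamp> - <source> - [<ip>] <user>(Secure ID)..."
LINE_RE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - \S+ - "
    r"\[(?P<ip>[\d.]+)\]\s+(?P<id>\w+)\("
)


def parse_line(line):
    """Return a dict with date, ip and id, or None for internal or unparsable lines."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    try:
        addr = ipaddress.ip_address(m.group("ip"))
    except ValueError:
        return None  # bracketed text was not a valid IP address
    if addr.is_private:
        return None  # internal address such as 10.x.x.x or 192.168.x.x: skip
    return {
        "date": datetime.strptime(m.group("date"), "%Y-%m-%d %H:%M:%S"),
        "ip": str(addr),
        "id": m.group("id"),
    }


# Shortened versions of the two sample lines from the question.
external = parse_line(
    "Jun 11 02:47:04 webwork-tlv tcp: 2013-06-11 02:47:04 - ive - "
    "[84.11.11.11] hacker(Secure ID)[Manage System] - Host Checker policy"
)
internal = parse_line(
    "Jun 11 00:13:26 webwork-tlv tcp: 2013-06-11 00:13:25 - ive - "
    "[10.11.12.19] hacker(Secure ID)[Manage System] - Sensor tlv-entid-001"
)
print(external)
print(internal)
```

If the literal rule "starts with 10 or 192" is what is needed, the `is_private` check could be swapped for `m.group("ip").startswith(("10.", "192."))`.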