Python - Using re in an if statement to search for a pattern in values imported from a CSV

Time: 2017-10-30 23:17:44

Tags: python regex csv

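First, I apologize for being a newbie at this. I have the following code, which opens a CSV file and reads it. I only want to return the rows that contain a public IP address in the field named "Source IP / Details", in addition to the filter on the row['Status'] field, which I know is working. I have what I believe is the correct regex, but I'm not sure I'm performing the search correctly. I'm also not sure whether I'm setting the variables correctly in the for statement that follows.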

ipRegex = '\b(?!(10)|192\.168|172\.(2[0-9]|1[6-9]|3[0-2]))[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'

with open('whois.csv') as csvDataFile:
    csvReader = csv.DictReader(csvDataFile)
    rows = [row for row in csvReader if row['Status'] != "Closed" and row['Status'] != "Resolved"] and row['Source IP / Details'] == re.search(ipRegex, row['Source IP / Details'])]

for row in rows:
    case = row['Case Number']
    ipaddr = row['Source IP / Details']

Here is a sample of my data:

Case Number,Status,Date/Time Opened,_BATCH_ID_,_BATCH_LAST_RUN_,Alert Source,Alert Subtype,Source IP / Details,
2926,Closed,2015-10-29T11:54:00,2130,2017-10-30T22:48:02,Sophos,[MEDIUM] Alert for Sophos Cloud: A computer does not comply with its Cloud po...,,
7733,Closed,2015-11-18T13:46:00,2130,2017-10-30T22:48:02,Dell SecureWorks,Malicious Network Activity,216.30.178.102,
7818,Closed,2015-11-18T20:58:00,2130,2017-10-30T22:48:02,Dell SecureWorks,Application-Specific Exploits GNU Bash Environment Variable Code Injection attempt(s),,
7850,Closed,2015-11-18T21:47:00,2130,2017-10-30T22:48:02,Dell SecureWorks,Vulnerability Scanning,173.166.95.81,

1 answer:

Answer 0: (score: 0)

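re.search returns a match object or None, never the matched string, so when looking for a match you need to check that the result is not None rather than compare it to the field with ==. Also, as noted in the comments, there is an error in your expression (the list comprehension has a stray closing bracket after "Resolved"). I changed the Status of the last data row to Open so that there is a result: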

from io import StringIO
import csv, re

data = """Case Number,Status,Date/Time Opened,_BATCH_ID_,_BATCH_LAST_RUN_,Alert Source,Alert Subtype,Source IP / Details,
2926,Closed,2015-10-29T11:54:00,2130,2017-10-30T22:48:02,Sophos,[MEDIUM] Alert for Sophos Cloud: A computer does not comply with its Cloud po...,,
7733,Closed,2015-11-18T13:46:00,2130,2017-10-30T22:48:02,Dell SecureWorks,Malicious Network Activity,216.30.178.102,
7818,Closed,2015-11-18T20:58:00,2130,2017-10-30T22:48:02,Dell SecureWorks,Application-Specific Exploits GNU Bash Environment Variable Code Injection attempt(s),,
7850,Open,2015-11-18T21:47:00,2130,2017-10-30T22:48:02,Dell SecureWorks,Vulnerability Scanning,173.166.95.81,"""

rx = re.compile(r'^([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(?<!172\.(16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31))(?<!127)(?<!^10)(?<!^0)\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(?<!192\.168)(?<!172\.(16|17|18|19|20|21|22|23|24|25|26|27|28|29|30|31))\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(?<!\.255$)$')

with StringIO(data) as csvDataFile:
    csvReader = csv.DictReader(csvDataFile)

    rows = [row for row in csvReader 
            if row['Status'] != 'Closed' and row['Status'] != 'Resolved' 
            and rx.search(row['Source IP / Details']) is not None]
    print(rows)

This produces

[OrderedDict([('Case Number', '7850'), ('Status', 'Open'), ('Date/Time Opened', '2015-11-18T21:47:00'), ('_BATCH_ID_', '2130'), ('_BATCH_LAST_RUN_', '2017-10-30T22:48:02'), ('Alert Source', 'Dell SecureWorks'), ('Alert Subtype', 'Vulnerability Scanning'), ('Source IP / Details', '173.166.95.81'), ('', '')])]
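
If the long regex is hard to maintain, one possible alternative is to let the standard library's ipaddress module decide whether an address is public. This is only a sketch of a different technique, not the regex approach above: the helper name is_public_ip, the trimmed-down column set, and the extra rows 7900 and 7901 are made up for illustration, and is_global follows the IANA special-purpose registries, so its result may differ from the regex for some reserved ranges.

```python
import csv
import ipaddress
from io import StringIO


def is_public_ip(text):
    """Return True if text parses as a globally routable IP address."""
    try:
        return ipaddress.ip_address(text.strip()).is_global
    except ValueError:
        # Empty fields and text that is not an IP address end up here.
        return False


# Hypothetical sample that keeps only the columns used by the filter.
sample = """Case Number,Status,Source IP / Details
7733,Closed,216.30.178.102
7850,Open,173.166.95.81
7900,Open,192.168.1.20
7901,Open,
"""

reader = csv.DictReader(StringIO(sample))
rows = [row for row in reader
        if row['Status'] not in ('Closed', 'Resolved')
        and is_public_ip(row['Source IP / Details'])]

for row in rows:
    print(row['Case Number'], row['Source IP / Details'])
```

With this sample only case 7850 passes both conditions: 7733 is Closed, 7900 has a private address, and 7901 has an empty field.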