I am currently using the following code to convert every ' to ":
# Converts ' to "
lines = []
replacements = {"'": '"'}
with open('netstat_data_IP_formatted.json') as infile:
    for line in infile:
        # .items() works on Python 2 and 3; iteritems() is Python 2 only
        for src, target in replacements.items():
            line = line.replace(src, target)
        lines.append(line)
# Write the converted lines back to the same file
with open('netstat_data_IP_formatted.json', 'w') as outfile:
    for line in lines:
        outfile.write(line)
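This works fine, but I would also like to select all the ports and remove the formatting (the quotes) around them. Would a regular expression like this one work without also picking up the IP addresses?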
^([0-9]{1,4}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
How would I then remove the formatting around the port numbers, or is there an easier way to do this?
So this is the input:
{'l_port': '48856', 'r_host': '95.211.210.72', 'r_port': '443', 'state': 'ESTAB$
{'l_port': '443', 'r_host': '37.218.247.217', 'r_port': '35805', 'state': 'TIME$
{'l_port': '48662', 'r_host': '95.211.210.72', 'r_port': '443', 'state': 'ESTAB$
{'l_port': '51316', 'r_host': '91.194.90.103', 'r_port': '443', 'state': 'ESTAB$
This is how it looks after the script has run:
{"l_port": "48698", "r_host": "95.211.210.72", "r_port": "443", "state": "ESTAB$
{"l_port": "40406", "r_host": "178.62.252.82", "r_port": "443", "state": "TIME_$
{"l_port": "443", "r_host": "60.191.48.203", "r_port": "58220", "state": "SYN_R$
{"l_port": "36058", "r_host": "37.252.185.87", "r_port": "443", 'state': 'TIME_$
This is how I want it to look:
{"l_port": 48698, "r_host": "95.211.210.72", "r_port": 443, "state": "ESTAB$
{"l_port": 40406, "r_host": "178.62.252.82", "r_port": 443, "state": "TIME_$
{"l_port": 443, "r_host": "60.191.48.203", "r_port": 58220, "state": "SYN_R$
{"l_port": 36058, "r_host": "37.252.185.87", "r_port": 443, 'state': 'TIME_$
Answer 0 (score: 0)
You only need to match quoted runs of digits that contain no periods:
re.sub("'([0-9]+)'",'\\1',line)
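(Of course, if you replace the quotes first, swap the quote characters in the pattern so that it matches " instead of '.)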
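As a rough sketch of the substitution above, the following handles either quote style in one pass, so the order of the two steps stops mattering. The names QUOTED_INT and unquote_ints are made up for illustration, and the sample line is shortened from the input in the question. Note that the pattern unquotes any value made up solely of digits, not only ports; with the fields shown, that happens to be just l_port and r_port.

```python
import re

# A quote, a run of digits, then the same quote again.
# '95.211.210.72' contains periods, so it never matches as a whole token.
QUOTED_INT = re.compile(r"""(['"])([0-9]+)\1""")  # hypothetical name


def unquote_ints(line):
    """Strip the quotes around values that consist only of digits."""
    return QUOTED_INT.sub(r"\2", line)


sample = "{'l_port': '48856', 'r_host': '95.211.210.72', 'r_port': '443'}"
print(unquote_ints(sample))
```

Because re.sub returns a new string, the result has to be assigned back (or appended to the output list) rather than discarded.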
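That said, if you need to do anything more complex, you should use the standard json package to actually parse the file and then format the data however you like.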
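A minimal sketch of the parsing route, under a few assumptions: each line of the file is one complete dict literal (the samples in the question are cut off at the right edge, so this is a guess), and the port fields are named l_port and r_port as in the samples. Since the single-quoted input is not valid JSON yet, this sketch reads each line with ast.literal_eval and only uses json for writing; convert_line and PORT_KEYS are hypothetical names.

```python
import ast
import json

# Field names taken from the sample data in the question.
PORT_KEYS = ("l_port", "r_port")


def convert_line(line):
    """Parse one dict literal, turn the port fields into ints, emit JSON."""
    record = ast.literal_eval(line.strip())  # accepts single-quoted literals
    for key in PORT_KEYS:
        if key in record:
            record[key] = int(record[key])
    return json.dumps(record)  # json always writes double quotes


sample = "{'l_port': '48856', 'r_host': '95.211.210.72', 'r_port': '443'}"
print(convert_line(sample))
```

This does the quote conversion and the integer conversion in one step, and it fails loudly on a malformed line instead of silently writing bad output.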