My goal is to convert a string into a dictionary. This is what the string looks like:
[exploit] => 1
[hits] => 1
[completed] => 1
[is_malware] => 1
[summary] => 26.0@13965: suspicious.warning: object contains JavaScript
76.0@14467: suspicious.obfuscation using eval
76.0@14467: suspicious.obfuscation using String.fromCharCode
[severity] => 4
[engine] => 60
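I tried a couple of ways to do this. My first attempt was to split on \n, but the problem is that the content of [summary] spans several lines and gets split as well, so that did not work. My second attempt was to split on =>, but once I split on => there is no way to tell that the next key starts after a \n. Basically, it should end up looking something like this:
{exploit: 1, hits: 1, completed: 1, ...} and so on.
Any help is greatly appreciated.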
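For illustration, the first attempt probably behaved like the sketch below. The variable names and the shortened sample string are assumptions, not the asker's actual code. Splitting on \n leaves the continuation lines of [summary] without a key of their own:

```python
# Shortened, made-up sample in the same format as the report above.
s = "[hits] => 1\n[summary] => line one\nline two\n[engine] => 60"

parsed = {}
orphans = []
for line in s.split("\n"):
    if " => " in line:
        # A line that starts a new field: "[key] => value"
        key, value = line.split(" => ", 1)
        parsed[key.strip("[]")] = value
    else:
        # Continuation lines of a multi-line value have no key of their own.
        orphans.append(line)

print(parsed)   # summary holds only its first line
print(orphans)  # the rest of the summary ends up here
```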
Answer 0 (score: 7)
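You can use re.findall to parse the text (the pattern is written as a raw string so that the backslashes reach the regex engine unchanged):
>>> import re
>>> re.findall(r'\[([^]]+)\] => (.*?)(?=\n\[|$)', s, re.S)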
[('exploit', '1'), ('hits', '1'), ('completed', '1'), ('is_malware', '1'), ('summary', '26.0@13965: suspicious.warning: object contains JavaScript\n76.0@14467: suspicious.obfuscation using eval\n76.0@14467: suspicious.obfuscation using String.fromCharCode\n'), ('severity', '4'), ('engine', '60')]
You can put these values into a dictionary by calling dict on the result:
>>> dict(re.findall(r'\[([^]]+)\] => (.*?)(?=\n\[|$)', s, re.S))
{'engine': '60', 'hits': '1', 'severity': '4', 'is_malware': '1', 'summary': '26.0@13965: suspicious.warning: object contains JavaScript\n76.0@14467: suspicious.obfuscation using eval\n76.0@14467: suspicious.obfuscation using String.fromCharCode\n', 'exploit': '1', 'completed': '1'}
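To make the pattern easier to follow, here is a minimal sketch of the same expression written out with re.VERBOSE and wrapped in a hypothetical helper, parse_report. The helper name, FIELD_RE, and the sample string are assumptions for illustration. Stripping whitespace from each value is an extra step that the answer above does not perform.

```python
import re

# Same pattern as above, spelled out with comments via re.VERBOSE.
# In verbose mode a literal space must be escaped or put in a character class.
FIELD_RE = re.compile(r"""
    \[([^]]+)\]     # key: the text between square brackets
    [ ]=>[ ]        # the literal separator, with one space on each side
    (.*?)           # value: as little as possible, newlines included (re.S)
    (?=\n\[|$)      # stop before a newline that starts a new key, or at the end
""", re.S | re.VERBOSE)


def parse_report(text):
    """Hypothetical helper: return a dict mapping each key to its stripped value."""
    return {key: value.strip() for key, value in FIELD_RE.findall(text)}


# Made-up sample in the same format as the question's data.
sample = (
    "[hits] => 1\n"
    "[summary] => first line\n"
    "second line\n"
    "[engine] => 60\n"
)
report = parse_report(sample)
print(report)
```

The lazy `(.*?)` together with the lookahead is what keeps a multi-line value in one piece: the match only ends where the next line begins with `[`.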
Answer 1 (score: 0)
total_string = """\
[exploit] => 1
[hits] => 1
[completed] => 1
[is_malware] => 1
[summary] => 26.0@13965: suspicious.warning: object contains JavaScript
76.0@14467: suspicious.obfuscation using eval
76.0@14467: suspicious.obfuscation using String.fromCharCode
[severity] => 4
[engine] => 60
"""
import re

# Raw string so the backslashes reach the regex engine unchanged.
pattern_RE = r'\[([^]]+)\] => (.*?)(?=\n\[|$)'
report_dict = dict(re.findall(pattern_RE, total_string, re.S))
for k, v in report_dict.items():
    print('[{}]: {}'.format(k, v))
print(report_dict)
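This uses the text exactly as you showed it to us, but your real data may contain hidden newlines and carriage returns. On this input the regex seems to work fine: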
{ 'engine': '60',
'hits': '1',
'severity': '4',
'is_malware': '1',
'summary': '(all three captured)',
'exploit': '1',
'completed': '1'
}
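So if the regex is not catching this on your side, the repr() of your total_string must differ slightly from what you pasted (perhaps a trailing newline, or something else).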
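One way to check for hidden characters is to print repr() of the string and normalize the line endings before matching. The sketch below assumes the data arrived with Windows line endings; the sample string is made up for illustration.

```python
import re

pattern_RE = r'\[([^]]+)\] => (.*?)(?=\n\[|$)'

# Made-up sample with Windows line endings (\r\n), an assumption for illustration.
raw = "[hits] => 1\r\n[summary] => line one\r\nline two\r\n[engine] => 60\r\n"

# repr() makes the otherwise invisible \r and \n characters visible.
print(repr(raw))

# With \r\n, each value keeps a stray trailing \r,
# because the lookahead only stops before \n.
before = dict(re.findall(pattern_RE, raw, re.S))

# Normalize the line endings, then parse again.
cleaned = raw.replace('\r\n', '\n').replace('\r', '\n')
after = dict(re.findall(pattern_RE, cleaned, re.S))

print(before)
print(after)
```

If the values in your own output carry a trailing \r like the ones in `before`, normalizing the line endings first is probably the simplest fix.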