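I'm new to Python and regex, and I've been trying to mask the IP addresses in a log stored in a txt file. I'd like to avoid for loops and if checks if possible, because the txt file is large (158 MB).
(All of the IP addresses start with 172.)
Here is the code I tried: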
import re
txt = "test"
x = re.sub(r"^172\.*", "XXX.\", txt)
print(x)
Sample txt file:
ABCDEFGHIJKLMNOPRST172.12.65.10RSTUVYZ
ASDG172.56.23.14FSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!ÂSDBSDF172.23.23.23SADASFSA
ASGFGD 172.12.23.56 ASDSAFASFDASSADSA
Desired output:
ABCDEFGHIJKLMNOPRSTXXX.XXX.XXX.XXXRSTUVYZ
ASDGXXX.XX.XX.XXFSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!ÂSDBSDFXXX.XXX.XXX.XXXSADASFSA
ASGFGD XXX.XXX.XXX.XXX ASDSAFASFDASSADSA
Answer 0 (score: 0)
You should indeed use re.sub.
re.sub(r"(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})", "XXX.XXX.XXX.XXX", tested_addr)
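An explanation of the regex follows (you don't really need the groups, but they are a good way to understand the parts of the expression). Note that the explanation is for the anchored form; with ^ and $ the pattern only matches when the address is the entire line, which is why the re.sub call above omits them: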
^(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})$
^ asserts position at start of a line
1st Capturing Group (172)
    172 matches the characters 172 literally (case sensitive)
2nd Capturing Group (\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})
    \. matches the character . literally (case sensitive)
    Non-capturing group (?:[0-9]{1,3}\.){2}
        {2} Quantifier: matches exactly 2 times
        Match a single character present in the list below [0-9]{1,3}
            {1,3} Quantifier: matches between 1 and 3 times, as many times as possible, giving back as needed (greedy)
            0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
        \. matches the character . literally (case sensitive)
    Match a single character present in the list below [0-9]{1,3}
        {1,3} Quantifier: matches between 1 and 3 times, as many times as possible, giving back as needed (greedy)
        0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
$ asserts position at the end of a line
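To make the effect of the anchors concrete, here is a small sketch that runs both forms of the pattern against one of the sample lines from the question. The variable names (`anchored`, `unanchored`, `line`) are only illustrative and do not come from the answer.

```python
import re

# Anchored form from the explanation: matches only when the whole string is the address.
anchored = re.compile(r"^(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})$")
# Unanchored form, as used in the re.sub call.
unanchored = re.compile(r"(172)(\.(?:[0-9]{1,3}\.){2}[0-9]{1,3})")

# One of the sample lines: the address is embedded in other text.
line = "ASDG172.56.23.14FSDGHSFSDFDSFHSF"

anchored_hit = anchored.search(line)      # None: the line does not start with 172
unanchored_hit = unanchored.search(line)  # finds the embedded address

print(anchored_hit)
print(unanchored_hit.group(0), unanchored_hit.group(1), unanchored_hit.group(2))

# Replace the whole match with a fixed mask.
masked = unanchored.sub("XXX.XXX.XXX.XXX", line)
print(masked)
```

Since the replacement string never refers to the groups, they could be dropped or made non-capturing without changing the result.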
Answer 1 (score: 0)
Use: 172(?:\.\d{1,3}){3}
Code:
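import re

# Sample lines from the question, kept as a raw multi-line string
string = r'''ABCDEFGHIJKLMNOPRST172.12.65.10RSTUVYZ
ASDG172.56.23.14FSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!SDBSDF172.23.23.23SADASFSA
ASGFGD 172.12.23.56 ASDSAFASFDASSADSA'''
# Replace every address that starts with 172 with a fixed mask
print(re.sub(r'172(?:\.\d{1,3}){3}', "XXX.XXX.XXX.XXX", string))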
Output:
ABCDEFGHIJKLMNOPRSTXXX.XXX.XXX.XXXRSTUVYZ
ASDGXXX.XXX.XXX.XXXFSDGHSFSDFDSFHSF
!'^%%&!'+!'+^%&!SDBSDFXXX.XXX.XXX.XXXSADASFSA
ASGFGD XXX.XXX.XXX.XXX ASDSAFASFDASSADSA
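For the 158 MB file mentioned in the question, the same pattern could be applied while streaming the file, so the whole text never has to sit in memory at once. The following is only a sketch under a few assumptions: the function names (`mask_fixed`, `mask_same_width`, `mask_file`), the file paths, and the UTF-8 encoding are hypothetical and not part of the original answers. `mask_same_width` is an optional variant that keeps the number of digits per octet, which is what the second line of the desired output (`XXX.XX.XX.XX`) appears to ask for.

```python
import re

# Compile once and reuse; same pattern as in the answer above.
IP_PATTERN = re.compile(r"172(?:\.\d{1,3}){3}")


def mask_fixed(text):
    """Replace each matched address with the fixed mask XXX.XXX.XXX.XXX."""
    return IP_PATTERN.sub("XXX.XXX.XXX.XXX", text)


def mask_same_width(text):
    """Replace each digit of a matched address with X, keeping the dots."""
    return IP_PATTERN.sub(lambda m: re.sub(r"\d", "X", m.group(0)), text)


def mask_file(src_path, dst_path, encoding="utf-8"):
    """Stream src_path line by line and write the masked text to dst_path.

    The encoding is an assumption; adjust it to match the actual log file.
    """
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            # One regex substitution per line; no per-character checks.
            dst.write(mask_fixed(line))


print(mask_fixed("ASDG172.56.23.14FSDGHSFSDFDSFHSF"))
print(mask_same_width("ASDG172.56.23.14FSDGHSFSDFDSFHSF"))
```

The only loop here is over lines of the file; the matching itself is left to the regex engine. If memory is not a concern, reading the whole file with `src.read()` and calling `mask_fixed` once on the result should work as well.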