我有一个潜在恶意软件行为的示例,我想透露所有网络指标,例如它所连接的网站名称和IP地址。
通过使用字符串输出我得到
$ strings 6787c54e6a2c5cffd1576dcdc8c4f42c954802b7
%PDF-1.5
1 0 obj
<</Type/Page/Parent 80 0 R/Contents 36 0 R/MediaBox[0 0 612 792]/Annots[2 0 R 4 0 R 6 0 R 8 0 R 10 0 R 12 0 R 14 0 R 16 0 R 18 0 R]/Group 20 0 R/StructParents 1/Tabs/S/Resources<</Font<</F1 21 0 R/F2 23 0 R/F3 26 0 R/F4 29 0 R/F5 31 0 R>>/XObject<</Image6 33 0 R/Image9 34 0 R>>>>>>
endobj
2 0 obj
<</Type/Annot/Subtype/Link/Rect[139.10001 398.20001 449.84 726.20001]/Border[0 0 0]/F 4/NM(PDFE-48D407B4789BA8880)/P 1 0 R/StructParent 0/A 3 0 R>>
endobj
3 0 obj
<</S/URI/URI(http://www.pdfupdatersacrobat.top/website/hts-cache/index.php?userid=info@narainsfashionfabrics.com)>>
endobj
4 0 obj
<</Type/Annot/Subtype/Link/Rect[232.39999 618.03003 370.14999 629.53003]/Border[0 0 0]/F 4/NM(PDFE-48D407B4789BA8881)/P 1 0 R/StructParent 2/A 5 0 R>>
endobj
5 0 obj
<</S/URI/URI(>>
endobj
6 0 obj
<</Type/Annot/Subtype/Link/Rect[278.87 583.20001 324.88 594.13]/Border[0 0 0]/F 4/NM(PDFE-48D407B4789BA8882)/P 1 0 R/StructParent 3/A 7 0 R>>
endobj
7 0 obj
<</S/URI/URI()>>
endobj
8 0 obj
<</Type/Annot/Subtype/Link/Rect[185.75999 377.28 398.16 733.67999]/Border[0 0 0]/C[0 0 0]/F 4/NM(PDFE-48D4183FB09C5EC13)/P 1 0 R/A 9 0 R/H/N>>
endobj
9 0 obj
<</S/URI/URI(http://sajiye.net/file/website/file/main/index.php?userid=alwaha_alghannaa@hotmail.com)>>
endobj
10 0 obj
<</Type/Annot/Subtype/Link/Rect[185.75999 373.67999 398.88 734.40002]/Border[0 0 0]/C[0 0 0]/F 4/NM(PDFE-48D4183FB09C5EC14)/P 1 0 R/A 11 0 R/H/N>>
endobj
11 0 obj
<</S/URI/URI(http://sajiye.net/file/website/file/main/index.php?userid=kitja@siamdee2558.com)>>
endobj
12 0 obj
<</Type/Annot/Subtype/Link/Rect[132.48 0 474.48001 772.56]/Border[0 0 0]/C[0 0 0]/F 4/NM(PDFE-48D460B5879C4D8C5)/P 1 0 R/A 13 0 R/H/N>>
endobj
13 0 obj
<</S/URI/URI(http://nurking.pl/wp-admin/user/email.163.htm?login=)>>
endobj
14 0 obj
<</Type/Annot/Subtype/Link/Rect[0 0 612 792]/Border[0 0 0]/C[0 0 0]/F 4/NM(PDFE-48D465334C760A446)/P 1 0 R/A 15 0 R/H/N>>
endobj
15 0 obj
<</S/URI/URI(https://www.dropbox.com/s/76jr9jzg020gory/Swift%20Copy.uue?dl=1)>>
endobj
16 0 obj
<</Type/Annot/Subtype/Link/Rect[.72 0 612 789.84003]/Border[0 0 0]/C[0 0 0]/F 4/NM(PDFE-48D4C7F946F3F02B7)/P 1 0 R/A 17 0 R/H/N>>
endobj
17 0 obj
<</S/URI/URI(https://www.dropbox.com/s/28aaqjdradyy4io/Swift-Copy_pdf.uue?dl=1)>>
endobj
18 0 obj
<</Type/Annot/Subtype/Link/Rect[0 5.76 612 792]/Border[0 0 0]/C[0 0 0]/F 4/P 1 0 R/A 19 0 R/H/N>>
endobj
19 0 obj
<</S/URI/URI(https://www.dropbox.com/s/d71h5a56r16u3f0/swift_copy.jar?dl=1)>>
endobj
20 0 obj
<</S/Transparency/CS/DeviceRGB>>
endobj
21 0 obj
<</Type/Font/Subtype/TrueType/BaseFont/TimesNewRoman/FirstChar 32/LastChar 252/Encoding/WinAnsiEncoding/FontDescriptor 22 0 R/Widths[250 333 408 500 500 833 777 180 333 333 500 563 250 333 250 277 500 500 500 500 500 500 500 500 500 500 277 277 563 563 563 443 920 722 666 666 722 610 556 722 722 333 389 722 610 889 722 722 556 722 666 556 610 722 722 943 722 722 610 333 277 333 469 500 333 443 500 443 500 443 333 500 500 277 277 500 277 777 500 500 500 500 333 389 277 500 500 722 500 500 443 479 200 479 541 350 500 350 333 500 443 1000 500 500 333 1000 556 333 889 350 610 350 350 333 333 443 443 350 500 1000 333 979 389 333 722 350 443 722 250 333 500 500 500 500 200 500 333 759 275 500 563 333 759 500 399 548 299 299 333 576 453 333 333 299 310 500 750 750 750 443 722 722 722 722 722 722 889 666 610 610 610 610 333 333 333 333 722 722 722 722 722 722 722 563 722 722 722 722 722 722 556 500 443 443 443 443 443 443 666 443 443 443 443 443 277 277 277 277 500 500 500 500 500 500 500 548 500 500 500 500 500]>>
endobj
22 0 obj
<</Type/FontDescriptor/FontName/TimesNewRoman/Flags 32/FontBBox[-568 -215 2045 891]/FontFamily(Times New Roman)/FontWeight 400/Ascent 891/CapHeight 693/Descent -215/MissingWidth 777/StemV 0/ItalicAngle 0/XHeight 485>>
endobj
23 0 obj
<</Type/Font/Subtype/TrueType/BaseFont/ABCDEE+Calibri,BoldItalic/FirstChar 32/LastChar 117/Name/F2/Encoding/WinAnsiEncoding/FontDescriptor 24 0 R/Widths[226 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 630 0 459 0 0 0 0 0 0 0 0 668 532 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 528 0 412 0 491 316 0 0 246 0 0 246 804 527 527 0 0 0 0 347 527]>>
endobj
24 0 obj
<</Type/FontDescriptor/FontName/ABCDEE+Calibri,BoldItalic/FontWeight 700/Flags 32/FontBBox[-691 -250 1265 750]/Ascent 750/CapHeight 750/Descent -250/StemV 53/ItalicAngle -11/AvgWidth 536/MaxWidth 1956/XHeight 250/FontFile2 25 0 R>>
endobj
<</Type/Pages/Count 1/Kids[1 0 R]>>
endobj
81 0 obj
<</Type/Catalog/Pages 80 0 R/Lang(en-US)/MarkInfo<</Marked true>>/Metadata 83 0 R/StructTreeRoot 37 0 R>>
endobj
82 0 obj
<</Producer(RAD PDF 2.36.8.0 - http://www.radpdf.com)/Author(alesk)/Creator(RAD PDF)/RadPdfCustomData(pdfescape.com-open-AC00E8D5A4B4C84BC37A2054F4EC794B0297765728CB8415)/CreationDate(D:20160825075202+01'00')/ModDate(D:20170711012532-08'00')>>
endobj
83 0 obj
<</Type/Metadata/Subtype/XML/Length 1031>>stream
<?xpacket begin="
" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="DynaPDF 4.0.11.30, http://www.dynaforms.com">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
<pdf:Producer>RAD PDF 2.36.8.0 - http://www.radpdf.com</pdf:Producer>
<xmp:CreateDate>2016-08-25T07:52:02+01:00</xmp:CreateDate>
<xmp:CreatorTool>RAD PDF</xmp:CreatorTool>
<xmp:MetadataDate>2017-07-11T01:25:32-08:00</xmp:MetadataDate>
<xmp:ModifyDate>2017-07-11T01:25:32-08:00</xmp:ModifyDate>
<dc:creator><rdf:Seq><rdf:li xml:lang="x-default">alesk</rdf:li></rdf:Seq></dc:creator>
<xmpMM:DocumentID>uuid:a184332f-8592-38c8-908c-45914e523218</xmpMM:DocumentID>
<xmpMM:VersionID>1</xmpMM:VersionID>
<xmpMM:RenditionClass>default</xmpMM:RenditionClass>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
endstream
endobj
84 0 obj
<</Type/XRef/Size 85/Root 81 0 R/Info 82 0 R/ID[<299C21286E590F03363518EFD9FBBF99><299C21286E590F03363518EFD9FBBF99>]/W[1 3 0]/Filter/FlateDecode/Length 239>>stream
cx?{
endstream
endobj
startxref
204273
%%EOF
有没有办法消化所有这些字符串,只使用任何正则表达式或任何其他方法提取域名或IP地址等网络指标。
欢迎提出建议
预期输出
http://www.pdfupdatersacrobat.top/website/hts-cache/index.php?userid=info@narainsfashionfabrics.com
http://sajiye.net/file/website/file/main/index.php?userid=alwaha_alghannaa@hotmail.com
http://ns.adobe.com/pdf/1.3/
答案 0 :(得分:1)
是的,这是可能的。您可以找到所有URL,然后使用后引用提取它们。您可以阅读有关反向引用的更多信息here。
# Pattern describing regular expression
pattern = re.compile(r'(\(https?[:_%A-Z=?/a-z0-9.-]+\))')
# List where we store all URLs
urls = []
# For each invoice pattern you find in string, append it to list
for url in pattern.finditer(string):
urls.append(url.group(1))
注意:强>
您应该使用pattern.finditter()
,因为这样您就可以通过您调用string
的文本中的所有模式结果进行迭代。来自 re.finditer 文档:
re.finditer(pattern,string,flags = 0) 返回一个迭代器让步 RE的所有非重叠匹配上的MatchObject实例 字符串中的模式。字符串从左向右扫描,并匹配 按找到的顺序返回。空匹配包含在 结果,除非他们触及另一场比赛的开始。
答案 1 :(得分:0)
由于findstr仅提供基本的RegEx功能,我建议使用PowerShell
(如有必要,请批量包装)
相反,RegEx并没有剥离http行的尾部:
> gc .\sample.txt |sls '^.*?(https?:\/\/.*)$'|%{$_.Matches.Groups[1].Value}
http://www.pdfupdatersacrobat.top/website/hts-cache/index.php?userid=info@narainsfashionfabrics.com)>>
http://sajiye.net/file/website/file/main/index.php?userid=alwaha_alghannaa@hotmail.com)>>
http://sajiye.net/file/website/file/main/index.php?userid=kitja@siamdee2558.com)>>
http://nurking.pl/wp-admin/user/email.163.htm?login=)>>
https://www.dropbox.com/s/76jr9jzg020gory/Swift%20Copy.uue?dl=1)>>
https://www.dropbox.com/s/28aaqjdradyy4io/Swift-Copy_pdf.uue?dl=1)>>
https://www.dropbox.com/s/d71h5a56r16u3f0/swift_copy.jar?dl=1)>>
http://www.radpdf.com)/Author(alesk)/Creator(RAD PDF)/RadPdfCustomData(pdfescape.com-open-AC00E8D5A4B4C84BC37A2054F4EC794B0297765728CB8415)/CreationDate(D:20160825075202+01'00')/ModDate(D:20170711012532-08'00')>>
http://www.dynaforms.com">
http://www.w3.org/1999/02/22-rdf-syntax-ns#">
http://ns.adobe.com/pdf/1.3/"
http://purl.org/dc/elements/1.1/"
http://ns.adobe.com/xap/1.0/"
http://ns.adobe.com/xap/1.0/mm/">
http://www.radpdf.com</pdf:Producer>
同样粗略的可能IP
> gc .\sample.txt |sls '^(.*?(\d{1,3}\.){3}\d{1,3}.*)$'|%{$_.Matches.Groups[1].Value}
<</Producer(RAD PDF 2.36.8.0 - http://www.radpdf.com)/Author(alesk)/Creator(RAD PDF)/RadPdfCustomData(pdfescape.com-open-AC00E8D5A4B4C84BC37A2054F4EC794B0297765728CB8415)/CreationDate(D:20160825075202+01'00')/ModDate(D:20170711012532-08'00')>>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="DynaPDF 4.0.11.30, http://www.dynaforms.com">
<pdf:Producer>RAD PDF 2.36.8.0 - http://www.radpdf.com</pdf:Producer>
Aliases used: gc = Get-Content sls = Select-String % = ForEach-Object