PowerShell - Pull a substring out of a URI

Date: 2016-11-08 15:17:39

Tags: powershell

I'm trying to pull these

hrbkr.com
smqzc.com
znynf.com

out of a list of URIs in $temp:

anything.anything.hrbkr.com
anything.anything.smqzc.com
anything.anything.znynf.com

This regex seems to match, at least on regex101:

(<domainname>(?<ip>^[A-Fa-f\d.:]+$)|(?<nodots>^[^.]+$)|(?<fqdomain>(?:(?:[^.]+.)?(?<tld>(?:[^.\s]{2})(?:(?:.[^\.\s][^\.\s])|(?:[^.\s]+)))))$)*?'

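But it doesn't seem to give me any results. I was able to get it to match the whole line, but I only want the 'substring', not a true/false answer on whether the line matches.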

$temp = ‘c:\Users\money\Downloads\phishinglist.txt’
$regex = '(<domainname>(?<ip>^[A-Fa-f\d.:]+$)|(?<nodots>^[^.]+$)|(?<fqdomain>(?:(?:[^.]+.)?(?<tld>(?:[^.\s]{2})(?:(?:.[^\.\s][^\.\s])|(?:[^.\s]+)))))$)*?'
$temp | select-string -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } | Sort-Object -Unique > $list
$list

Thanks!

1 Answer:

Answer 0: (score: 4)

If the file contains only FQDNs and nothing else, you can solve this easily with simple -split and -join operations:

# Split FQDN into individual labels
$labels = 'anything.anything.smqzc.com' -split '\.'

# Grab second-to-last and last label, join with a dot
$domain = $labels[-2,-1] -join '.'

Or in a single statement:

$domain = ("anything.anything.smqzc.com" -split '\.')[-2,-1] -join '.'

So your script ends up looking like:

$list = Get-Content $HOME\Downloads\phishinglist.txt |ForEach-Object {
    ($_ -split '\.')[-2,-1] -join '.'
}
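
The pipeline in the question also ended with Sort-Object -Unique, so the same step can be added after the split/join. The following is only a minimal sketch: it uses an in-memory array ($names, a name chosen for illustration) instead of the file, and the fourth entry is a hypothetical duplicate added to show the de-duplication. It assumes every name has a single-label suffix such as .com; for names under a multi-label suffix such as .co.uk, keeping only the last two labels would return the suffix rather than the registered domain.

```powershell
# Sample input taken from the question; in practice this would come from Get-Content.
$names = @(
    'anything.anything.hrbkr.com'
    'anything.anything.smqzc.com'
    'anything.anything.znynf.com'
    'other.smqzc.com'   # hypothetical entry, added only to show de-duplication
)

# Keep the last two labels of each name, then sort and drop duplicates.
$domains = $names |
    ForEach-Object { ($_ -split '\.')[-2,-1] -join '.' } |
    Sort-Object -Unique

$domains
```

With the sample input above, $domains should hold hrbkr.com, smqzc.com and znynf.com, once each.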