Enumerate the emails in an inbox, then click every link in those emails

Asked: 2019-03-21 13:50:39

Tags: powershell

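I am trying to build a script that enumerates all the emails in an Outlook inbox and then clicks every link in each of them. Here is what I have so far. For testing, the $emails variable only takes the first 10 emails.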

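I think my problem is Invoke-WebRequest: of the 10 emails I tested, only 4 seem to have successfully invoked a web request. At least in the PowerShell ISE, which shows when a connection is established, only those 4 show a 200 response to the GET request.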

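That leads to my second question: even when the connection fails, does the request still reach out and attempt the connection, or is part of the problem that the regex is not storing some URLs correctly? I am testing software that is supposed to warn when someone clicks a link to a known bad domain.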

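My third question concerns my -match regex. I believe it matches only the first URL in each email and stores it in the $matches hash table. I want it to match ALL links in the email; if anyone has an improvement, please let me know.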

# Build the inbox ingestion
Add-type -assembly "Microsoft.Office.Interop.Outlook" | out-null
$olFolders = "Microsoft.Office.Interop.Outlook.olDefaultFolders" -as [type] 

# Create a new Comobject which leverages the advantages of the COM 
# interfaces for system administration
$outlook = new-object -comobject outlook.application

# Use the Microsoft Application Programming Interface
$namespace = $outlook.GetNameSpace("MAPI")
$folder = $namespace.getDefaultFolder($olFolders::olFolderInBox)
$emails = $folder.items | Select-Object Body | Select-Object -f 10 

# Build the empty Array to store url links
$URLArray = @()

# loop through all the emails within the inbox
foreach ($email in $emails) {
    # Test the email against a URL regex; on success the matched URL is stored
    # in the automatic $matches hash table, whose Values are selected below
    $LinksEmail = $email -match "\b(?:(?:https?|ftp|file)://|www\.|ftp\.)(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[A-Z0-9+&@#/%=~_|$])"
    $values = $matches | select values

    # This is our first inner loop within the loop of email enumeration, at each email within the all emails loop this loop
    # will execute and store the values from the url matches hash table into a position within the
    # $URLArray array data structure
    foreach ($value in $values) {$URLArray += $value.values}
    # write-output $value.values
}

# This is not an inner loop but a loop that runs after $URLArray has been built. It uses a
# try-catch block to attempt a web request against the URL stored at each position
# in the array
# write-output $URLArray
foreach ($item in $URLArray) {
    try {
        Invoke-WebRequest -verbose $Item
        write-output "This was successful"
    }
    catch { write-output "This Failed $item"}
}

1 Answer:

Answer 0 (score: 0)

How are you validating your regex?

This...

$LinksEmail = $email -match

...only returns true or false for the first match and then stops.

$UrlList = @'
this is the URL https://stackoverflow.com/&20%
http://stackoverflow.com
http://www.SomeSite.com this is our main site
http://www.SomeSite.com
ftp://www.somesite.com
ftp://somesite.com
ftp\SomeSite.com
If you want the file go there: file://SomeSite.com
'@
($values = $UrlList -match "\b(?:(?:https?|ftp|file)://|www\.|ftp\.)(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#/%=~_|$?!:,.]*\)|[A-Z0-9+&@#/%=~_|$])")
True

($values = $matches | select values)

Values
------
{https://stackoverflow.com/&20%}

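Even then, when I put that pattern into Sapien's PowerRegEx tool and feed it multiple URLs, it fails to capture all of them, or captures only part of one. That depends on the format, of course.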

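A URL is the base plus whatever string follows it. Surely you only want to check the base URL, yet the code you show here does no processing of that at all.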
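As a rough illustration of reducing a link to its base, here is a minimal sketch that casts each URL to [uri] and keeps only the scheme and host. The $sampleUrls list and the $bases variable are made up for this example and are not part of the original script.

```powershell
# Hypothetical sample links standing in for URLs pulled out of an email body
$sampleUrls = @(
    'https://stackoverflow.com/questions/1?x=1'
    'http://www.somesite.com/path/page.html'
)

# Cast each string to [uri] and rebuild just "scheme://host"
$bases = foreach ($u in $sampleUrls) {
    $uri = [uri]$u
    '{0}://{1}' -f $uri.Scheme, $uri.Host
}

$bases
```

The same cast fails for strings that are not absolute URIs, so anything the regex picks up in a looser form (for example a bare www. prefix) would need a scheme added first.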

Try this and see if it helps. It not only grabs the entire string of each URL, it is also easy to change.

$UrlList = @'
this is the URL https://stackoverflow.com/&20%
http://stackoverflow.com
http://www.SomeSite.com this is our main site
http://www.SomeSite.com
ftp://www.somesite.com
ftp://somesite.com
ftp\SomeSite.com
If you want the file go there: file://SomeSite.com
'@ 

[RegEx]::Matches($UrlList, '(ftp:|ftp|http:|https:|file:)(//.([^\s]+)|\\.([^\s]+))').value

https://stackoverflow.com/&20%
http://stackoverflow.com
http://www.SomeSite.com
http://www.SomeSite.com
ftp://www.somesite.com
ftp://somesite.com
ftp\SomeSite.com
file://SomeSite.com
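To tie this back to the loop in the question, here is a minimal sketch of collecting every match from several message bodies and removing duplicates before any request is made. The $bodies array is a hypothetical stand-in for the email text; in the original script the string to search would presumably be $email.Body, since Select-Object Body yields objects with a Body property rather than plain strings.

```powershell
# Same pattern as in the [RegEx]::Matches call above
$pattern = '(ftp:|ftp|http:|https:|file:)(//.([^\s]+)|\\.([^\s]+))'

# Hypothetical message bodies standing in for the Outlook email text
$bodies = @(
    'First link http://stackoverflow.com and second ftp://somesite.com here'
    'Only one link: https://stackoverflow.com/questions'
)

# Collect the value of every match in every body, not just the first one
$urlList = foreach ($body in $bodies) {
    [regex]::Matches($body, $pattern) | ForEach-Object { $_.Value }
}

# Drop duplicates so each link would be requested only once
$urlList = @($urlList | Select-Object -Unique)

$urlList
```

Each entry in $urlList could then be passed to the existing try/catch block around Invoke-WebRequest.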