Combining HtmlAgilityPack with PowerShell and Windows authentication

Date: 2017-10-25 09:18:32

Tags: powershell html-agility-pack

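I have a tool called Lansweeper that runs on a local server. I want to scrape a page from it, but it uses Windows authentication. I use PowerShell as my scripting language and mostly use HtmlAgilityPack for scraping, but I have never scraped a page that requires Windows authentication.

Does anyone know how I can pass my credentials so that the page is opened under a specific account (for example my admin account rather than my normal account)? (Yes, I could add my normal user to the allowed users in Lansweeper, but that is not the solution I want to use.)

I tried the following, but it did not work.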

[Reflection.Assembly]::LoadFile("C:\Scraping\HtmlAgilityPack\lib\Net45\HtmlAgilityPack.dll")
[HtmlAgilityPack.HtmlWeb]$web = @{}
$webclient = new-object System.Net.WebClient
$username = "user"
$password = "passw0rd-"
$domain = "mydomain"
$webclient.Credentials = new-object System.Net.NetworkCredential($username, $password, $domain)
[HtmlAgilityPack.HtmlDocument]$doc = $web.Load("http://lansweeper:81/user.aspx?username=sam&userdomain=mydomain","","",$webclient.Credentials) 
[HtmlAgilityPack.HtmlNodeCollection]$nodes = $doc.DocumentNode.SelectNodes("//body")

I have been looking through the available methods and found two possibilities:

TypeName   : HtmlAgilityPack.HtmlWeb
Name       : Load
HtmlAgilityPack.HtmlDocument Load(string url), 
HtmlAgilityPack.HtmlDocument Load(string url, string proxyHost, int proxyPort, string userId, string password), 
HtmlAgilityPack.HtmlDocument Load(string url, string method), 
HtmlAgilityPack.HtmlDocument Load(string url, string method, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials)

Name       : Get
MemberType : Method
void Get(string url, string path), 
void Get(string url, string path, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials), 
void Get(string url, string path, string method), 
void Get(string url, string path, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials, string method)

But I cannot get either of them to work. Has anyone done this with PowerShell?

1 answer:

Answer 0 (score: 2)

I found out how to do this, and I hope it helps someone in the future. It is not straightforward, but it is easy once you have seen it.

[Reflection.Assembly]::LoadFile("C:\temp\HtmlAgilityPack\lib\Net45\HtmlAgilityPack.dll") | Out-Null
[HtmlAgilityPack.HtmlWeb]$web = @{}
$url = "http://lansweeper:81/user.aspx?username=sam&userdomain=mydomain"
$webclient = new-object System.Net.WebClient

# Windows credentials of the account running this script
$defaultCredentials = [System.Net.CredentialCache]::DefaultNetworkCredentials

$proxyAddr = (get-itemproperty 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Internet Settings').ProxyServer
$proxy = new-object System.Net.WebProxy
$proxy.Address = $proxyAddr
$proxy.useDefaultCredentials = $true 
$proxy

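# Pass the WebProxy object itself (not a quoted string) to the (url, method, proxy, credentials) overload
[HtmlAgilityPack.HtmlDocument]$doc = $web.Load($url,"GET",$proxy,$defaultCredentials)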
[HtmlAgilityPack.HtmlNodeCollection]$nodes = $doc.DocumentNode.SelectNodes("//html[1]/body[1]")

$nodes

<# USED RESOURCES
https://msdn.microsoft.com/en-us/library/system.net.webclient.usedefaultcredentials(v=vs.110).aspx
https://forums.asp.net/t/2027997.aspx?HtmlAgilityPack+Stuck+trying+to+understand+HtmlWeb+Load+NetworkCredential
https://msdn.microsoft.com/en-us/library/system.net.webclient.usedefaultcredentials.aspx
https://stackoverflow.com/questions/571429/powershell-web-requests-and-proxies

TypeName   : HtmlAgilityPack.HtmlWeb
Name       : Load
HtmlAgilityPack.HtmlDocument Load(string url, string proxyHost, int proxyPort, string userId, string password), 
HtmlAgilityPack.HtmlDocument Load(string url, string method, System.Net.WebProxy proxy, System.Net.NetworkCredential credentials)
#>
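
The script above authenticates as whoever runs it. For the case in the question (opening the page as a different account, such as an admin account), here is a minimal sketch that passes explicit credentials to the same Load(url, method, proxy, credentials) overload. The function name Get-AuthenticatedHtmlDocument and its parameters are hypothetical, the DLL path is copied from the script above, and passing an address-less WebProxy is an assumption: some HtmlAgilityPack builds seem to apply the credentials only when a proxy object is supplied, so it may not be needed with yours.

```powershell
# Hypothetical helper (not part of HtmlAgilityPack): load a page with explicit Windows credentials.
function Get-AuthenticatedHtmlDocument {
    param(
        [Parameter(Mandatory)] [string] $Url,
        [Parameter(Mandatory)] [System.Management.Automation.PSCredential] $Credential,
        # Assumed DLL location, taken from the script above; adjust to your install.
        [string] $DllPath = "C:\temp\HtmlAgilityPack\lib\Net45\HtmlAgilityPack.dll"
    )

    [Reflection.Assembly]::LoadFile($DllPath) | Out-Null

    # PSCredential -> System.Net.NetworkCredential; "DOMAIN\user" is split into Domain and UserName.
    $netCred = $Credential.GetNetworkCredential()

    # A WebProxy without an Address connects directly; it is passed only so that
    # the overload receives a non-null proxy object (assumption, see the note above).
    $proxy = New-Object System.Net.WebProxy

    $web = New-Object HtmlAgilityPack.HtmlWeb
    return $web.Load($Url, "GET", $proxy, $netCred)
}
```

It could be called as $doc = Get-AuthenticatedHtmlDocument -Url $url -Credential (Get-Credential), which prompts for the account instead of hard-coding the password in the script, followed by $doc.DocumentNode.SelectNodes("//body") as before.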