Question

I am trying to use a script to query an IIS website for files and then download those files automatically. So far, I have this:

$webclient = New-Object System.Net.webclient
$source = "http://testsite:8005/"
$destination = "C:\users\administrator\desktop\testfolder\"
#The following line returns the links in the webpage
$testcode1 = $webclient.downloadstring($source) -split "<a\s+" | %{ [void]($_ -match "^href=['"]([^'">\s]*)"); $matches[1] }
foreach ($line in $test2) {
    $webclient.downloadfile($source + $line, $destination + $line)
}

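I am not very good at PowerShell yet, and I get some errors, but I did manage to fetch a couple of test files that I had thrown into my wwwroot folder (the web.config file does not seem to download, so I imagine that is one of my errors). When I change my $source value to a subfolder on my site that contains some test text files (for example, http://testsite:8005/subfolder/), I get errors and nothing downloads at all. Running my $testcode1 gives me the following links for the subfolder: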
/subfolder/test2/txt
/
/subfolder/test1.txt
/subfolder/test2.txt
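I don't know why it lists the test2 file twice. I think my problem is that the links come back in subfolder/file form, so I get errors because I am trying to download $source + $line, which is essentially http://testsite:8005/subfolder/subfolder/test1.txt. But when I tried to fix that by adding a $root value set to the root of my site and running foreach($line in $testcode1) { $webclient.downloadfile($root + $line, $destination + $line) }, I still got errors.
If some of you gurus could help show me the error of my ways, I would be grateful. I want to download all the files in every subfolder of my site. I know that will involve some recursion, but again, that is beyond my current skill level. Thanks in advance for your help!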

Answer 1

The best way to download files from a website is to use

Invoke-WebRequest -Uri $url

Once you have the HTML, you can parse the content for links.

$result = (((Invoke-WebRequest -Uri $url).Links | Where-Object {$_.href -like "http*"} ) | select href).href

Give it a try. It is simpler than $webclient = New-Object System.Net.webclient
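
One caveat worth illustrating: the `-like "http*"` filter keeps only absolute links, while the links listed in the question are root-relative (for example /subfolder/test1.txt). A minimal sketch of resolving such links against the page URL with System.Uri follows. The function name Resolve-Href is made up for illustration and is not part of the answer above.

```powershell
# Hypothetical helper (not from the answer): turn an href, absolute or
# relative, into an absolute URL based on the page it was found on.
function Resolve-Href {
    param(
        [string]$BaseUrl,   # URL of the page that contained the link
        [string]$Href       # raw href value taken from that page
    )
    # System.Uri applies the usual relative-reference rules
    $base = New-Object System.Uri($BaseUrl)
    $resolved = New-Object System.Uri($base, $Href)
    return $resolved.AbsoluteUri
}

# Links in the form reported in the question
$page = "http://testsite:8005/subfolder/"
Resolve-Href -BaseUrl $page -Href "/subfolder/test1.txt"   # http://testsite:8005/subfolder/test1.txt
Resolve-Href -BaseUrl $page -Href "test2.txt"              # http://testsite:8005/subfolder/test2.txt
```

Resolving this way avoids the doubled /subfolder/subfolder/ path that plain string concatenation produces.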

Answer 2

This adds two examples to A_N's answer.

Download this Stack Overflow question to C:/temp/question.htm.

Invoke-RestMethod -Uri stackoverflow.com/q/19572091/1108891 -OutFile C:/temp/question.htm

Download a simple text document to C:/temp/rfc2616.txt.

Invoke-RestMethod -Uri tools.ietf.org/html/rfc2616 -OutFile C:/temp/rfc2616.txt

Answer 3

I would try this:

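$webclient = New-Object System.Net.WebClient
$source = "http://testsite:8005/"
$destination = "C:\users\administrator\desktop\testfolder\"
# The following line returns the links in the webpage. The regex is single-quoted
# so the embedded quotes parse, and $matches is read only after a successful
# match (a failed -match leaves the previous value in $matches).
$testcode1 = $webclient.DownloadString($source) -split '<a\s+' | % { if ($_ -match '^href=[''"]([^''">\s]*)') { $matches[1] } }
foreach ($line in $testcode1) {
    # Use a separate variable: PowerShell names are case-insensitive, so
    # assigning to $Destination would overwrite $destination on every pass.
    $target = Join-Path $destination $line
    $folder = Split-Path $target -Parent
    # Create the target folder if it doesn't exist
    if (!(Test-Path $folder)) {
        New-Item $folder -ItemType Directory -Force | Out-Null
    }
    $webclient.DownloadFile($source + $line, $target)
}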

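I think your only problem is that you are grabbing a new file from a new directory and putting it into a folder that does not exist yet (I could be wrong).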
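
To make the missing-folder idea concrete, here is a small sketch that maps a link such as /subfolder/test1.txt to a local file path and creates the parent folder first. The helper name Get-LocalTarget and the temp-folder root are assumptions for illustration only.

```powershell
# Hypothetical helper: map an href like "/subfolder/test1.txt" to a local
# path under $Root and make sure the parent folder exists before downloading.
function Get-LocalTarget {
    param([string]$Root, [string]$Href)
    # Drop the leading "/" and switch to the platform's path separator
    $sep = [string][System.IO.Path]::DirectorySeparatorChar
    $relative = $Href.TrimStart('/').Replace('/', $sep)
    $target = Join-Path $Root $relative
    $folder = Split-Path $target -Parent
    if (-not (Test-Path $folder)) {
        # -Force also creates any missing intermediate folders
        New-Item -Path $folder -ItemType Directory -Force | Out-Null
    }
    return $target
}

# Usage with a throwaway folder under the temp directory (an assumption;
# substitute your own destination folder)
$root = Join-Path ([System.IO.Path]::GetTempPath()) "testfolder"
$target = Get-LocalTarget -Root $root -Href "/subfolder/test1.txt"
$target
```

The returned path can then be passed as the second argument to DownloadFile, which needs the folder to exist but will create the file itself.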

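If that does not fix the problem, here is some other troubleshooting you can do:

Copy each line individually into your PowerShell window and run them, up to the foreach loop. Then type out the variable that holds all the gold: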

    $testcode1

When you type it into the console, it should spit out exactly what is in there. Then you can do further troubleshooting like this:

    "Attempting to copy $Source$line to $Destination$line"

See whether it looks the way it should. You may need to tweak my code a bit.
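
As a rough illustration of that dry run, the loop below prints what would be downloaded without touching the network. The two sample links stand in for the real contents of $testcode1 and are taken from the question's output.

```powershell
$source = "http://testsite:8005/"
$destination = "C:\users\administrator\desktop\testfolder\"
# Sample values standing in for what $testcode1 actually holds
$testcode1 = "/subfolder/test1.txt", "/subfolder/test2.txt"

# Build the messages instead of downloading anything
$messages = foreach ($line in $testcode1) {
    "Attempting to copy $source$line to $destination$line"
}
$messages
# First line printed:
# Attempting to copy http://testsite:8005//subfolder/test1.txt to C:\users\administrator\desktop\testfolder\/subfolder/test1.txt
```

Printing the strings this way makes doubled separators such as // and \/ easy to spot before any download is attempted.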

-Dale Harris

Answer 4

I made a simple PowerShell script to clone an OpenBSD package repo. It will probably work for, or can be adapted to, other similar purposes and use cases.

https://github.com/forgottentq/powershell/blob/master/Package_Repo_Cloner

# Quick and dirty script to clone a package repo. Only tested against OpenBSD.
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$share = "\\172.16.10.99\wmfbshare\obsd_repo\"
$url = "https://ftp3.usa.openbsd.org/pub/OpenBSD/snapshots/packages/amd64/"
cd $share
# Pass the URL once, via -Uri
$packages = Invoke-WebRequest -Uri $url -UseBasicParsing
$dlfolder = "\\172.16.10.99\wmfbshare\obsd_repo\"
foreach ($package in $packages.links.href){
    # Skip anything that already exists in the share (the current directory)
    if ((get-item $package -ErrorAction SilentlyContinue)){
        write-host "$package already downloaded"
    } else {
        write-host "Downloading $package"
        # $url and $dlfolder already end with a separator
        Invoke-WebRequest -Uri "$url$package" -OutFile "$dlfolder$package" -UseBasicParsing
    }
}
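
The script above handles a single flat listing. For the recursion the question asks about, a hedged sketch might look like the following. It assumes directory browsing is enabled on the site, that folder links end with "/", and that links carrying a query string can be ignored. Select-ChildUrl and Save-SiteTree are hypothetical names, and none of this has been tested against the asker's server.

```powershell
# Hypothetical helper: keep only links that point below the current page,
# so the parent-directory link ("/") does not send the crawl upwards.
function Select-ChildUrl {
    param([string]$PageUrl, [string[]]$Hrefs)
    $base = New-Object System.Uri($PageUrl)
    foreach ($href in $Hrefs) {
        if ([string]::IsNullOrEmpty($href)) { continue }
        $abs = (New-Object System.Uri($base, $href)).AbsoluteUri
        $isBelow = $abs.StartsWith($base.AbsoluteUri) -and ($abs -ne $base.AbsoluteUri)
        # Assumption: links with a query string are not files or folders
        if ($isBelow -and -not $abs.Contains('?')) { $abs }
    }
}

# Hypothetical recursive download. Assumes folder links end with "/".
function Save-SiteTree {
    param([string]$PageUrl, [string]$Destination)
    if (-not (Test-Path $Destination)) {
        New-Item -Path $Destination -ItemType Directory -Force | Out-Null
    }
    $page = Invoke-WebRequest -Uri $PageUrl -UseBasicParsing
    foreach ($url in (Select-ChildUrl -PageUrl $PageUrl -Hrefs $page.Links.href)) {
        # Last path segment becomes the local file or folder name
        $name = [System.Uri]::UnescapeDataString(($url.TrimEnd('/') -split '/')[-1])
        if ($url.EndsWith('/')) {
            # Folder: recurse into it
            Save-SiteTree -PageUrl $url -Destination (Join-Path $Destination $name)
        } else {
            # File: download it next to its siblings
            Invoke-WebRequest -Uri $url -OutFile (Join-Path $Destination $name) -UseBasicParsing
        }
    }
}
```

With those assumptions, a call such as Save-SiteTree -PageUrl "http://testsite:8005/" -Destination "C:\users\administrator\desktop\testfolder" would walk the listing page by page. Files the server refuses to serve (web.config in the question, for example) would still raise an error and need a try/catch.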

Downloading website files in PowerShell
