I have a small function that seems to be causing my server to run out of memory, but I can't work out why. I read in a large CSV (3 million rows) and then, in a code block, I try to a) copy a file and b) unzip it by calling Start-Process:
$c = 0
foreach ($line in $csv) {
    $c = $c + 1
    Write-Host "Processing item $c of $total"
    $folder = $line.destination.Substring( 0, $line.destination.LastIndexOf("\") )
    if (Test-Path $folder) {
        Write-Debug "Folder Exists; "
    } Else {
        Write-Debug "Folder being created; "
        mkdir $folder
    }
    if (Test-Path $line.original) {
        Write-Debug "File to be processed; "
        Write-Debug $line.original
        Write-Debug $line.destination
        try {
            Copy-Item $line.original $line.destination
        }
        catch [System.ArgumentException] {
            Write-Warning "ERROR: Could not copy"
            Write-Warning "Check file, FROM: $($line.original)"
            Write-Warning "Check file, TO : $($line.destination)"
        }
        $arguments = "-d", "-f", "`"$($line.destination)`""
        try {
            Start-Process -FilePath $command -ArgumentList $arguments -RedirectStandardOutput stdout.txt -RedirectStandardError stderr.txt -WindowStyle Hidden
        }
        catch [System.ArgumentException] {
            Write-Warning "ERROR: Could not unzip"
            Write-Warning "Check file, FROM: $($line.original)"
            Write-Warning "Check file, TO : $($line.destination)"
        }
    } Else {
        Write-Warning "ERROR: File not found, line $c"
        Write-Warning "Check file, FROM: $($line.original)"
        Write-Warning "Check file, TO : $($line.destination)"
    }
}
At around 220,000 of the 3 million rows I get the errors below. I blamed them on RAM, but it might be something else; Google hasn't helped me with them so far, so I'd like to know whether there is a memory leak in the script (even though the powershell process doesn't grow over time).
Write-Host : The Win32 internal error "Insufficient quota to complete the requested
service" 0x5AD occurred while getting console output buffer information. Contact
Microsoft Customer Support Services.
start-process : This command cannot be run due to the error: Only part of a
ReadProcessMemory or WriteProcessMemory request was completed.
out-lineoutput : The Win32 internal error "Insufficient quota to complete the requested
service" 0x5AD occurred while getting console output buffer information. Contact Microsoft
Customer Support Services.
Answer 0 (score: 2)
When you use foreach the way you do here, the whole contents of the CSV are kept in memory, which for a 3-million-row CSV is going to be substantial. This is where the pipeline can help you.
You should take advantage of the pipeline to stream the data, which keeps memory consumption down. To get you started, consider the following:
Import-Csv -Path 'c:\temp\input.csv' | ForEach-Object {
    # code for stuff you want to do for each csv line
}
This code starts reading the csv line by line and passes each line down the pipeline to the next command. The line then hits ForEach-Object, which runs the code in the script block once for every input object it receives from the pipeline. From there you can send data further down the pipeline if you need to (for instance, to update files).
The main thing to know is that this streams the data instead of reading everything into memory in one go, the way your script does. Sometimes reading it all at once is desirable, because it is usually faster when you have RAM to spare, but in your case you should sacrifice some speed so that you don't run out of memory.
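If you want to see how much the up-front read costs, here is a minimal sketch to measure it (the path is a placeholder, and the process working set is only a crude proxy for managed memory, but it shows the scale):
$before = (Get-Process -Id $PID).WorkingSet64
$csv = Import-Csv -Path 'c:\temp\input.csv'   # placeholder path; reads every row into memory at once
$after = (Get-Process -Id $PID).WorkingSet64
"Working set grew by {0:N0} bytes" -f ($after - $before)
Remove-Variable csv                           # let the array become collectable again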
I hope you'll get other suggestions on this question as well, but in the meantime read up on the pipeline, try refactoring your script to take advantage of it, and see whether that helps.
UPDATE! I've tried to rewrite the part of the code that's available to me using the pipeline. You should be able to copy/paste it into your script (remember to take a backup copy first!):
$pathCSV = 'insert\path\to\csvfile.csv'
$command = 'something'

Import-Csv -Path $pathCSV | ForEach-Object {
    try {
        $line = $_
        $folder = $line.destination.Substring( 0, $line.destination.LastIndexOf('\') )
        if (-not (Test-Path $folder)) {
            # Out-Null suppresses the DirectoryInfo object New-Item writes to the pipeline
            New-Item -Path $folder -ItemType 'Directory' | Out-Null
            Write-Verbose "$folder created"
        }
        else {
            Write-Verbose "$folder already exists"
        }
        if (Test-Path $line.original) {
            Write-Verbose "File to be processed: $($line.original) [original] - $($line.destination) [destination]"
            Copy-Item $line.original $line.destination
            $arguments = '-d', '-f', "`"$($line.destination)`""
            Start-Process -FilePath $command -ArgumentList $arguments -RedirectStandardOutput stdout.txt -RedirectStandardError stderr.txt -WindowStyle Hidden
            # run garbage collection to try to free up some memory; if this slows down
            # the script too much, comment these lines out
            [gc]::Collect()
            [gc]::WaitForPendingFinalizers()
        }
        else {
            Write-Warning "File not found: $($line.original) [original] - $($line.destination) [destination]"
        }
    }
    catch {
        Write-Warning "At line:$($_.InvocationInfo.ScriptLineNumber) char:$($_.InvocationInfo.OffsetInLine) Command:$($_.InvocationInfo.InvocationName), Exception: '$($_.Exception.Message.Trim())'"
    }
}
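A side note on the garbage-collection calls above: if collecting on every row slows the script down too much, a compromise (my own sketch, not part of the original rewrite) is to collect in batches, say every 1000 rows:
$i = 0
Import-Csv -Path $pathCSV | ForEach-Object {
    # ... same per-row processing as in the script above ...
    $i++
    if ($i % 1000 -eq 0) {
        [gc]::Collect()                  # reclaim memory in batches instead of per row
        [gc]::WaitForPendingFinalizers()
    }
}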
Remember to fill in the path to your CSV and to define $command appropriately.
Hopefully this will work, or at least give you something to build on.
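One last thought that goes beyond the pipeline change: errors like "Insufficient quota" and "Only part of a ReadProcessMemory or WriteProcessMemory request was completed" can also be a symptom of launching thousands of child processes without waiting for them, since Start-Process returns immediately by default. If that turns out to be the culprit, here is a sketch of a more cautious invocation (the -Wait and -PassThru switches are my additions; $command, $arguments and $line are as in the script above):
# -Wait keeps only one child process alive at a time; -PassThru returns the
# process object so its exit code can be inspected afterwards.
$proc = Start-Process -FilePath $command -ArgumentList $arguments `
    -RedirectStandardOutput stdout.txt -RedirectStandardError stderr.txt `
    -WindowStyle Hidden -Wait -PassThru
if ($proc.ExitCode -ne 0) {
    Write-Warning "Unzip exited with code $($proc.ExitCode) for $($line.destination)"
}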