PowerShell: Get-Content from a large file (server list)

Time: 2016-08-12 09:51:16

Tags: csv powershell

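I have a list of 100,000 servers in a text file (serverlist.txt).

When I run the script against the whole list in one go, it exhausts my memory and CPU, and the DNS lookup scan takes far too long to finish (about 3 days).

I tried splitting the list into the files below, each containing 20k servers, and each file can be scanned in at most 10 minutes.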

serverlist1.txt
serverlist2.txt
serverlist3.txt
serverlist4.txt
serverlist5.txt

$objContainer = @()
$values = @()
$domains = Get-Content -path "serverlist1.txt"
$named = 0
$timestamp= get-date

$domains | ForEach-Object {
    $domain = $_
    nslookup $domain 2>&1 | ForEach-Object {
        if ($_ -match '^Name:\s*(.*)$') {
            $values += $matches[1]
            $named = 1;
        } elseif (($_ -match '^.*?(\d*\.\d*\.\d*\.\d*)$') -and ($named -eq 1)) {
            $values += $matches[1]
        } elseif ($_ -match '^Aliases:\s*(.*)$') {
            $values += $matches[1]
        }
    }

    $obj = New-Object -TypeName PSObject
    #$obj | Add-Member -MemberType NoteProperty -name 'Domain' -value $domain
    $obj | Add-Member -MemberType NoteProperty -name 'Name' -value $values[0]
    $obj | Add-Member -MemberType NoteProperty -name 'IP Address' -value $values[1]
    $obj | Add-Member -MemberType NoteProperty -name 'Alias' -value $values[2]
    $obj | Add-Member -MemberType NoteProperty -name 'Timestamp' -value $timestamp
    $objContainer += $obj

    $values = @()
    $named = 0
}

Write-Output $objContainer
$objContainer | Export-csv "dnslog_$((Get-Date).ToString('MM-dd-yyyy_hh-mm-ss')).csv" -NoTypeInformation

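My question is: how can I loop over the input text files, so that as soon as dnslog(datetime).csv has been generated for one file the script moves on to the next?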

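e.g.:

  1. Run the PowerShell script: .\filename.ps1
  2. Input from serverlist1.txt
  3. Output dnslog(datetime).csv
  4. Input from serverlist2.txt
  5. Output dnslog(datetime).csv
  6. Input from serverlist3.txt
  7. Output dnslog(datetime).csv
  8. Input from serverlist4.txt
  9. Output dnslog(datetime).csv
  10. Input from serverlist5.txt
  11. Output dnslog(datetime).csv
  12. Done!

    If I have more than 5 text file lists, it should keep looping over the input files until all of them are done.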

2 answers:

Answer 0 (score: 0)

You should consider running this as parallel batch jobs. Have you tried that?
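
As a rough illustration of the parallel approach, here is a minimal sketch that starts one background job per split file with `Start-Job`. The function name `Invoke-ParallelDnsScan`, its `-Directory` parameter, and the `dnslog_<input name>.csv` output naming are assumptions made for this example, not part of the original script. It also swaps the `nslookup` parsing for `[System.Net.Dns]::GetHostEntry`, and assumes PowerShell 3 or later for `[PSCustomObject]`.

```powershell
# Hypothetical helper (not from the original post): starts one background job
# per split file and waits for all of them to finish.
function Invoke-ParallelDnsScan {
    param(
        [string]$Directory = (Get-Location).Path
    )

    # Work done inside each job. Jobs run in a separate process, so the
    # script block receives everything it needs through its parameters.
    $scan = {
        param($InputFile, $OutputFile)

        Get-Content -Path $InputFile |
            Where-Object { $_.Trim() } |
            ForEach-Object {
                $domain = $_
                try {
                    $entry = [System.Net.Dns]::GetHostEntry($domain)
                    [PSCustomObject]@{
                        Domain       = $domain
                        Name         = $entry.HostName
                        'IP Address' = $entry.AddressList -join ';'
                        Timestamp    = Get-Date
                    }
                } catch {
                    # Keep a row for names that fail to resolve.
                    [PSCustomObject]@{
                        Domain       = $domain
                        Name         = $null
                        'IP Address' = $null
                        Timestamp    = Get-Date
                    }
                }
            } | Export-Csv -Path $OutputFile -NoTypeInformation
    }

    # Match serverlist1.txt, serverlist2.txt, ... but not the unsplit serverlist.txt.
    $files = @(Get-ChildItem -Path $Directory -Filter 'serverlist*.txt' |
        Where-Object { $_.BaseName -match '^serverlist\d+$' })

    # Full paths are passed because a job does not necessarily start in the
    # caller's working directory.
    $jobs = @(foreach ($file in $files) {
        $out = Join-Path $file.DirectoryName ("dnslog_{0}.csv" -f $file.BaseName)
        Start-Job -ScriptBlock $scan -ArgumentList $file.FullName, $out
    })

    if ($jobs.Count -gt 0) {
        $jobs | Wait-Job | Receive-Job
        $jobs | Remove-Job
    }
}
```

Calling `Invoke-ParallelDnsScan` from the folder that holds the split files would run all five scans at once and leave one CSV per input file. Whether this is actually faster depends on how many concurrent lookups the DNS server and the machine tolerate.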

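You can deal with the RAM problem by removing everything that commits data to memory (the variable assignments, and the arrays that get rebuilt on every += append).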

$timestamp = get-date

Get-Content -path "serverlist1.txt" | ForEach-Object {
    $domain = $_

    # You can clear this here.
    $values = @()
    $named = 0

    # There are potentially better options than nslookup.
    # Needs a bit of care to understand what's an alias here though.
    # [System.Net.Dns]::GetHostEntry($domain)
    # And if you don't like that, quite a few of us have written equivalent tools in PowerShell.
    nslookup $domain 2>&1 | ForEach-Object {
        if ($_ -match '^Name:\s*(.*)$') {
            $values += $matches[1]
            $named = 1;
        } elseif (($_ -match '^.*?(\d*\.\d*\.\d*\.\d*)$') -and ($named -eq 1)) {
            $values += $matches[1]
        } elseif ($_ -match '^Aliases:\s*(.*)$') {
            $values += $matches[1]
        }
    }

    # Leave the output object in the output pipeline
    # If you're running PowerShell 3 or better:
    [PSCustomObject]@{
        Domain       = $domain
        Name         = $values[0]
        'IP Address' = $values[1]
        Alias        = $values[2]
        TimeStamp    = $timestamp
    }
    # PowerShell 2 is less flexible. This or Select-Object.
    #$obj = New-Object -TypeName PSObject
    ##$obj | Add-Member -MemberType NoteProperty -name 'Domain' -value $domain
    #$obj | Add-Member -MemberType NoteProperty -name 'Name' -value $values[0]
    #$obj | Add-Member -MemberType NoteProperty -name 'IP Address' -value $values[1]
    #$obj | Add-Member -MemberType NoteProperty -name 'Alias' -value $values[2]
    #$obj | Add-Member -MemberType NoteProperty -name 'Timestamp' -value $timestamp
    # To leave this in the output pipeline, uncomment this
    # $obj

    # No version of PowerShell needs you to do this. It's a good way to ramp up memory usage 
    # for large data sets.
    # $objContainer += $obj
} | Export-Csv "dnslog_$(Get-Date -Format 'MM-dd-yyyy_hh-mm-ss').csv" -NoTypeInformation
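
The script above still reads only serverlist1.txt. To cover the looping part of the question, here is a minimal sketch that wraps the same streaming pipeline in a loop over every split file and writes one CSV per input. `Invoke-DnsScan`, its `-Directory` parameter, and the `-Lookup` hook are hypothetical names introduced for this example; the default lookup uses `[System.Net.Dns]::GetHostEntry`, and the `nslookup` parsing could be dropped into `-Lookup` instead. The input file name is added to the log name as an assumption, so that two files finishing within the same second cannot overwrite each other. PowerShell 3 or later is assumed.

```powershell
# Hypothetical helper (not from the original post): scans every split file in
# turn and writes one CSV per input file.
function Invoke-DnsScan {
    param(
        [string]$Directory = (Get-Location).Path,

        # Assumed hook: takes a host name and returns an object with Name,
        # 'IP Address' and Alias properties.
        [scriptblock]$Lookup = {
            param($domain)
            try {
                $entry = [System.Net.Dns]::GetHostEntry($domain)
                [PSCustomObject]@{
                    Name         = $entry.HostName
                    'IP Address' = $entry.AddressList -join ';'
                    Alias        = $entry.Aliases -join ';'
                }
            } catch {
                [PSCustomObject]@{ Name = $null; 'IP Address' = $null; Alias = $null }
            }
        }
    )

    # Match serverlist1.txt, serverlist2.txt, ... but not the unsplit serverlist.txt.
    Get-ChildItem -Path $Directory -Filter 'serverlist*.txt' |
        Where-Object { $_.BaseName -match '^serverlist\d+$' } |
        Sort-Object Name |
        ForEach-Object {
            $file  = $_
            # HH is the 24-hour clock, so morning and afternoon runs differ.
            $stamp = Get-Date -Format 'MM-dd-yyyy_HH-mm-ss'
            $csv   = Join-Path $Directory ("dnslog_{0}_{1}.csv" -f $file.BaseName, $stamp)

            # Stream line by line into Export-Csv; nothing is accumulated.
            Get-Content -Path $file.FullName |
                Where-Object { $_.Trim() } |
                ForEach-Object {
                    $domain = $_
                    $result = & $Lookup $domain
                    [PSCustomObject]@{
                        Domain       = $domain
                        Name         = $result.Name
                        'IP Address' = $result.'IP Address'
                        Alias        = $result.Alias
                        Timestamp    = Get-Date
                    }
                } | Export-Csv -Path $csv -NoTypeInformation

            # Emit the path of each finished log so the caller can see progress.
            $csv
        }
}
```

Running `Invoke-DnsScan` in the folder with the split files processes them one after another, however many there are, and prints the path of each CSV as it is completed.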

Answer 1 (score: 0)

Adding to Chris's answer, I would also pass the -ReadCount parameter to Get-Content.

This avoids having to read the whole file into memory.
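
For reference, a small sketch of how `-ReadCount` behaves. The value the answer had in mind is not shown here, so the batch size of 1000 and the generated sample file are assumptions for illustration only. `-ReadCount` sets how many lines `Get-Content` sends down the pipeline at a time: the documented default is 1, and 0 sends the whole file as a single array. With a value above 1, each pipeline item is an array of lines, so an inner loop is needed.

```powershell
# Write a small sample list so the example is self-contained (hypothetical data).
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'serverlist_sample.txt'
1..2500 | ForEach-Object { 'server{0}.example.com' -f $_ } | Set-Content -Path $sample

$batches = 0
$lines   = 0

# -ReadCount 1000 sends arrays of up to 1000 lines down the pipeline,
# so $_ is an array of strings here rather than a single line.
Get-Content -Path $sample -ReadCount 1000 | ForEach-Object {
    $batches++
    foreach ($domain in $_) {
        $lines++   # the per-server work (e.g. the DNS lookup) would go here
    }
}

"$lines lines in $batches batches"   # 2500 lines in 3 batches
Remove-Item -Path $sample
```

Because the default already streams one line at a time when `Get-Content` is piped, a larger `-ReadCount` mainly reduces per-item pipeline overhead; it is assigning the result of `Get-Content` to a variable that pulls the whole file into memory.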