Merging two CSV files

Time: 2016-07-19 06:42:33

Tags: csv powershell powershell-v2.0 export-to-csv

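I want to merge two CSV files that have the same headers into a single CSV file. I have two such files, DevData.csv and ProdData.csv, which look like the listing below; they contain the same cfname values but different IDs.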

    ID                   cfname
    -------------------- -----------------------------------
                   10201 Risk ID
                   10202 Issue ID
                   10203 Dependency ID
                   10204 Server ID
                   10205 Parent Application ID
                   10206 Application Service ID
                   10207 Application Supportability
                   10208 Application Compatibility
                   10300 Application Status
                   10301 Contact ID Type 2
                   10302 Contact ID Type 3
                   10303 Contact ID Type 4
                   10304 Business Service Manager
                   10308 Server Location Name:
                   10309 Rack Position:
                   10310 Rack Number:
                   10311 Data Centre
                   10312 Server Group
(14 rows affected)

I want to create a new CSV in the following format:

DevID                ProdID cfname
-------------------- ------ -----------------------------------
               10201 201    Risk ID
               10202 202    Issue ID
               10203 203    Dependency ID
               10204 204    Server ID
               10205 205    Parent Application ID
               10206 206    Application Service ID
               10207 207    Application Supportability
               10208 208    Application Compatibility
               10300 209    Application Status
               10301 210    Contact ID Type 2
               10302 211    Contact ID Type 3
               10303 212    Contact ID Type 4
               10304 213    Business Service Manager
               10308 214    Server Location Name:

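Below is my current code, but it exports the data from the first file followed by the data from the next file, instead of joining them side by side.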

function Merge-CSVFiles {
    [cmdletbinding()]
    param(
        [string[]]$CSVFiles,
        [string]$OutputFile
    )

    $Output = @();
    foreach ($CSV in $CSVFiles) {
        if (Test-Path $CSV) {
            $FileName = [System.IO.Path]::GetFileName($CSV)
            $temp = Import-CSV -Path $CSV |
                    select ID, cfname, ID, cfname, @{Expression={$FileName}}
            $Output += $temp
        } else {
            Write-Warning "$CSV : No such file found"
        }
    }
    $Output | Export-Csv -Path $OutputFile -NoTypeInformation
    Write-Output "$OutputFile successfully created"
}

Merge-CSVFiles -CSVFiles "C:\Users\ECSAdmin\Desktop\Proddata.csv", "C:\Users\ECSAdmin\Desktop\Devdata.csv" -OutputFile "C:\Users\ECSAdmin\Desktop\Mergedata.csv"

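3 Answers:

Answer 0 (score: 0)

You could run a nested foreach loop over the two sets, but to keep execution time from growing quadratically with the input size, a better strategy is to load one set into a hashtable (using the shared property cfname as the key), then loop over the other set and join the two: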

$DevData = @'
ID,cfname
10201,Risk ID
10202,Issue ID
10203,Dependency ID
10204,Server ID
10205,Parent Application ID
10206,Application Service ID
10207,Application Supportability
10208,Application Compatibility
10300,Application Status
10301,Contact ID Type 2
10302,Contact ID Type 3
10303,Contact ID Type 4
10304,Business Service Manager
10308,Server Location Name:
10309,Rack Position:
10310,Rack Number:
10311,Data Centre
10312,Server Group
'@ |ConvertFrom-Csv

$ProdData = @'
ID,cfname
201,Risk ID
202,Issue ID
203,Dependency ID
204,Server ID
205,Parent Application ID
206,Application Service ID
207,Application Supportability
208,Application Compatibility
209,Application Status
210,Contact ID Type 2
211,Contact ID Type 3
212,Contact ID Type 4
213,Business Service Manager
214,Server Location Name:
'@ |ConvertFrom-Csv

# throw one set into a hashtable
# we can use this as a lookup table for the other set
$ProdTable = @{}
foreach($line in $ProdData){
    $ProdTable[$line.cfname] = $line.ID
}

# Output the DevData with the appropriate ProdData value
$DevData |Select-Object @{Label='DevID';Expression={$_.ID}},@{Label='ProdID';Expression={$ProdTable[$_.cfname]}},cfname |Export-Csv .\new.csv -NoTypeInformation
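
Note that with the sample data above, the four Dev rows that have no Prod counterpart (10309 to 10312) come out with an empty ProdID. If only matched rows are wanted, as in the desired output in the question, one possible tweak is to filter on the lookup table before projecting. A minimal self-contained sketch with shortened sample data (the trimmed data set and the $Merged variable are illustrative assumptions, not part of the original answer):

```powershell
# Shortened sample data; the last Dev row has no Prod counterpart
$DevData = @'
ID,cfname
10201,Risk ID
10202,Issue ID
10309,Rack Position:
'@ | ConvertFrom-Csv

$ProdData = @'
ID,cfname
201,Risk ID
202,Issue ID
'@ | ConvertFrom-Csv

# Build the lookup table keyed on cfname
$ProdTable = @{}
foreach ($line in $ProdData) {
    $ProdTable[$line.cfname] = $line.ID
}

# Keep only Dev rows whose cfname exists in the lookup table, then project
$Merged = @($DevData |
    Where-Object { $ProdTable.ContainsKey($_.cfname) } |
    Select-Object @{Label='DevID';Expression={$_.ID}},
                  @{Label='ProdID';Expression={$ProdTable[$_.cfname]}},
                  cfname)

# Show the result as CSV text; pipe to Export-Csv instead to write a file
$Merged | ConvertTo-Csv -NoTypeInformation
```

Dropping the Where-Object stage gives back the behavior of the original command, where unmatched Dev rows are kept with a blank ProdID.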

Answer 1 (score: 0)

You can try this simple command pipeline:

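$Csv1 = Import-Csv .\Csv1.csv   # columns: ID, cfname
$Csv2 = Import-Csv .\Csv2.csv   # columns: ProdID, cfname
Out-File -FilePath '.\csv3.csv' -InputObject "ProdID,ID,cfname"
ForEach ($CFName In $Csv1) {
    # Where-Object rather than the .Where() method, which needs PowerShell 4+
    $Csv2 | Where-Object { $_.cfname -eq $CFName.cfname } |
        ForEach-Object { "$($_.ProdID),$($CFName.ID),$($_.cfname)" } |
        Out-File .\csv3.csv -Append
}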

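I am assuming that Csv1.csv is the first file, with the columns ID and cfname, and that the second file, Csv2.csv, has the columns ProdID and cfname. This produces a third file, csv3.csv, with the merged content.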

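Answer 2 (score: 0)

Since you are exporting the data from SQL Server with sqlcmd, you need to add the parameters -W and -s"," so that the command produces actual CSV output: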

sqlcmd -S server -d db -E -Q "query" -W -s"," -o output.csv
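
Even with -W and -s",", sqlcmd normally still writes a line of dashes under the header and a trailing "(N rows affected)" message, as seen in the listing in the question. A minimal sketch for filtering those out before parsing, assuming the file looks like the simulated sample below (the sample lines and the regular expressions are assumptions about that layout, not something sqlcmd guarantees):

```powershell
# Simulated sqlcmd output (assumed layout): header, dashes, rows, blank, trailer.
# With a real file this would be: $raw = Get-Content 'output.csv'
$raw = @(
    'ID,cfname',
    '--,------',
    '201,Risk ID',
    '202,Issue ID',
    '',
    '(2 rows affected)'
)

# Drop the dashes line, blank lines and the row-count trailer, then parse as CSV
$rows = @($raw |
    Where-Object { $_ -notmatch '^[-,]+$' -and
                   $_ -notmatch '^\s*$' -and
                   $_ -notmatch '^\(\d+ rows? affected\)$' } |
    ConvertFrom-Csv)

$rows | Format-Table -AutoSize
```

Putting SET NOCOUNT ON at the start of the query is another way to suppress the row-count message at the source.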

Once you have actual CSV files, you can process them like this:

# create a hashtable from the second CSV, so you can look up IDs by the
# values in the "cfname" column
$proddata = @{}
Import-Csv 'C:\path\to\ProdData.csv' | ForEach-Object {
  $proddata[$_.cfname] = $_.ID
}

Import-Csv 'C:\path\to\DevData.csv' |
  Select-Object @{n='DevID';e={$_.ID}},
                @{n='ProdID';e={$proddata[$_.cfname]}}, cfname |
  Export-Csv 'C:\path\to\merged.csv' -NoTypeInformation

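This assumes that your ProdData.csv contains only cfname values that also appear in DevData.csv, and that the cfname values are unique at least within ProdData.csv. A bidirectional merge is more complicated, because you need to check whether the keys in $proddata also occur in DevData.csv and append the ones that do not. If your cfname values are not unique, you will not be able to align the records.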
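
One way such a bidirectional merge might look is sketched below. It makes two passes: first every Dev row with its Prod ID where one exists, then the Prod rows whose cfname never appeared on the Dev side. The sample data and the $dev, $prod, $seen and $merged names are hypothetical, and cfname is still assumed to be unique within each file:

```powershell
# Hypothetical sample data: each side has one cfname the other lacks
$dev = @'
ID,cfname
10201,Risk ID
10309,Rack Position:
'@ | ConvertFrom-Csv

$prod = @'
ID,cfname
201,Risk ID
215,Cost Centre
'@ | ConvertFrom-Csv

# Lookup table for the Prod side, keyed on cfname
$proddata = @{}
foreach ($row in $prod) { $proddata[$row.cfname] = $row.ID }

# Records which cfname values occur on the Dev side
$seen = @{}

# Pass 1: every Dev row, with the Prod ID if one exists (otherwise empty)
$merged = @($dev | ForEach-Object {
    $seen[$_.cfname] = $true
    $_ | Select-Object @{n='DevID';e={$_.ID}},
                       @{n='ProdID';e={$proddata[$_.cfname]}},
                       cfname
})

# Pass 2: append Prod rows whose cfname never appeared on the Dev side
$merged += @($prod |
    Where-Object { -not $seen.ContainsKey($_.cfname) } |
    Select-Object @{n='DevID';e={$null}},
                  @{n='ProdID';e={$_.ID}},
                  cfname)

# Show the result as CSV text; pipe to Export-Csv instead to write a file
$merged | ConvertTo-Csv -NoTypeInformation
```

Both passes use hashtable lookups, so the work stays roughly proportional to the combined number of rows rather than to their product.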