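I still don't fully understand what the value of $_ is inside a ForEach-Object loop fed by Import-Csv, so I'm not sure how to merge records from two CSV files. (I can find the matches; I just don't know how to merge the records.)
I considered converting $_ from each CSV file to a string and joining the two strings into one line, but I'm not sure how to convert @{column1 = value1_1; column2 = value2_1} into a string like "value1_1","value2_1" without it turning into something strange. My current attempt is the line $MergedFileTable += $userInputRecord + $record in the script below. I'm not even sure this is the best approach, or whether PowerShell supports merging two records (which look like hashtables?).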
    Function Main
    {
        $userInputCsvTable = Import-Csv 'C:\Scripts\Tests\TestCurrent\UserInput.csv'
        $userInputCsvFile = 'C:\Scripts\Tests\TestCurrent\UserInput.csv'
        $DatabaseCsvTable = Import-Csv 'C:\Scripts\Tests\TestCurrent\DatabaseReport.csv'
        $DatabaseCsvFile = 'C:\Scripts\Tests\TestCurrent\DatabaseReport.csv'
        $DatabaseCustomCsvFile = 'C:\Scripts\Tests\TestCurrent\DatabaseReport_Custom.csv'
        $MergedFileTable = @{}
        $MergedFile = 'C:\Scripts\Tests\TestCurrent\Merged.csv'

        [String]$UserInputColumnIDsTitle = 'IDs'
        [String]$UserInputColumnLastNameTitle = 'Last Name'
        [String]$UserInputColumnFirstNameTitle = 'First'
        [String]$DatabaseColumnUserLoginTitle = 'User Login'
        [String]$DatabaseColumnFirstNameTitle = 'First Name'
        [String]$DatabaseColumnLastNameTitle = 'Last Name'
        [String]$DatabaseColumnUserStatusTitle = 'User Status'
        [String]$DatabaseColumnDomain1Title = 'Domain1'
        [String]$DatabaseColumnDomain2Title = 'Domain2'
        [String]$DatabaseColumnDomain3Title = 'Domain3'
        [String]$DatabaseColumnDomain4Title = 'Domain4'
        [String]$DatabaseColumnDomain5Title = 'Domain5'
        [String]$DatabaseColumnDomain6Title = 'Domain6'

        Write-Host "Gathering one list of first and last names from file: $userInputCsvFile..."
        [String[]]$userInputColumnsLastAndFirstNames = @()
        $userInputCsvTable | ForEach-Object {
            $userInputColumnsLastAndFirstNames += $_.$UserInputColumnLastNameTitle + ' ' + $_.$UserInputColumnFirstNameTitle
        }
        Write-Host "Complete."

        Write-Host "Getting Number of Names..."
        [Int]$numOfNames = $userInputColumnsLastAndFirstNames.Length
        Write-Host "Complete."

        Write-Host "Creating new Database table only with first and last names in list (side-by-side)..."
        [Int]$DisplayCounter = 0
        $DatabaseCsvTable | ForEach-Object {
            $DisplayCounter++
            If ($DisplayCounter % 10000 -eq 0)
            {
                Write-Host 'On record ' $DisplayCounter
            }
            If ($userInputColumnsLastAndFirstNames -contains ($_.$DatabaseColumnLastNameTitle + ' ' + $_.$DatabaseColumnFirstNameTitle))
            {
                $_
            }
        } | Export-Csv $DatabaseCustomCsvFile -NoTypeInformation -Force
        Write-Host "Complete."

        Write-Host "Creating merged file with duplicate records found..."
        $DatabaseCustomCsvTable = Import-Csv $DatabaseCustomCsvFile
        $userInputCsvTable | ForEach-Object {
            $userInputRecord = $_
            $firstName = $_.$UserInputColumnFirstNameTitle
            $lastName = $_.$UserInputColumnLastNameTitle
            $matchedTable = $DatabaseCustomCsvTable | Where-Object {($firstName -eq $_.$DatabaseColumnFirstNameTitle) -and ($lastName -eq $_.$DatabaseColumnLastNameTitle)}
            If ($matchedTable)
            {
                ForEach ($record in $matchedTable)
                {
                    $MergedFileTable += $userInputRecord + $record
                }
            }
        }
        Write-Host "Complete."

        Write-Host "Exporting "
        $MergedFileTable | Export-Csv $MergedFile -Force -NoTypeInformation
    }

    Main
The code fails with the following error:

    Method invocation failed because [System.Management.Automation.PSObject] doesn't contain a method named 'op_Addition'.
    At C:\Scripts\Tests\TestCurrent\test1.ps1:62 char:5
    +     $MergedFileTable += $userInputRecord + $record
    +     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : InvalidOperation: (op_Addition:String) [], RuntimeException
        + FullyQualifiedErrorId : MethodNotFound
Any ideas on how to do this?
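The error arises because the records produced by Import-Csv are PSCustomObject instances, which do not define a + operator. One possible way to combine two of them, sketched here under the assumption of PowerShell 3.0 or later, is to copy the properties of both into a new object. The helper name Merge-Record and its RightPrefix parameter are hypothetical, introduced only for illustration; the prefix keeps columns that exist in both files (such as 'Last Name') from overwriting each other. The sample rows are made up.

```powershell
# Hypothetical helper (not part of the original script): builds one new
# object holding the properties of two Import-Csv records.
function Merge-Record {
    param(
        [Parameter(Mandatory = $true)] $Left,
        [Parameter(Mandatory = $true)] $Right,
        [string] $RightPrefix = 'Db '   # assumed prefix for clashing column names
    )

    $merged = [ordered]@{}
    foreach ($property in $Left.PSObject.Properties) {
        $merged[$property.Name] = $property.Value
    }
    foreach ($property in $Right.PSObject.Properties) {
        $name = $property.Name
        # Rename a right-hand column if the left record already has that name.
        if ($merged.Contains($name)) { $name = $RightPrefix + $name }
        $merged[$name] = $property.Value
    }
    New-Object -TypeName PSObject -Property $merged
}

# Made-up sample rows standing in for records read by Import-Csv.
$userInputRecord = [PSCustomObject]@{ 'IDs' = '1'; 'Last Name' = 'Smith'; 'First' = 'Ann' }
$record          = [PSCustomObject]@{ 'User Login' = 'asmith'; 'Last Name' = 'Smith'; 'User Status' = 'Active' }

# Collect rows in an array rather than a hashtable (@{}), so that
# Export-Csv or ConvertTo-Csv sees one row per merged record.
$mergedRows = @()
$mergedRows += Merge-Record -Left $userInputRecord -Right $record
$mergedRows | ConvertTo-Csv -NoTypeInformation
```

In the script above, this would correspond to replacing $userInputRecord + $record with a call to the helper and initialising $MergedFileTable as @() instead of @{}.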
Answer 0 (score: 0)
I guess there really is no easy way to do this...
    # create 2 merged file contents
    Write-Host "Searching DB (with each search found) and creating 2 merged record file contents in memory (one with duplicate records) (one with single records)..."
    $UserInputSearchNum = 1
    $searchPrintOutputNum = 5
    [String]$MergedFileColumnsLine = '"' + $UserInputColumnIDsTitle + '","' + $UserInputColumnFirstNameTitle + '","' + $UserInputColumnLastNameTitle + '","' + $UserInputColumnQLActiveTitle + '","' + $DbColumnUserLoginTitle + '","' + $DbColumnFirstNameTitle + '","' + $DbColumnLastNameTitle + '","' + $DbColumnUserStatusTitle + '","' + $DbColumnAdD1Title + '","' + $DbColumnAdD2Title + '","' + $DbColumnAdD3Title + '","' + $DbColumnAdD4Title + '","' + $DbColumnAdD5Title + '","' + $DbColumnAdD6Title + '"'
    $MergedFileContentsSingleRecords = @($MergedFileColumnsLine)
    $MergedFileContentsDuplicateRecords = @($MergedFileColumnsLine)
    Import-Csv $LocalUserInputCsvReportWithPath | ForEach-Object {
        If ($UserInputSearchNum % $searchPrintOutputNum -eq 0) # Lets user know which search it's on so user can estimate eta on completion
        {
            Write-Host "$UserInputSearchNum/$UserInputNumOfSearches searches complete"
        }
        [String]$userInputRecord = '"' + $_.$UserInputColumnIDsTitle + '","' + $_.$UserInputColumnFirstNameTitle + '","' + $_.$UserInputColumnLastNameTitle + '","' + $_.$UserInputColumnQLActiveTitle + '"'
        $firstAndLastName = $_.$UserInputColumnFirstNameTitle + ' ' + $_.$UserInputColumnLastNameTitle
        $matchedDbRecordTable = Import-Csv $LocalDbCustomReportFileWithPath | Where-Object {$firstAndLastName -eq ($_.$DbColumnFirstNameTitle + ' ' + $_.$DbColumnLastNameTitle)}
        If ($matchedDbRecordTable)
        {
            If ($matchedDbRecordTable.length -gt 1)
            {
                $matchedDbRecordTable | ForEach-Object {
                    $MergedFileContentsDuplicateRecords += $userInputRecord + ',"' + $_.$DbColumnUserLoginTitle + '","' + $_.$DbColumnFirstNameTitle + '","' + $_.$DbColumnLastNameTitle + '","' + $_.$DbColumnUserStatusTitle + '","' + $_.$DbColumnAdD1Title + '","' + $_.$DbColumnAdD2Title + '","' + $_.$DbColumnAdD3Title + '","' + $_.$DbColumnAdD4Title + '","' + $_.$DbColumnAdD5Title + '","' + $_.$DbColumnAdD6Title + '"'
                } | Out-Null
            } Else {
                $matchedDbRecordTable | ForEach-Object {
                    $MergedFileContentsSingleRecords += $userInputRecord + ',"' + $_.$DbColumnUserLoginTitle + '","' + $_.$DbColumnFirstNameTitle + '","' + $_.$DbColumnLastNameTitle + '","' + $_.$DbColumnUserStatusTitle + '","' + $_.$DbColumnAdD1Title + '","' + $_.$DbColumnAdD2Title + '","' + $_.$DbColumnAdD3Title + '","' + $_.$DbColumnAdD4Title + '","' + $_.$DbColumnAdD5Title + '","' + $_.$DbColumnAdD6Title + '"'
                } | Out-Null
            }
        } Else {
            $MergedFileContentsSingleRecords += $userInputRecord + ',"Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found","Not Found"'
        }
        $UserInputSearchNum++
    } | Out-Null
    Write-Host "Complete."
    Write-Host ''
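As an aside on the hand-built quoted strings above: ConvertTo-Csv can produce the same kind of line from a record, which may be less error-prone because it also handles values containing commas or quotes. A minimal sketch, using made-up records with the column names from the question's example:

```powershell
# Made-up records standing in for one row from each CSV file.
$recordA = [PSCustomObject]@{ column1 = 'value1_1'; column2 = 'value2_1' }
$recordB = [PSCustomObject]@{ column3 = 'value3_1' }

# ConvertTo-Csv emits a header line followed by one data line per object.
$linesA = @($recordA | ConvertTo-Csv -NoTypeInformation)
$linesB = @($recordB | ConvertTo-Csv -NoTypeInformation)

$headerLine = $linesA[0] + ',' + $linesB[0]
$dataLine   = $linesA[1] + ',' + $linesB[1]

$headerLine   # "column1","column2","column3"
$dataLine     # "value1_1","value2_1","value3_1"
```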
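Here is the full test script: https://gist.github.com/vardahoth/24649da587afc3d838234ab2398c8f80
Here is the user input file (for testing): https://gist.github.com/vardahoth/864c90325d4e4dc4cf6921bf3b352ff5
Here is the db input file (for testing): https://gist.github.com/vardahoth/81beb4340b9a4d765ebe7b6361d12e30
With the test files this runs in a few seconds. However, merging the two real files (33,000 x 34,000 records) takes about 60 hours, while accommodating duplicates in both files.
If anyone knows of a library that supports a faster merge method, that would be a great help. Thanks.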
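Much of that run time probably comes from re-reading and re-scanning the whole database file for every user input row. One common alternative, sketched here rather than measured against the real files, is to index the database rows once in a hashtable keyed on the name and then do a single lookup per user row. The sample rows, the Matches column, and the variable names are assumptions made for illustration; the other column names follow the question. It assumes PowerShell 3.0 or later.

```powershell
# Made-up rows standing in for Import-Csv output.
$dbRows = @(
    [PSCustomObject]@{ 'User Login' = 'asmith';  'First Name' = 'Ann'; 'Last Name' = 'Smith' }
    [PSCustomObject]@{ 'User Login' = 'asmith2'; 'First Name' = 'Ann'; 'Last Name' = 'Smith' }
    [PSCustomObject]@{ 'User Login' = 'bjones';  'First Name' = 'Bob'; 'Last Name' = 'Jones' }
)
$userRows = @(
    [PSCustomObject]@{ 'IDs' = '1'; 'First' = 'Ann'; 'Last Name' = 'Smith' }
    [PSCustomObject]@{ 'IDs' = '2'; 'First' = 'Cy';  'Last Name' = 'Young' }
)

# Pass 1: index the database rows by "First Last". A hashtable created with
# @{} compares string keys case-insensitively, like -eq does.
$dbIndex = @{}
foreach ($row in $dbRows) {
    $key = $row.'First Name' + ' ' + $row.'Last Name'
    if (-not $dbIndex.ContainsKey($key)) {
        $dbIndex[$key] = New-Object System.Collections.ArrayList
    }
    [void]$dbIndex[$key].Add($row)
}

# Pass 2: one hashtable lookup per user row instead of a full scan.
$results = foreach ($user in $userRows) {
    $key = $user.'First' + ' ' + $user.'Last Name'
    if ($dbIndex.ContainsKey($key)) {
        $found = $dbIndex[$key]
        foreach ($row in $found) {
            New-Object PSObject -Property ([ordered]@{
                'IDs'        = $user.'IDs'
                'First'      = $user.'First'
                'Last Name'  = $user.'Last Name'
                'User Login' = $row.'User Login'
                'Matches'    = $found.Count   # more than 1 marks a duplicate name
            })
        }
    } else {
        New-Object PSObject -Property ([ordered]@{
            'IDs'        = $user.'IDs'
            'First'      = $user.'First'
            'Last Name'  = $user.'Last Name'
            'User Login' = 'Not Found'
            'Matches'    = 0
        })
    }
}

$results | ConvertTo-Csv -NoTypeInformation
```

Rows with Matches greater than 1 could then be filtered into the duplicate-records file and the rest into the single-records file with Where-Object before Export-Csv.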