PowerShell: how do I add a column from another CSV file to a CSV file by ID?

Asked: 2019-09-17 16:34:38

Tags: powershell csv

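I have two CSV files. The first file may contain a varying number of rows. Each row has an ID, in this case place_id. I want to add a column to this file from the second file.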

"place_id";"osm_type";"osm_id";"place_rank";"boundingbox";"lat";"lon";"display_name";"class";"type";"importance";"icon";"postcode";"city";"town";"village";"hamlet";"allotments";"neighbourhood";"suburb";"city_district";"state_district";"building";"address100";"address26";"address27";"address29";"county";"state";"country";"country_code";"place";"population";"wikidata";"wikipedia";"name";"official_name"
"100073243";"way";"108738557";"19";"56.1330951,56.1377776,35.7857419,35.7966764";"56.1354281";"35.7903646";"Bolshoe Syrkovo, Volokolamskij gorodskoj okrug, Moskovskaya oblast, CFO, RF";"place";"hamlet";"0.45401456808503";"https://nominatim.openstreetmap.org/images/mapicons/poi_place_village.p.20.png";"";"";"";"";"Bolshoe Syrkovo";"";"";"";"";"";"";"";"";"";"";"Volokolamskij gorodskoj okrug";"Moskovskaya oblast";"RF";"ru";"hamlet";"19";"Q4092451";"ru:Bolshoe Syrkovo";"Bolshoe Syrkovo";""
"100073263";"way";"108729132";"19";"56.1542386,56.156816,36.3303962,36.3383278";"56.15552975";"36.3343542260811";"Kondratovo, Volokolamskij gorodskoj okrug, Moskovskaya oblast, CFO, RF";"place";"hamlet";"0.385";"https://nominatim.openstreetmap.org/images/mapicons/poi_place_village.p.20.png";"";"";"";"";"Kondratovo";"";"";"";"";"";"";"";"";"";"";"Volokolamskij gorodskoj okrug";"Moskovskaya oblast";"RF";"ru";"";"";"";"";"Kondratovo";""
"100073265";"way";"108738571";"19";"56.009293,56.0205996,36.2239313,36.2390323";"56.015194";"36.2290485";"Gryady, Volokolamskij gorodskoj okrug, Moskovskaya oblast, CFO, Rossiya";"place";"village";"0.36089190172262";"https://nominatim.openstreetmap.org/images/mapicons/poi_place_village.p.20.png";"";"";"";"Gryady";"";"";"";"";"";"";"";"";"";"";"";"Volokolamskij gorodskoj okrug";"Moskovskaya oblast";"Rossiya";"ru";"village";"841";"Q4151063";"ru:Gryady (Moskovskaya oblast)";"Gryady";""

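The second file contains the complete set of geographic coordinates. Each row has a place_id column that matches the place_id values in the first file. From the second file I want to copy the string in the geojson column and add it to the first file, matched by place_id. This file is larger than the first (the first is about 5 MB, the second about 50 MB).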

"place_id";"osm_id";"geojson"
"100059669";"111492916";"{""type"":""Polygon"",""coordinates"":[[[37.6221208,56.0629951],[37.6227846,56.0617338],[37.6235702,56.0612884],[37.6241549,56.0610708],[37.625994,56.06052],[37.627407,56.0616613],[37.6250022,56.0628003],[37.624107,56.0632933],[37.6244298,56.06364],[37.6240209,56.0640423],[37.6238138,56.0639879],[37.6238869,56.0635391],[37.6236798,56.0634711],[37.6221208,56.0629951]]]}"
"100066930";"108048163";"{""type"":""Polygon"",""coordinates"":[[[37.488797,54.9187857],[37.489145,54.9178087],[37.4916813,54.9161087],[37.4915675,54.914397],[37.4923037,54.9141008],[37.4938964,54.9139008],[37.4946333,54.9135329],[37.4950753,54.9135998],[37.4958462,54.9135723],[37.4961204,54.9133221],[37.4963465,54.9127529],[37.4976451,54.912619],[37.49836,54.9121783],[37.4984597,54.9124224],[37.4989822,54.9128384],[37.4986734,54.9131341],[37.4984106,54.9135755],[37.4984278,54.9141491],[37.4988218,54.9145627],[37.5001752,54.9148064],[37.5005392,54.9147547],[37.5005076,54.9157027],[37.5005411,54.9169758],[37.5003203,54.9183989],[37.500086,54.9191066],[37.4999331,54.919399],[37.4992204,54.9195132],[37.4991362,54.9199856],[37.4977175,54.9199433],[37.497684,54.9204933],[37.4959374,54.9204279],[37.4937625,54.9202703],[37.493187,54.9202895],[37.4925111,54.9202126],[37.4917951,54.9202741],[37.4903496,54.9202356],[37.4899949,54.920301],[37.4891785,54.9207395],[37.488884,54.9204202],[37.4888506,54.9200203],[37.488797,54.9194703],[37.488797,54.9187857]]]}"
"100073243";"108738557";"{""type"":""Polygon"",""coordinates"":[[[35.7857419,56.1346341],[35.7870207,56.1330951],[35.7960034,56.1354737],[35.7964486,56.136383],[35.7966764,56.1371216],[35.796451,56.1375923],[35.7940459,56.1377776],[35.7872053,56.1362698],[35.7860927,56.135251],[35.7857419,56.1346341]]]}"
"100073263";"108729132";"{""type"":""Polygon"",""coordinates"":[[[36.3303962,56.1556187],[36.3327609,56.1549359],[36.3332297,56.1553915],[36.3371409,56.1542386],[36.3383278,56.1554408],[36.3356724,56.1561707],[36.3352194,56.1557934],[36.3314321,56.156816],[36.3303962,56.1556187]]]}"
"100073265";"108738571";"{""type"":""Polygon"",""coordinates"":[[[36.2239313,56.0144832],[36.2261932,56.011495],[36.2284626,56.0095073],[36.2321529,56.009293],[36.2331509,56.0117168],[36.2341666,56.0135926],[36.2390323,56.0144832],[36.2385065,56.0167726],[36.2357356,56.0167906],[36.2334461,56.0197924],[36.2263531,56.0205996],[36.2251121,56.020199],[36.2239313,56.0144832]]]}"
"100075231";"110068197";"{""type"":""Polygon"",""coordinates"":[[[38.2935489,54.7729509],[38.2939625,54.7719488],[38.2950008,54.7717047],[38.2966022,54.7717389],[38.2968603,54.7712015],[38.2960165,54.7691138],[38.2982481,54.7689281],[38.3005051,54.7687673],[38.3025611,54.7678635],[38.3045996,54.7650658],[38.305887,54.7649297],[38.3081401,54.7650906],[38.3085907,54.7656105],[38.3078585,54.7664648],[38.3092498,54.7671973],[38.3097709,54.7679502],[38.3082259,54.7681977],[38.3082688,54.7688538],[38.3074105,54.7691138],[38.3073891,54.7696708],[38.3081616,54.7712304],[38.3070458,54.7719978],[38.3052433,54.7713294],[38.3036984,54.7705991],[38.3024753,54.7723691],[38.2999862,54.7725548],[38.2993425,54.7731984],[38.2958449,54.7734459],[38.2942355,54.77404],[38.2935489,54.7729509]]]}"
"100083347";"108773218";"{""type"":""Polygon"",""coordinates"":[[[37.363052,55.2929074],[37.3641893,55.2923393],[37.3680087,55.2950118],[37.3709592,55.2961602],[37.3732015,55.2966977],[37.3755511,55.2974185],[37.3748001,55.2984019],[37.3730727,55.2976933],[37.3689743,55.2966549],[37.3660883,55.2951706],[37.363052,55.2929074]]]}"
"100088132";"108787848";"{""type"":""Polygon"",""coordinates"":[[[36.3930954,56.1869244],[36.3949447,56.1858475],[36.4025567,56.1881928],[36.4037609,56.1903944],[36.4019117,56.1907295],[36.3982131,56.1894851],[36.3930954,56.1869244]]]}"
"100088151";"108787862";"{""type"":""Polygon"",""coordinates"":[[[36.4786893,56.0795892],[36.4788543,56.0782741],[36.4790085,56.0775791],[36.4790382,56.0775181],[36.4791316,56.0774071],[36.4790562,56.0772801],[36.4790339,56.0770308],[36.4814648,56.0770996],[36.48379,56.0816509],[36.4817819,56.0817197],[36.478929,56.0802664],[36.4786893,56.0795892]]]}"

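I guess this is not hard for an experienced programmer. I am not one.

I have tried a lot of code, but none of it worked for me. Below I list the code I tried. I am not good at programming, and I may not understand what some of this code means.

Please help me solve this problem.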

###
Get-ChildItem -Filter .\comb\*.csv | Select-Object -ExpandProperty FullName | Import-Csv | Export-Csv .\combinedcsvs.csv -NoTypeInformation -Append
###

###
$DevData = (Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8)[1..10]
$ProdData = (Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8)[1..10]
# throw one set into a hashtable
# we can use this as a lookup table for the other set
$ProdTable = @{}
foreach($line in $ProdData){
    $ProdTable[$line.place_id] = $line.ID
}
# Output the DevData with the appropriate ProdData value
$DevData | Select-Object @{Label='DevID';Expression={$_.ID}},@{Label='ProdID';Expression={$ProdTable[$_.place_id]}},place_id | Export-Csv .\new2.csv -NoTypeInformation -Delimiter ";" -Encoding:UTF8
###

###
$f1=(Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8 -header "place_id","osm_type","osm_id","place_rank","boundingbox","lat","lon","display_name","class","type","importance","icon","postcode","city","town","village","hamlet","allotments","neighbourhood","suburb","city_district","state_district","building","address100","address26","address27","address29","county","state","country","country_code","place","population","wikidata","wikipedia","name","official_name")[1..1]
$f1
$f2=(Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8 -header samname,"place_id","osm_id","geojson")[1..1]
$f1|
   %{
      $geojson=$_.geojson
      $m=$f2|?{$_.geojson -eq $geojson}
      $_.place_id=$m.place_id
    }
$f1
###


###
#Make an empty hash table for the first file
$File1Values = @{}
#Import the first file and save the rows in the hash table indexed on "place_id"
Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8 | ForEach-Object {
  $File1Values.Add($_.place_id, $_)
}
#Import the second file and make a custom object with properties from both files
Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8 | ForEach-Object {
  [PsCustomObject]@{
    ABC = $File1Values[$_.KeyColumn].ABC;
    DEF = $File1Values[$_.KeyColumn].DEF;
    UVW = $_.UVW;
    XYZ = $_.XYZ;
  }
} | Export-Csv -Path c:\OutFile.csv
###

###
$Poproperties = @(
'worker_name',
'requester_name',
@{E={$Lookup_Hash.($_.field_834)};L='field_834'},
@{E={$Lookup_Hash.($_.field_835)};L='field_835'},
@{E={$Lookup_Hash.($_.field_836)};L='field_836'},
@{E={$Lookup_Hash.($_.field_837)};L='field_837'},
@{E={$Lookup_Hash.($_.field_838)};L='field_838'}
)
Import-Csv -Path C:\S_FilePath | Select-Object -Property $Poproperties
###


###
$Lookup_Hash = Import-Csv ".\pars_full_4_only_geo.csv" -Delimiter ";" -Encoding:UTF8 | ForEach-Object -Process { $_.place_id = $_.name }
$S_File = Import-Csv ".\pars_full_4_without_geo.csv" -Delimiter ";" -Encoding:UTF8 | Select-Object -Property *,@{E={$Lookup_Hash.($_.place_id)};L='place_id'} | Export-Csv ".\pars_full_5_combine_geo.csv" -NoTypeInformation -Delimiter ";" -Encoding:UTF8
###

2 Answers:

Answer 0 (score: 1)

Here is a working example I put together that shows one way to do this.

I created two CSV files:

file1.csv

"id";"score"
"1";"90"
"3";"100"

file2.csv

"id";"firstname";"lastname"
"1";"steve";"jobs"
"2";"bill";"gates"
"3";"santa";"claus"

Then my PowerShell script, test.ps1:

$csv1=(import-csv file1.csv -Delimiter ";")
$csv2=(import-csv file2.csv -Delimiter ";")
$csv1 |
    ForEach-Object{
        $row = $_
        if($mtch = $csv2|?{$_.id -eq $row.id}){
                $out = [pscustomobject]@{ id =  $row.id; firstname = $mtch.firstname; lastname = $mtch.lastname; score = $row.score }
                $out
           }
     } | Export-Csv csv3.csv -NoTypeInformation

This is how I run the script (from the same directory as the CSV files):

powershell -ExecutionPolicy RemoteSigned .\test.ps1

This is the resulting csv3.csv:

"id","firstname","lastname","score"
"1","steve","jobs","90"
"3","santa","claus","100"

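The example above rescans `$csv2` for every row of `$csv1`, which is fine for small files but may become slow with tens of thousands of rows. A minimal sketch of the same join using a hashtable as a lookup table follows. The sample rows are inlined with `ConvertFrom-Csv` so the snippet runs on its own, and the names `$byId`, `$hit` and `$joined` are illustrative, not part of the answer above.

```powershell
# Sample data copied from file1.csv and file2.csv above, inlined for a self-contained run.
$csv1 = @'
"id";"score"
"1";"90"
"3";"100"
'@ | ConvertFrom-Csv -Delimiter ';'

$csv2 = @'
"id";"firstname";"lastname"
"1";"steve";"jobs"
"2";"bill";"gates"
"3";"santa";"claus"
'@ | ConvertFrom-Csv -Delimiter ';'

# Index file2 by id once, so each lookup is a hashtable access instead of a full scan.
$byId = @{}
foreach ($row in $csv2) { $byId[$row.id] = $row }

# Emit one combined object per file1 row that has a matching id in file2.
$joined = foreach ($row in $csv1) {
    $hit = $byId[$row.id]
    if ($hit) {
        [pscustomobject]@{
            id        = $row.id
            firstname = $hit.firstname
            lastname  = $hit.lastname
            score     = $row.score
        }
    }
}

$joined | ConvertTo-Csv -NoTypeInformation
```

With real files, `Import-Csv file1.csv -Delimiter ";"` and `Export-Csv csv3.csv -NoTypeInformation` would take the place of the inlined data and the final `ConvertTo-Csv`.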
Answer 1 (score: 1)

I am adding the code that worked for my task. I split the solution into 3 steps.

$FileWithOutGeom = Import-Csv ".\FileWithOutGeom.csv" -Delimiter ';' -Encoding UTF8

# Step 1. Get all IDs from the file without coordinates: sort by ID and select the place_id column values.
# The IDs are joined with '|' so they can be used as a regex pattern in the next step (for Where-Object -Match).
$ID = [string]::Join("|",( $FileWithOutGeom | sort place_id | Select-Object -ExpandProperty 'place_id'))

# Step 2. Take the second file with all coordinates, keep only the rows whose ID appears in the first file, and sort by ID as well.
$FileWithAllGeom = Import-Csv ".\FileWithAllGeom.csv" -Delimiter ';' -Encoding UTF8 | Where-Object -property place_id -Match $ID | sort place_id

# Step 3. Take the first file without geometry and use Add-Member to add a new column (geojson),
# taking the value for each row from the step 2 result at an index that is incremented per object.
$FileWithOutGeom | ForEach-Object -Begin {$i = 0} {$_ | Add-Member -MemberType NoteProperty -Name 'geojson' -Value ($FileWithAllGeom)[$i++].geojson -PassThru
} | Export-Csv ".\CombinedFile.csv" -NoTypeInformation -Delimiter ";" -Encoding:UTF8

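The output is the first file with a "geojson" column appended at the end. Sorry, the code may be poor; I assembled it from fragments I found online. This solution works very well for my task: with one file of about 50 MB and another of about 20 MB, processing takes less than 10 seconds.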
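One caveat about the approach above: `-Match` treats the joined IDs as an unanchored regular expression, so an ID such as 1000732430 would also match the pattern 100073243, and step 3 pairs rows by position, which appears to rely on both files being in the same order with exactly one geometry row per ID. A minimal sketch that looks up each place_id explicitly is shown below. The rows are shortened, made-up stand-ins for the real files, and `$geoById` and `$combined` are illustrative names.

```powershell
# Shortened, made-up rows standing in for the two files (only the relevant columns).
$withoutGeom = @'
"place_id";"name"
"100073265";"Gryady"
"100073243";"Bolshoe Syrkovo"
"999";"No match"
'@ | ConvertFrom-Csv -Delimiter ';'

$allGeom = @'
"place_id";"osm_id";"geojson"
"100073243";"108738557";"{""type"":""Polygon"",""coordinates"":[[[35.78,56.13],[35.79,56.13],[35.78,56.13]]]}"
"100073265";"108738571";"{""type"":""Polygon"",""coordinates"":[[[36.22,56.01],[36.23,56.01],[36.22,56.01]]]}"
"1000732430";"1";"{""type"":""Point"",""coordinates"":[0,0]}"
'@ | ConvertFrom-Csv -Delimiter ';'

# Build a place_id -> geojson lookup table in a single pass over the large file.
$geoById = @{}
foreach ($g in $allGeom) { $geoById[$g.place_id] = $g.geojson }

# Append a geojson column to every row of the first file, regardless of row order.
# Rows with no matching place_id get an empty geojson value.
$combined = $withoutGeom |
    Select-Object *, @{ Name = 'geojson'; Expression = { $geoById[$_.place_id] } }

$combined | ConvertTo-Csv -NoTypeInformation -Delimiter ';'
```

For the real data, the inlined rows would be replaced by the same `Import-Csv ... -Delimiter ';' -Encoding UTF8` calls used above, and the last line by `Export-Csv ".\CombinedFile.csv" -NoTypeInformation -Delimiter ";" -Encoding UTF8`.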