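Merge csv files but remove the last line and several columns

Time: 2017-06-19 23:00:42

Tags: csv batch-file cmd merge

My initial problem was to merge multiple csv files into one while removing the first few lines (5 in my case) and all blank lines. I was able to find the following solution: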

@echo off
>Output.csv (
    for %%f in (*.csv) do (
        for /f "delims=" %%l in ('more +5 %%f') do (
            echo %%f,%%l
        )
    )
)

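I have not been able to rewrite the code so that it also drops the last line of each file and removes several columns (or a single column) from the Output.csv file.

Here is an example of the csv files: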

Timecard Report
06/12/2017 - 06/12/2017
Departments : All_Departments-TOTAL HOURS

EMPLOYEE NAME,EMPLOYEE PAYROLL ID,FIRST NAME,LAST NAME,DEPARTMENT NAME,REG,REG Pay,OT1 Hours,OT1 Pay,OT2 Hours,OT2 Pay,VAC Hours,VAC Pay,HOL Hours,HOL Pay,SIC Hours,SIC Pay,OTH Hours,OTH Pay,TOTAL Hours,Total Pay 
Oc Br,999,Oc,Br,Fulfillment,8.00,114.8,.53,11.41,,,,,,,,,,,8.53,126.21 
Brat Hat,3423,Brat,Hat,Logistics Admin,5.42,75.88,,,,,,,,,,,,,5.42,75.88 
Tod Vindo,,Tod,Vindo,Logistics Admin,8.00,128,1.18,28.32,,,,,,,,,,,9.18,156.32 

TOTAL,,,,,73.53,1143.25,3.30,73.23,,,,,,,,,,,76.83,1216.48 

Does anyone have an idea?

2 Answers:

Answer 0: (score: 0)

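IMO, merging usually means adding columns rather than appending/concatenating to the end. Instead of more you can use "skip=5 delims=". To remove the last line, you can store the current line in a variable and print the previous one. Since this happens inside a (code block), you need DelayedExpansion, which may strip exclamation marks from the csv.

To drop columns, you need to specify the delimiter and a matching tokens parameter that omits the columns you don't want.
Provided "delims=," and that you want to drop columns 2 and 5 out of 6:
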
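@echo off&SetLocal EnableDelayedExpansion
(   for %%f in (*.csv) do if /i not "%%f"=="Output.csv" (
        rem the test above keeps the output file from being read as an input
        rem last holds the previous row, so the final row of each file is never echoed
        Set "last="
        rem keep tokens 1, 3, 4 and 6, which drops columns 2 and 5
        for /f "tokens=1,3-4,6 delims=," %%A in ('more +5 "%%f"') do (
            if defined last echo %%f,!last!
            Set "last=%%A,%%B,%%C,%%D"
        )
    )
) >Output.csv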

My sample output:

Output.csv


FatTwin1.csv,1,3,4,6
FatTwin2.csv,13,15,16,18

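To get around the problem with empty fields, you can normalize the csv files by importing and exporting them with PowerShell, which double-quotes all fields.

A cmd line can invoke PowerShell to run Import-Csv and Export-Csv on all csv files in the current folder and store each result under the original name with _dq appended. This requires the files to have a header with unique column names.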
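A minimal sketch of such an import/export round trip, written as a PowerShell function rather than a cmd one-liner. The function name Convert-CsvToQuoted and its -Folder parameter are hypothetical and not taken from the answer. It assumes every input file starts with a header row of unique column names, so a report preamble like the one in the question would have to be stripped first.

```powershell
# Sketch only: Convert-CsvToQuoted is a hypothetical helper name.
# Assumes each input file begins with a header row whose column names are unique.
function Convert-CsvToQuoted {
    param([string]$Folder = '.')

    Get-ChildItem -Path $Folder -Filter '*.csv' |
        Where-Object { $_.BaseName -notlike '*_dq' } |   # skip files that were already converted
        ForEach-Object {
            # Write next to the source file, with _dq appended to the base name.
            $target = Join-Path $_.DirectoryName ($_.BaseName + '_dq.csv')
            # Import-Csv parses the rows; Export-Csv writes them back, quoting fields by default.
            Import-Csv -LiteralPath $_.FullName |
                Export-Csv -LiteralPath $target -NoTypeInformation
        }
}
```

Calling Convert-CsvToQuoted with no arguments would process the current folder. Once every field is quoted, empty fields show up as "" instead of as two adjacent commas.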

Answer 1: (score: 0)

A new PowerShell answer. This script:

$KeepCols = @(
"EMPLOYEE NAME",
"EMPLOYEE PAYROLL ID",
"FIRST NAME",
"LAST NAME",
"DEPARTMENT NAME",
"REG",
"REG Pay",
"OT1 Hours",
"OT1 Pay",
"TOTAL Hours",
"Total Pay ")

Get-ChildItem '*.csv' -Exclude '*_dq.csv'|
  ForEach-Object {
    $fn=$_.Fullname
    "Processing $fn"
    (Get-Content $fn) | Select-Object -Skip 4 | ConvertFrom-Csv|
    Where-Object "EMPLOYEE NAME" -ne "TOTAL"|
    Select-Object -Property $KeepCols|
      Export-Csv -path ($fn.replace('.csv','_dq.csv')) -NoType
  }

produces this output from the sample above:

"EMPLOYEE NAME","EMPLOYEE PAYROLL ID","FIRST NAME","LAST NAME","DEPARTMENT NAME","REG","REG Pay","OT1 Hours","OT1 Pay","TOTAL Hours","Total Pay "
"Oc Br","999","Oc","Br","Fulfillment","8.00","114.8",".53","11.41","8.53","126.21 "
"Brat Hat","3423","Brat","Hat","Logistics Admin","5.42","75.88","","","5.42","75.88 "
"Tod Vindo","","Tod","Vindo","Logistics Admin","8.00","128","1.18","28.32","9.18","156.32 "

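So the steps

  • process all .csv files, excluding those with a trailing _dq, and save them with _dq appended
  • strip the first 4 lines
  • drop the unwanted columns
  • properly quote the fields
  • remove the final TOTAL row

are done. What remains to be done: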

  • merge the files (keeping only a single header line)
  • check whether the last column really has a trailing space.
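The remaining merge step might be sketched along these lines. This is an illustration under assumptions, not part of the answer: the function name Merge-QuotedCsv, its parameters, and the added File column are hypothetical, and every *_dq.csv file is assumed to have the same columns. Because Export-Csv writes the header once for the whole pipeline, the merged file ends up with a single header line.

```powershell
# Sketch only: Merge-QuotedCsv and the "File" column are hypothetical names.
# Assumes all *_dq.csv files in the folder share the same set of columns.
function Merge-QuotedCsv {
    param(
        [string]$Folder  = '.',
        [string]$OutFile = 'Output.csv'
    )

    Get-ChildItem -Path $Folder -Filter '*_dq.csv' |
        Sort-Object Name |
        ForEach-Object {
            $name = $_.Name
            # Prefix every row with the name of the file it came from,
            # mirroring the "echo %%f,..." of the batch version.
            Import-Csv -LiteralPath $_.FullName |
                Select-Object @{ Name = 'File'; Expression = { $name } }, *
        } |
        Export-Csv -LiteralPath $OutFile -NoTypeInformation   # header is written once
}
```

Because Output.csv does not match the *_dq.csv filter, a second run would not read the merged file back in as an input.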