Batch: how to remove all empty columns from a CSV file

Date: 2016-05-18 12:44:30

Tags: batch-file

I have a CSV file like this:

P,PC,,PL,B,15feb16,P,Bay,RP,15-FEB-16,22-FEB-16,7,,,,,,11,14,138,14,16,993.42,-12,-84,-12,,,,,,,,,17,2,-10,0,0,1,1,16:05:53,15FEB16 
P,PC,,PL,I,1FEB-16,P,In,RP,15-FEB-16,22-FEB-16,7,,,,,,25,5,32,5,5,-29.7,-24,-168,-24,,,,,,,,,520,14,-10,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 
P,PC,,PC,S,15FEB16,P,Su,RP,15-FEB-16,22-FEB-16,7,,,,,,6,5,32,56,5,4.65,0,0,0,,,,,,,,,546,0,0,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 

The code I wrote is:

@echo off
setlocal EnableDelayedExpansion
for /F "delims=" %%a in (C:\Pca.csv) do   (
    set line=%%a
    set line=!line:,,=, ,!
    set line=!line:,,=, ,!
    for /F "tokens=1,2,3* delims=," %%i in (^"!line!^") do (
        echo %%i,%%l>>C:\P.csv
    )
)

But it only removes columns 2 and 3, regardless of whether they are empty or contain data.

The expected output file should look like this:

P,PC,PL,B,15feb16,P,Bay,RP,15-FEB-16,22-FEB-16,7,11,14,138,14,16,993.42,-12,-84,-12,17,2,-10,0,0,1,1,16:05:53,15FEB16 
P,PC,PL,I,1FEB-16,P,In,RP,15-FEB-16,22-FEB-16,7,25,5,32,5,5,-29.7,-24,-168,-24,520,14,-10,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 
P,PC,PC,S,15FEB16,P,Su,RP,15-FEB-16,22-FEB-16,7,6,5,32,56,5,4.65,0,0,0,546,0,0,0,0,1,1,10-MAY-201606:05:53,15-FEB-16 

2 answers:

Answer 0 (score: 0)

Assuming your original CSV looks like this:

id_users,,,quantity,,date
1,,,1,,2013
1,,,1,,2013
2,,,1,,2013

then this single line should do what you ask:

(for /f "tokens=1-3 delims=," %%a in (c:\pca.csv) do echo %%a,%%b,%%c)>c:\p.csv

Result:

id_users,quantity,date
1,1,2013
1,1,2013
2,1,2013

The trick: for /F treats consecutive delimiters as a single delimiter.
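To see this delimiter behaviour in isolation, here is a minimal sketch that parses a single string instead of a file. The sample string is made up for illustration and is not taken from the question's data.

```batch
@echo off
rem Illustration only: the string below is a made-up sample line.
rem for /F collapses each run of commas into one delimiter, so the
rem empty fields vanish and tokens 1-3 are the three non-empty fields.
for /f "tokens=1-3 delims=," %%a in ("1,,,1,,2013") do echo %%a,%%b,%%c
```

Saved as a batch file and run, this should print 1,1,2013. Note that the approach only keeps columns aligned when the same fields are empty in every row, and that the number of tokens has to be known in advance.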

Edit: a different approach, since it turned out that there are more columns than the original question showed.

@echo off
break>c:\p.csv
for /F "delims=" %%a in (c:\pca.csv) do call :shorten "%%a"
goto :eof

:shorten
  set "line=%~1"
:remove
  set "line=%line:,,=,%"
  echo %line%|find ",,">nul && goto :remove
  echo %line%>>c:\p.csv

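break>c:\p.csv: creates the output file (overwriting it if it already exists);
replace two consecutive commas with one;
repeat as long as consecutive commas remain;
write the resulting line to the output file.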
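The repetition is needed because a single substitution pass does not rescan the text it has just replaced. A minimal sketch, using a made-up line with four consecutive commas:

```batch
@echo off
setlocal
rem Illustration only: a made-up line with four consecutive commas.
set "line=a,,,,b"
rem One pass replaces each non-overlapping pair, so one pair is left over.
set "line=%line:,,=,%"
echo After one pass:   %line%
rem A second pass removes the remaining pair.
set "line=%line:,,=,%"
echo After two passes: %line%
endlocal
```

The first echo should show a,,b and the second a,b. Keep in mind that this technique drops every empty field, so columns stay aligned only if the same fields are empty in all rows.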

Answer 1 (score: 0)

Here is a fairly comprehensive and adaptive script that removes empty columns from CSV-formatted data.

Before showing the code, let us look at the help message that is displayed when the script is called with /?:

"del-empty-cols-from-csv.bat"

This script removes any empty columns from CSV-formatted data. A column is con-
sidered as empty if the related fields in all rows are empty, unless the switch
/H is given, in which case the first line (so the header) is evaluated only.
Notice that fields containing white-spaces only are not considered as empty.


USAGE:

  del-empty-cols-from-csv.bat [/?] [/H] csv_in [csv_out]

    /?      displays this help message;
    /H      specifies to regard the header only, that is the very first row,
            to determine which columns are considered as empty; if NOT given,
            the whole data, hence all rows, are taken into account instead;
    csv_in  CSV data file to process, that is, to remove empty columns of;
            these data must be correctly formatted CSV data, using the comma as
            separator and the quotation mark as text delimiter; regard that
            literal quotation marks must be doubled; there are some additional
            restrictions: the data must not contain any line-breaks; neither
            must they contain any asterisks nor question marks;
    csv_out CSV data file to write the return data to; this must not be equal
            to csv_in; note that an already existing file will be overwritten
            without prompt; if not given, the data is displayed on the console;

As you can read, there are two modes of operation: standard mode (no switch) and header mode (switch /H).

Given that the following CSV data is fed into the script...:

A, ,C, ,E,F
1, , ,4,5, 
1, , , ,5, 
1, ,3,4, , 

...the CSV data returned in standard mode looks like this...:

A,C, ,E,F
1, ,4,5, 
1, , ,5, 
1,3,4, , 

...and the CSV data returned in header mode (/H) looks like this:

A,C,E,F
1, ,5, 
1, ,5, 
1,3, , 

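Keep in mind that the spaces in the sample data above must not actually be present in the files; they have only been inserted here to better illustrate the two modes of operation. (As the help text states, a field containing only white-space does not count as empty.)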
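Going by the usage text of the script, the two modes could be invoked from a command prompt roughly as follows. The file names in.csv and out.csv are placeholders chosen for illustration.

```batch
rem Standard mode: a column is dropped only if it is empty in every row.
del-empty-cols-from-csv.bat in.csv out.csv

rem Header mode: only the first row decides which columns are dropped.
del-empty-cols-from-csv.bat /H in.csv out.csv

rem Without csv_out, the result is displayed on the console.
del-empty-cols-from-csv.bat in.csv
```

When calling the script from inside another batch file, prefix each line with call so that control returns to the calling script afterwards.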

Now, here is the complete code:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

set "OPT_HEAD=%~1"
if "%OPT_HEAD%"=="/?" (
    goto :MSG_HELP
) else if /I "%OPT_HEAD%"=="/H" (
    shift
) else if "%OPT_HEAD:~,1%"=="/" (
    set "OPT_HEAD="
    shift
) else set "OPT_HEAD="

set "CSV_IN=%~1"
if not defined CSV_IN (
    >&2 echo ERROR:  no input file specified!
    exit /B 1
)
set "CSV_OUT=%~2"
if not defined CSV_OUT set "CSV_OUT=con"

for /F "delims==" %%V in ('2^> nul set CELL[') do set "%%V="
setlocal EnableDelayedExpansion
if not defined OPT_HEAD (
    for /F %%C in ('^< "!CSV_IN!" find /C /V ""') do set "NUM=%%C"
) else set /A NUM=1
set /A LIMIT=0
< "!CSV_IN!" (
    for /L %%L in (1,1,%NUM%) do (
        set /P "LINE="
        call :PROCESS LINE LINE || exit /B !ErrorLevel!
        set /A COUNT=0
        for %%C in (!LINE!) do (
            set /A COUNT+=1
            if not defined CELL[!COUNT!] set "CELL[!COUNT!]=%%~C"
            if !LIMIT! LSS !COUNT! set /A LIMIT=COUNT
        )
    )
)
set "PAD=" & for /L %%I in (2,1,!LIMIT!) do set "PAD=!PAD!,"
> "!CSV_OUT!" (
    for /F usebackq^ delims^=^ eol^= %%L in ("!CSV_IN!") do (
        setlocal DisableDelayedExpansion
        set "LINE=%%L%PAD%"
        set "ROW="
        set /A COUNT=0
        setlocal EnableDelayedExpansion
        call :PROCESS LINE LINE || exit /B !ErrorLevel!
        for %%C in (!LINE!) do (
            endlocal
            set "CELL=%%C"
            set /A COUNT+=1
            setlocal EnableDelayedExpansion
            if !COUNT! LEQ !LIMIT! (
                if defined CELL[!COUNT!] (
                    for /F delims^=^ eol^= %%R in ("!ROW!,!CELL!") do (
                        endlocal
                        set "ROW=%%R"
                    )
                ) else (
                    endlocal
                )
            ) else (
                endlocal
            )


            setlocal EnableDelayedExpansion
        )
        if defined ROW set "ROW=!ROW:~1!"
        call :RESTORE ROW ROW || exit /B !ErrorLevel!
        echo(!ROW!
        endlocal
        endlocal
    )
)
endlocal

endlocal
exit /B


:PROCESS  var_return  var_string
set "STRING=!%~2!"
if defined STRING (
    set "STRING="!STRING:,=","!""
    if not "!STRING!"=="!STRING:**=!" goto :ERR_CHAR
    if not "!STRING!"=="!STRING:*?=!" goto :ERR_CHAR
)
set "%~1=!STRING!"
exit /B


:RESTORE  var_return  var_string
set "STRING=!%~2!"
if "!STRING:~,1!"==^""" set "STRING=!STRING:~1!"
if "!STRING:~-1!"==""^" set "STRING=!STRING:~,-1!"
if defined STRING (
    set "STRING=!STRING:","=,!"
)
set "%~1=!STRING!"
exit /B


:ERR_CHAR
endlocal
>&2 echo ERROR:  `*` and `?` are not allowed!
exit /B 1


:MSG_HELP
echo(
echo("%~nx0"
echo(
echo(This script removes any empty columns from CSV-formatted data. A column is con-
echo(sidered as empty if the related fields in all rows are empty, unless the switch
echo(/H is given, in which case the first line ^(so the header^) is evaluated only.
echo(Notice that fields containing white-spaces only are not considered as empty.
echo(
echo(
echo(USAGE:
echo(
echo(  %~nx0 [/?] [/H] csv_in [csv_out]
echo(
echo(    /?      displays this help message;
echo(    /H      specifies to regard the header only, that is the very first row,
echo(            to determine which columns are considered as empty; if NOT given,
echo(            the whole data, hence all rows, are taken into account instead;
echo(    csv_in  CSV data file to process, that is, to remove empty columns of;
echo(            these data must be correctly formatted CSV data, using the comma as
echo(            separator and the quotation mark as text delimiter; regard that
echo(            literal quotation marks must be doubled; there are some additional
echo(            restrictions: the data must not contain any line-breaks; neither
echo(            must they contain any asterisks nor question marks;
echo(    csv_out CSV data file to write the return data to; this must not be equal
echo(            to csv_in; note that an already existing file will be overwritten
echo(            without prompt; if not given, the data is displayed on the console;
echo(
exit /B