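Removing special characters from a CSV file using a batch file

Time: 2018-12-24 21:55:25

Tags: csv batch-file special-characters

I have a CSV file with 18 fields. I have written a batch file to process the data. Everything works except removing the comma from the publisher "DEVILS DUE /1FIRST COMICS, LLC". That field does not parse correctly. I have tried looking at other batch files as examples, but I am not familiar with the syntax.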

@echo off & Setlocal EnableDelayedExpansion
( FOR /f "tokens=1-18 delims=," %%A in ('More +4 datatest.csv') do (
rem H is the department code
rem S is the sales tax code
rem Q is the publisher code
    Set "H=%%H"
    Set "S=T"
    Set "Q=%%Q"
    if "%%Q"=="BOOM! STUDIOS" Set "Q=BOOM STUDIOS"
    if "%%Q"=="DEVILS DUE /1FIRST COMICS, LLC" Set "Q=DEVILS DUE"
    if "%%H"=="1" Set "H=1005" 
    if "%%H"=="1" Set "S=N"
    if "%%H"=="2" Set "H=1009" 
    if "%%H"=="2" Set "S=N"
    if "%%H"=="3" Set "H=1008"
    if "%%H"=="4" Set "H=1002"
    if "%%H"=="5" Set "H=1006"
    if "%%H"=="6" Set "H=1003"
    if "%%H"=="7" Set "H=1011"
    if "%%H"=="8" Set "H=1011"
    if "%%H"=="9" Set "H=1004"
    if "%%H"=="10" Set "H=1016"
    if "%%H"=="11" Set "H=1015"
    if "%%H"=="12" Set "H=1015"
    if "%%H"=="13" Set "H=1011"
    if "%%H"=="14" Set "H=1009" 
    if "%%H"=="14" Set "S=N"
    if "%%H"=="15" Set "H=1013"  
    if "%%H"=="16" Set "H=1017"
    echo "",%%~M,%%~N,%%~L,"","","","","",!H!,"","",ITEM,"","",%%~D,%%Q,"","",%%E,"",%%E,"","","","","","","","","","","","","","",%%A,"","","","","",!S!,N,"","",DIAMOND,%%B,"",""
  )
)>paygoinvoice.csv
@echo on

2 Answers:

Answer 0: (score: 1)

The problem you are running into is that delims=, in the FOR loop treats ANY comma as a field separator, including the one inside DEVILS DUE /1FIRST COMICS, LLC.

Combined with tokens=, the name ends up split across two tokens, e.g. %%H = DEVILS DUE /1FIRST COMICS and %%I = LLC.

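A quick and dirty fix (ahem), as far as I know, is to change every ", " into something else before feeding the file to the main routine. In my example I used 1Comma1. That changes your IF comparison to DEVILS DUE /1FIRST COMICS1Comma1 LLC.

Fixed.bat: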

@echo off & Setlocal EnableDelayedExpansion

Rem | Replace all ", " with "1Comma1"
for /f "tokens=1,* delims=¶" %%A in ('"type datatest.csv"') do (
    SET string=%%A
    setlocal EnableDelayedExpansion
    SET modified=!string:, =1Comma1 !

    >> datatest.csv.TEMP echo(!modified!
    endlocal
)

Rem | Main .CSV Edit Function
( FOR /f "tokens=1-8* delims=," %%A in ('More +4 datatest.csv.TEMP') do (
    Set "ItemData=%%H"
    if "%%H"=="1" Set "ItemData=1005"
    if "%%H"=="3" Set "ItemData=1008"
    if "%%H"=="BOOM STUDIOS" Set "ItemData=NEW STUDIOS"
    if "%%H"=="DEVILS DUE /1FIRST COMICS1Comma1 LLC" Set "ItemData=DEVILS DUE"

    echo %%A,%%B,%%C,%%D,%%E,%%F,%%G,!ItemData!,%%I
  )
)>paygoinvoice.txt
del datatest.csv.TEMP

@echo on

PS: The code I used in the example above was taken from your most recent post on this topic. Just add the new code where it belongs.

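Also keep in mind that with EnableDelayedExpansion active, any ! is automatically stripped from values handled in a FOR loop or IF statement, which is why BOOM! STUDIOS shows up as BOOM STUDIOS.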
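
To see that last point in isolation, here is a minimal sketch (the sample string is only for illustration, it is not read from your file). It echoes the same FOR variable once with delayed expansion off and once with it on; the second echo should come out without the exclamation mark.

```batch
@echo off
setlocal DisableDelayedExpansion

for /f "delims=" %%A in ("BOOM! STUDIOS") do (
    rem Delayed expansion is off here, so the exclamation mark is kept.
    echo Disabled: %%A

    rem Switch it on for one command: a lone exclamation mark is now dropped.
    setlocal EnableDelayedExpansion
    echo Enabled:  %%A
    endlocal
)
```

If the exclamation mark has to survive, one option is to keep delayed expansion disabled while the FOR variable is copied into a normal variable, and to enable it only afterwards.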

Answer 1: (score: 0)

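Since you never showed a real sample input file, it is difficult to help.

The problems with for /f parsing CSV files are:

  1. It does not respect double-quoted fields, so commas inside quotes are also treated as delimiters.
  2. It treats adjacent delimiters as a single one and ignores leading delimiters.

So the first problem applies here; whether the second does is unknown.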
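
Both behaviours can be checked with a short sketch. The sample strings are made up for illustration and are not taken from the asker's file. In the first loop, token 2 should end at COMICS and token 3 should start with LLC; in the second, the empty field should simply disappear, so c moves into the second token.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical sample line: the second field contains a comma inside quotes.
set "line="ColA","DEVILS DUE /1FIRST COMICS, LLC","ColR""

rem Problem 1: the quoted field is split at its embedded comma.
for /f "tokens=1-4 delims=," %%A in ("!line!") do (
    echo 1=[%%A]
    echo 2=[%%B]
    echo 3=[%%C]
    echo 4=[%%D]
)

rem Problem 2: the empty field between the two adjacent commas is skipped.
for /f "tokens=1-3 delims=," %%A in ("a,,c,d") do echo [%%A] [%%B] [%%C]
```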

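One workaround is to capture the problematic field as part of the remainder of the line, pass it with its quotes as arguments to a subroutine, where quoting is respected, and process it there.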
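
A minimal sketch of that idea, using a hypothetical :ShowArgs label and a made-up remainder string: CALL splits its arguments at unquoted commas and spaces but leaves quoted text intact, so the publisher should arrive whole as the first argument.

```batch
@echo off
setlocal

rem Hypothetical remainder of a CSV line: the last two fields, still quoted.
set "rest="DEVILS DUE /1FIRST COMICS, LLC","ColR""

call :ShowArgs %rest%
exit /b

:ShowArgs
rem The tilde modifier strips the surrounding quotes from each argument.
echo 1=[%~1]
echo 2=[%~2]
exit /b
```

The comma inside the quotes stays part of the first argument, while the unquoted comma between the two fields separates the arguments.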

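To simplify handling the code values, there is a technique for expanding a list into an array; see how it is implemented for DepCode and STaxCode in the batch below (as I hinted in my answer to your previous question):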

@echo off & Setlocal EnableDelayedExpansion

:: Build array DepCode[1..16]
Set i=0&Set "DepCode=,1005,1009,1008,1002,1006,1003,1011,1011,1004,1016,1015,1015,1011,1009,1013,1017"
Set "DepCode=%DepCode:,="&Set /a i+=1&Set "DepCode[!i!]=%"
:: Set DepCode

:: Build array STaxCode[1..16]
Set i=0&Set "STaxCode=,N,N,S,S,S,S,S,S,S,S,S,S,S,N,S,S"
Set "STaxCode=%STaxCode:,="&Set /a i+=1&Set "STaxCode[!i!]=%"
:: Set STaxCode

( FOR /f "tokens=1-16* delims=," %%A in ('More +4 SO_53917950.csv') do (
    rem H is the department code
    Set "H=!DepCode[%%~H]!"
    rem S is the sales tax code
    Set "S=!STaxCode[%%~H]!"
    rem Q is the publisher code 17th field and 18th field 
    Call :RemoveComma %%Q 

rem echo "",%%~M,%%~N,%%~L,"","","","","",!H!,"","",ITEM,"","",%%~D,"!PubCode!","","",%%E,"",%%E,"","","","","","","","","","","","","","",%%A,"","","","","",!S!,N,"","",DIAMOND,%%B,"",""
    echo "%%~A","%%~B","%%~C","%%~D","%%~E","%%~F","%%~G","!H!","%%~I","%%~J","%%~K","%%~L","%%~M","%%~N","%%~O","%%~P","!PubCode!","!R!","!S!"

  )
)>paygoinvoice.csv
Goto :Eof

:RemoveComma
Set "R=%~2"
:: remove comma from field
::Set "PubCode=%PubCode:,= %"

:: split field at first comma or slash/backslash
for /f "delims=,/\" %%a in (%1) do Set "PubCode=%%a" 

With this constructed input file SO_53917950.csv:

first  line to remove
second line to remove
third  line to remove
fourth line to remove
"HeadA","HeadB","HeadC","HeadD","HeadE","HeadF","HeadG","HeadH","HeadI","HeadJ","HeadK","HeadL","HeadM","HeadN","HeadO","HeadP","HeadQ","HeadR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","2","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","3","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","4","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","5","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","6","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","7","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","8","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","9","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE /1FIRST COMICS, LLC","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","10","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","11","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","12","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","13","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","14","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","15","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","16","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM! STUDIOS","ColR"

it produces the following output (processing is noticeably slower because of the extra call):

"HeadA","HeadB","HeadC","HeadD","HeadE","HeadF","HeadG","","HeadI","HeadJ","HeadK","HeadL","HeadM","HeadN","HeadO","HeadP","HeadQ","HeadR",""
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1005","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1009","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1008","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1002","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1006","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1003","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1004","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","DEVILS DUE ","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1016","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1015","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1015","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1011","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1009","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","N"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1013","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
"ColA","ColB","ColC","ColD","ColE","ColF","ColG","1017","ColI","ColJ","ColK","ColL","ColM","ColN","ColO","ColP","BOOM STUDIOS","ColR","S"
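
For anyone puzzled by the two "Build array" lines in the batch, here is the same trick in isolation, as a sketch with a made-up Color list. The substitution replaces every comma with text that closes the current SET, increments i, and opens a new SET for the next element, so a single line expands into a chain of assignments.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical list; the leading comma makes the first element get index 1.
set i=0
set "Color=,red,green,blue"

rem Each comma is replaced by: close quote, increment i, start the next assignment.
set "Color=%Color:,="&set /a i+=1&set "Color[!i!]=%"

rem List every variable whose name starts with Color[
set Color[
```

After the expansion the list variable itself is empty, so the final SET should show only the three indexed elements.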