How to ... in Unix

Date: 2016-04-13 02:00:52

Tags: linux unix command

Input:

20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30

Desired result:

20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2     
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.224.213/30

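How do I get rid of the commas between the quotes? There are also lines with no comma between the quotes.

I need to remove the comma inside "JUDICIARY, STATE COURTS (STATE COURTS)" (it appears twice on the same line).

Some lines have two commas in a field between double quotes.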

1 answer:

Answer 0 (score: 1)

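Here is a script that demonstrates how to do it; welcome to the world of goto in sed. It was written with BSD sed, which uses -E to enable extended regular expressions; GNU sed uses -r for the same job (recent versions of GNU sed accept -E as well).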

sed -E -e 's/^/A: /p; s/^A: /B: /' \
       -e ':again' \
       -e 's/^(([^"]*|"[^",]*")*)("[^"]*),([^"]*")/\1\3\4/' \
       -e 't again' \
       data

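This assumes the data is in a file called data. The first -e simply echoes the original input prefixed with A: and then changes the prefix to B:. That is debugging material. The second -e creates a label, again, that can be jumped to. The fourth -e jumps back to the again label if the preceding step made a substitution.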

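All the excitement is in the third -e. The pattern looks for the start of the line, followed by zero or more occurrences of either 'a run of non-double-quote characters' or 'a double quote, zero or more characters that are neither a double quote nor a comma, and a double quote'. That prefix is followed by a double quote, a run of non-double-quote characters, a comma, more non-double-quote characters, and a double quote. The whole match is replaced by the prefix, the part of the quoted field before the comma, and the part of the quoted field after the comma. Each pass therefore removes one comma from the first quoted field that still contains one, and the loop repeats until no quoted field contains a comma.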
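The substitute-and-branch loop described above can be sketched in Python to make the control flow explicit. This is only an illustration under some assumptions, not the answer's method: the function name strip_quoted_commas is made up for this sketch, and the pattern is slightly adapted (the prefix alternation consumes a single non-quote character instead of a run) because Python's backtracking regex engine can become extremely slow on the nested-repetition form that sed handles without trouble.

```python
import re

# Adapted from the third -e expression:
#   group 1: prefix made of plain characters and complete quoted fields
#            that contain no comma
#   group 2: opening quote plus the text before a comma
#   group 3: the text after that comma plus the closing quote
PATTERN = re.compile(r'^((?:[^"]|"[^",]*")*)("[^"]*),([^"]*")')


def strip_quoted_commas(line: str) -> str:
    """Remove commas inside double-quoted fields, one comma per pass."""
    while True:
        # Equivalent of the s/// command: drop one comma.
        line, count = PATTERN.subn(r"\1\2\3", line, count=1)
        # Equivalent of 't again': loop only if a substitution happened.
        if count == 0:
            return line


if __name__ == "__main__":
    print(strip_quoted_commas('2000,"xx,xx,xx",192.16.3.2'))
```

Like the sed script, this treats every double quote as a field delimiter, so it does not handle doubled quotes ("") inside a quoted field.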

Given a data file:

2000,"xxxx,xxxx",192.168.3.2
2000,"xx,xx,xx",192.16.3.2
2000,"xxxxxxxx",192.168.3.2
20000000,"xxxxxxxxxxxx,xxxxxxxxxxxx",192.168.3.2,"yyyyy,yyyyy"
20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
201,"x,x",192.168.3.2,"y,y","aaaa,cccc,dddd",192,"zzzz",234
201,"x,x",192.168.3.2,"yyy"
201,"xx",192.168.3.2,"yyy",2211
201,"xxx",192.168.3.2,"y,y"
201,"xxx",192.168.3.2,"yyy"
201,"x,x",192.168.3.2,"y,y"
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30 

The script produces the output:

A: 2000,"xxxx,xxxx",192.168.3.2
B: 2000,"xxxxxxxx",192.168.3.2
A: 2000,"xx,xx,xx",192.16.3.2
B: 2000,"xxxxxx",192.16.3.2
A: 2000,"xxxxxxxx",192.168.3.2
B: 2000,"xxxxxxxx",192.168.3.2
A: 20000000,"xxxxxxxxxxxx,xxxxxxxxxxxx",192.168.3.2,"yyyyy,yyyyy"
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2,"yyyyyyyyyy"
A: 20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
A: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
A: 201,"x,x",192.168.3.2,"y,y","aaaa,cccc,dddd",192,"zzzz",234
B: 201,"xx",192.168.3.2,"yy","aaaaccccdddd",192,"zzzz",234
A: 201,"x,x",192.168.3.2,"yyy"
B: 201,"xx",192.168.3.2,"yyy"
A: 201,"xx",192.168.3.2,"yyy",2211
B: 201,"xx",192.168.3.2,"yyy",2211
A: 201,"xxx",192.168.3.2,"y,y"
B: 201,"xxx",192.168.3.2,"yy"
A: 201,"xxx",192.168.3.2,"yyy"
B: 201,"xxx",192.168.3.2,"yyy"
A: 201,"x,x",192.168.3.2,"y,y"
B: 201,"xx",192.168.3.2,"yy"
A: Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30 
B: Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.224.213/30 

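Please note: this is hard. If you have the option, use a tool that understands the CSV format. For example, Python comes with a csv module; Perl has Text::CSV (and its helper modules Text::CSV_PP and Text::CSV_XS), which can handle this; and there are dedicated tools for processing CSV files.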
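As a rough illustration of the CSV-aware route, here is a minimal sketch using Python's standard csv module. The function name strip_commas_in_fields is hypothetical, and the sketch assumes the input is ordinary comma-delimited, double-quoted CSV. One difference from the desired result in the question: once the commas are gone the fields no longer need quoting, so the writer's default minimal quoting emits them without quotes. The data is equivalent CSV, but the output is not byte-for-byte what the sed script produces.

```python
import csv
import io


def strip_commas_in_fields(text: str) -> str:
    """Parse CSV text, delete commas inside every field, and re-emit it."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    # QUOTE_MINIMAL (the default) quotes a field only when it needs it.
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        # The reader has already split on the real delimiters, so any
        # comma left inside a field came from a quoted value.
        writer.writerow([field.replace(",", "") for field in row])
    return out.getvalue()


if __name__ == "__main__":
    sample = '2000,"xx,xx,xx",192.16.3.2\n'
    print(strip_commas_in_fields(sample), end="")
```

Unlike the regex approach, the parser also copes with doubled quotes inside quoted fields.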

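Note also that RFC 4180 is, to a first approximation, the internet world's attempt to rationalize what Microsoft uses, and that Microsoft supports a notation slightly different from the RFC.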