Input:
20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30
Desired output:
20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.224.213/30
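How do I get rid of the commas between the quotes? Some lines also have quoted fields with no comma in them.
I need to remove the comma inside "JUDICIARY, STATE COURTS (STATE COURTS)"
(it appears twice on a single line).
Some lines have two commas in a field between double quotes.

Answer 0 (score: 1)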
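Here is a script that demonstrates how to do it; welcome to the world of goto in sed. It was written with BSD sed, which uses -E to enable extended regular expressions; GNU sed uses -r for the same job.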
sed -E -e 's/^/A: /p; s/^A: /B: /' \
       -e ':again' \
       -e 's/^(([^"]*|"[^",]*")*)("[^"]*),([^"]*")/\1\3\4/' \
       -e 't again' \
       data
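This assumes the data is in a file called data. The first -e simply echoes the original input prefixed with A:, then changes the prefix to B:. That part is debugging material. The second -e creates a label, again, that can be jumped to. The fourth -e jumps back to the again label if the preceding step made a substitution.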
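All the excitement is in the third -e. The pattern looks for the start of the line, followed by zero or more occurrences of either "a run of characters that are not double quotes" or "a double quote, zero or more characters that are neither a double quote nor a comma, and a closing double quote". After that prefix it expects a double quote, a run of non-double-quote characters, a comma, more non-double-quote characters, and a double quote. The whole match is replaced by the prefix, the part of the quoted field before the comma, and the part of the quoted field after the comma. Each pass removes one comma, which is why the script loops until no substitution is made.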
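For readers who find the sed loop hard to follow, the same idea can be sketched in Python. This is only an illustration, not part of the original script: the function name strip_quoted_commas is made up, and the pattern is slightly altered (the prefix consumes one unquoted character at a time) to keep backtracking in Python's re module modest.

```python
import re

# Prefix: unquoted characters and comma-free quoted fields.
# Then: a quoted field split around one comma (groups 2 and 3).
# Note: this is an adaptation of the sed pattern, not a literal copy.
PATTERN = re.compile(r'^((?:[^"]|"[^",]*")*)("[^"]*),([^"]*")')


def strip_quoted_commas(line):
    """Remove commas inside double-quoted fields, one comma per pass."""
    while True:
        # count=1 mirrors a single sed s/// without the g flag.
        line, count = PATTERN.subn(r"\1\2\3", line, count=1)
        if count == 0:
            # Nothing was replaced: the equivalent of 't again' not branching.
            return line


print(strip_quoted_commas('2000,"xx,xx,xx",192.16.3.2'))
# prints: 2000,"xxxxxx",192.16.3.2
```

As in the sed version, lines whose quoted fields contain no commas pass through unchanged, and the quotes themselves are preserved.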
Given a data file:
2000,"xxxx,xxxx",192.168.3.2
2000,"xx,xx,xx",192.16.3.2
2000,"xxxxxxxx",192.168.3.2
20000000,"xxxxxxxxxxxx,xxxxxxxxxxxx",192.168.3.2,"yyyyy,yyyyy"
20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
201,"x,x",192.168.3.2,"y,y","aaaa,cccc,dddd",192,"zzzz",234
201,"x,x",192.168.3.2,"yyy"
201,"xx",192.168.3.2,"yyy",2211
201,"xxx",192.168.3.2,"y,y"
201,"xxx",192.168.3.2,"yyy"
201,"x,x",192.168.3.2,"y,y"
Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30
The script generates the output:
A: 2000,"xxxx,xxxx",192.168.3.2
B: 2000,"xxxxxxxx",192.168.3.2
A: 2000,"xx,xx,xx",192.16.3.2
B: 2000,"xxxxxx",192.16.3.2
A: 2000,"xxxxxxxx",192.168.3.2
B: 2000,"xxxxxxxx",192.168.3.2
A: 20000000,"xxxxxxxxxxxx,xxxxxxxxxxxx",192.168.3.2,"yyyyy,yyyyy"
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2,"yyyyyyyyyy"
A: 20000000,"xxxxxxxxxxxxx,xxxxxxxxxxx",192.168.3.2
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
A: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
B: 20000000,"xxxxxxxxxxxxxxxxxxxxxxxx",192.168.3.2
A: 201,"x,x",192.168.3.2,"y,y","aaaa,cccc,dddd",192,"zzzz",234
B: 201,"xx",192.168.3.2,"yy","aaaaccccdddd",192,"zzzz",234
A: 201,"x,x",192.168.3.2,"yyy"
B: 201,"xx",192.168.3.2,"yyy"
A: 201,"xx",192.168.3.2,"yyy",2211
B: 201,"xx",192.168.3.2,"yyy",2211
A: 201,"xxx",192.168.3.2,"y,y"
B: 201,"xxx",192.168.3.2,"yy"
A: 201,"xxx",192.168.3.2,"yyy"
B: 201,"xxx",192.168.3.2,"yyy"
A: 201,"x,x",192.168.3.2,"y,y"
B: 201,"xx",192.168.3.2,"yy"
A: Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY, STATE COURTS (STATE COURTS)",112.78.224.213/30
B: Exchange subsidary,Passed,00021423SNG,R-JAM-05-03,US (First Exchange),20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.212.12/30,00052312SNG,R-JPODIU-023-07,US (First Exchange) ,20000000,"JUDICIARY STATE COURTS (STATE COURTS)",112.78.224.213/30
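Note that this is hard. If you have the option, use a tool that understands the CSV format. For example, Python comes with a csv module; Perl has Text::CSV (and its auxiliary modules Text::CSV_PP and Text::CSV_XS), which can handle this; and there are some custom tools for processing CSV files.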
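As a rough sketch of the CSV-aware route, the following uses Python's standard csv module. The function name remove_commas_in_fields is hypothetical, and the example works on in-memory text rather than a real file to stay self-contained.

```python
import csv
import io


def remove_commas_in_fields(text):
    """Parse CSV text, delete commas inside every field, and re-serialize."""
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # The reader has already split on the real separators, so any comma
        # still present in a field came from inside a quoted field.
        writer.writerow([field.replace(",", "") for field in row])
    return out.getvalue()


print(remove_commas_in_fields('201,"x,x",192.168.3.2,"yyy"\n'), end="")
# prints: 201,xx,192.168.3.2,yyy
```

One difference from the sed script: the csv writer only quotes fields that need it, so the double quotes around the cleaned fields are dropped. If the original quoting must be kept exactly, the sed approach (or a custom quoting policy) is the better fit.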
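Also note that Microsoft supports a notation that differs slightly from RFC 4180, which is the Internet world's attempt to rationalize what Microsoft uses (to a first approximation).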