Manipulating a CSV file with awk or sed

Time: 2011-03-10 23:45:25

Tags: perl unix sed awk

"a004-1b","North","at006754"
"a004-1c","south","atytgh0"
"a004-1d","east","atrthh"
"a010-1a","midwest","atyu"
"a010-1b","south","rfg67"

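I want to print the first and second columns without any extra characters. I want to eliminate everything else (the double quotes and the third column). Thanks in advance.

6 answers:

Answer 0 (score: 4)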

awk -F'^"|","|"$' '{print $2,$3}' ./infile.csv

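The script above can even handle fields with embedded double quotes or commas. The only drawback (if you can call it that) is that the first field starts at $2.

Proof of concept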

$ awk -F'^"|","|"$' '{print $2,$3}' ./infile.csv
a004-1b North
a004-1c south
a010-1a midwest
a010-1b south
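To illustrate the claim about embedded commas, here is a minimal sketch that runs the same field-separator regex over one made-up record whose second field contains a comma. The function name `first_two_columns` and the sample record are assumptions for illustration and are not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: same FS regex as the answer above.
# The separators are: a leading quote, the sequence ",", or a trailing quote.
# Because the record starts with a separator, $1 is empty and the data starts at $2.
first_two_columns() {
    awk -F'^"|","|"$' '{ print $2, $3 }'
}

# Made-up record: the second field contains an embedded comma.
printf '%s\n' '"a004-1b","North, upper","at006754"' | first_two_columns
```

A field that itself contains the exact sequence `","` would still be split, so this stays a heuristic rather than a full CSV parser.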

Answer 1 (score: 2)

You need GNU Awk 4 for this:

$ gawk -vFPAT='[^",]+' '{print $1,$2}'

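I love this new "field pattern" feature. It's my new hammer, and everything is a nail. Read about it at http://www.gnu.org/software/gawk/manual/html_node/Splitting-By-Content.html

(Written this way, it does not account for embedded commas or quotes, because the question implies that this is not needed.)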
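If embedded commas did matter, a possible variant is sketched below. It assumes gawk 4 or later is installed and uses an `FPAT` along the lines of the pattern discussed on the linked gawk manual page: a field is either a run of non-comma characters or a double-quoted string. The function name `first_two_columns_fpat` and the sample record are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, assuming gawk >= 4 (FPAT is a gawk extension).
# FPAT describes what a field looks like instead of what separates fields.
first_two_columns_fpat() {
    gawk -v FPAT='([^,]+)|("[^"]+")' '{
        a = $1; b = $2
        gsub(/^"|"$/, "", a)   # strip the surrounding quotes from field 1
        gsub(/^"|"$/, "", b)   # strip the surrounding quotes from field 2
        print a, b
    }'
}

# Only run the demo when gawk is actually available.
if command -v gawk >/dev/null 2>&1; then
    printf '%s\n' '"a004-1b","North, upper","at006754"' | first_two_columns_fpat
fi
```

This sketch does not try to handle empty fields or doubled quotes inside a field; a real CSV parser is the safer choice for such input.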

Answer 2 (score: 0)

If you want it in "pure" awk or sed, this won't fit the bill, but otherwise it works:

awk -F, '{print $1 " " $2}' | tr -d '"'

Answer 3 (score: 0)

If you are using awk, why did you put a Perl tag on it?

Perl:

#!/usr/bin/env perl

use strict;
use warnings;

use Data::Dumper;

# Make Data::Dumper pretty
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;

# Set maximum depth for Data::Dumper, zero means unlimited
local $Data::Dumper::Maxdepth = 0;

use Text::CSV;

my $csv = Text::CSV->new();
while( my $row = $csv->getline( \*DATA )){
  print 'row: ', Dumper $row;
}

__DATA__
"a004-1b","North","at006754"
"a004-1c","south","atytgh0""a004-1d","east","atrthh"
"a010-1a","midwest","atyu"
"a010-1b","south","rfg67"

Answer 4 (score: 0)

awk -F'[",]' '{print $2,$5}' sample

Answer 5 (score: 0)

Without handling embedded double quotes:

sed -e 's/^"\([^"]*\)","\([^"]*\)".*/\1 \2/'

Handling them:

sed -n -e 's/^"//;s/"$//;s/","/ /;s/","/\n/;P'

The above works even for input with only 1 or 2 fields.
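As a rough check of that last claim, the sketch below feeds the second sed command three made-up records: one with doubled quotes inside the first field, one with two fields, and one with a single field. It assumes GNU sed, since `\n` in the replacement part of `s` is not portable to every sed. The function name `first_two_columns_sed` and the sample records are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper wrapping the second sed command (GNU sed assumed).
# Steps: drop the leading quote, drop the trailing quote, turn the first ","
# into a space, turn the second "," into a newline, then P prints only the
# text up to that newline (or the whole line if there is no newline).
first_two_columns_sed() {
    sed -n -e 's/^"//;s/"$//;s/","/ /;s/","/\n/;P'
}

# Made-up records: embedded doubled quotes, two fields, one field.
printf '%s\n' \
    '"a ""big"" one","North","x"' \
    '"a","b"' \
    '"only"' | first_two_columns_sed
```

The doubled quotes inside the first field are passed through unchanged; the command only removes the quotes that delimit the fields.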