How to parse data into unique columns in a CSV file

Time: 2018-02-09 20:49:18

Tags: perl csv awk

I have a CSV file with thousands of columns. For example, say the first column is the server name and the remaining columns are its open ports.

For example:

SERVER1,22,25,110,3389,etc
SERVER2,22,110,3389,45001,etc
SERVER3,3389,45001,etc

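I am trying to find a way, using any command-line tool, to process this so that each distinct value gets its own column and the above becomes: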

SERVER1,22,25,110,3389,,etc
SERVER2,22,,110,3389,45001,etc
SERVER3,,,,3389,45001,etc

Any ideas are appreciated. Thanks!

3 answers:

Answer 0: (score: 1)

You can try this awk:

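awk -F, '
# First pass: collect every distinct value, remembering first-seen order
# (the order of "for (j in a)" is unspecified, so it cannot define the columns).
NR==FNR{
  for(i=2;i<=NF;i++)
    if(!($i in a)) { a[$i]; order[++n]=$i }
  next
}
# Second pass: print each known value in its own column, or an empty field.
{
  split("", seen)            # portable way to clear the per-line set
  for(i=2;i<=NF;i++)
    seen[$i]
  b=$1
  for(k=1;k<=n;k++)
    b=b FS (order[k] in seen ? order[k] : "")
  print b
}
' infile infile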

Answer 1: (score: 1)

Using GNU awk for sorted_in:

$ cat tst.awk
BEGIN {
    FS=OFS=","
    PROCINFO["sorted_in"] = "@ind_num_asc"
}
NR==FNR {
    for (i=2; i<=NF; i++) {
        allVals[$i]
    }
    next
}
{
    delete curVals
    for (i=2; i<=NF; i++) {
        curVals[$i]
    }
    printf "%s", $1
    for (i in allVals) {
        printf "%s%s", OFS, (i in curVals ? i : "")
    }
    print ""
}

$ awk -f tst.awk file file
SERVER1,etc,22,25,110,3389,
SERVER2,etc,22,,110,3389,45001
SERVER3,etc,,,,3389,45001
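
The same two-pass idea (collect the union of all values, then print each value or an empty field per row) can be sketched in Python, since the question allows any tool. This is a minimal sketch, not part of the awk answer: `spread_columns` and `numeric_or_zero` are hypothetical names, and the sort key is only assumed to approximate what "@ind_num_asc" does with a non-numeric index such as "etc".

```python
import csv
import io


def numeric_or_zero(value):
    """Sort key: integers by value, anything non-numeric as 0 (assumption: mimics @ind_num_asc)."""
    try:
        number = int(value)
    except ValueError:
        number = 0
    # The raw string breaks ties so the order is deterministic.
    return (number, value)


def spread_columns(text):
    """Give every distinct value its own column; values a row lacks become empty fields."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # Pass 1: union of all values that follow the server name.
    all_vals = set()
    for row in rows:
        all_vals.update(row[1:])
    columns = sorted(all_vals, key=numeric_or_zero)
    # Pass 2: one output field per known value.
    result = []
    for row in rows:
        present = set(row[1:])
        result.append([row[0]] + [v if v in present else "" for v in columns])
    return result


sample = (
    "SERVER1,22,25,110,3389,etc\n"
    "SERVER2,22,110,3389,45001,etc\n"
    "SERVER3,3389,45001,etc\n"
)
for fields in spread_columns(sample):
    print(",".join(fields))
```

On the sample input from the question this prints the same three lines as the awk run above.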

Answer 2: (score: 0)

With Perl:

perl -MSet::Scalar -e '
    $ports = Set::Scalar->new;
    open $fh, "<", shift @ARGV or die "cannot open input file: $!";
    while (<$fh>) {
        chomp;
        @fields = split /,/;
        $ports->insert(@fields[1..$#fields]);
    }
    @all_ports = sort {$a <=> $b} $ports->members;
    seek $fh, 0, 0;
    while (<$fh>) {
        chomp;
        @fields = split /,/;
        print $fields[0];
        $ports = Set::Scalar->new(@fields[1..$#fields]);
        print(",", ($ports->has($_) ? $_ : "")) for @all_ports;
        print "\n"
    }
' file.csv
SERVER1,etc,22,25,110,3389,
SERVER2,etc,22,,110,3389,45001
SERVER3,etc,,,,3389,45001
Don't worry that "etc" appears first: the ports are sorted numerically, and the string "etc" is treated as the number zero.
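
If "etc" should instead stay in the last column, as in the layout the question asks for, one option is a sort key that ranks numeric values ahead of everything else. A minimal Python sketch of that ordering follows; `ports_then_labels` is a hypothetical helper name and not part of the Perl answer.

```python
def ports_then_labels(value):
    """Sort key: integers first in ascending order, then non-numeric labels alphabetically."""
    try:
        return (0, int(value), "")
    except ValueError:
        return (1, 0, value)


values = {"etc", "22", "25", "110", "3389", "45001"}
print(sorted(values, key=ports_then_labels))
# prints ['22', '25', '110', '3389', '45001', 'etc']
```

The first tuple element decides the group (numbers before labels), so a label can never be mistaken for port zero.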