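So, I have a CSV file with thousands of columns. For example, say the first column is the server name and the remaining columns are its open ports.
For example: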
SERVER1,22,25,110,3389,etc
SERVER2,22,110,3389,45001,etc
SERVER3,3389,45001,etc
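I'm trying to find a way, using any command-line tool, to process this so that each distinct value gets its own column, turning the above into: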
SERVER1,22,25,110,3389,,etc
SERVER2,22,,110,3389,45001,etc
SERVER3,,,,3389,45001,etc
Any ideas are appreciated. Thanks!
Answer 0 (score: 1)
You can try this awk:
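awk -F, '
# Pass 1: record every distinct port in first-seen order.
# (The order of "for (j in a)" is unspecified in awk, so keep an explicit list.)
NR==FNR{
    for(i=2;i<=NF;i++)
        if(!($i in seen)){
            seen[$i]
            order[++n]=$i
        }
    next
}
# Pass 2: print each known port in its own column, or an empty field if absent.
{
    split("", have)            # portable way to clear the array
    for(i=2;i<=NF;i++)
        have[$i]
    b=$1
    for(k=1;k<=n;k++)
        b=b FS ((order[k] in have) ? order[k] : "")
    print b
}
' infile infile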
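The two-pass idea behind this answer (first collect every distinct port, then print one column per port) can also be sketched in Python. This is only an illustration under assumptions: the function name `align_ports` is hypothetical, and columns are emitted in first-seen order, which is a choice made here rather than something the question requires.

```python
import csv
import io


def align_ports(text):
    """Return rows where every distinct value has its own column.

    Hypothetical helper for illustration; columns appear in first-seen order.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]

    # Pass 1: collect distinct values; a dict keeps insertion order.
    columns = {}
    for row in rows:
        for value in row[1:]:
            columns.setdefault(value, None)

    # Pass 2: emit the value where the row has it, an empty field otherwise.
    aligned = []
    for row in rows:
        present = set(row[1:])
        aligned.append([row[0]] + [v if v in present else "" for v in columns])
    return aligned


sample = (
    "SERVER1,22,25,110,3389,etc\n"
    "SERVER2,22,110,3389,45001,etc\n"
    "SERVER3,3389,45001,etc\n"
)

for row in align_ports(sample):
    print(",".join(row))
```

With the sample data the columns come out as 22, 25, 110, 3389, etc, 45001, so "etc" lands before 45001 rather than last.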
Answer 1 (score: 1)
Using GNU awk for sorted_in:
$ cat tst.awk
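BEGIN {
    FS=OFS=","
    # GNU awk only: visit array indices in ascending numeric order
    PROCINFO["sorted_in"] = "@ind_num_asc"
}
# Pass 1: collect every distinct value that follows the server name
NR==FNR {
    for (i=2; i<=NF; i++) {
        allVals[$i]
    }
    next
}
# Pass 2: print one column per known value, empty when the row lacks it
{
    delete curVals
    for (i=2; i<=NF; i++) {
        curVals[$i]
    }
    printf "%s", $1
    for (i in allVals) {
        printf "%s%s", OFS, (i in curVals ? i : "")
    }
    print ""
}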
$ awk -f tst.awk file file
SERVER1,etc,22,25,110,3389,
SERVER2,etc,22,,110,3389,45001
SERVER3,etc,,,,3389,45001
Answer 2 (score: 0)
Perl:
perl -MSet::Scalar -e '
    # Pass 1: gather every distinct port from all rows
    $ports = Set::Scalar->new;
    open $fh, "<", shift @ARGV or die "Cannot open file: $!";
    while (<$fh>) {
        chomp;
        @fields = split /,/;
        $ports->insert(@fields[1..$#fields]);
    }
    @all_ports = sort {$a <=> $b} $ports->members;
    # Pass 2: rewind and print one column per known port
    seek $fh, 0, 0;
    while (<$fh>) {
        chomp;
        @fields = split /,/;
        print $fields[0];
        $ports = Set::Scalar->new(@fields[1..$#fields]);
        print(",", ($ports->has($_) ? $_ : "")) for @all_ports;
        print "\n"
    }
' file.csv
SERVER1,etc,22,25,110,3389,
SERVER2,etc,22,,110,3389,45001
SERVER3,etc,,,,3389,45001
Don't worry about "etc" appearing first: the ports are sorted numerically, and the string "etc" is treated as the number zero.
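To see why "etc" ends up first, here is a small Python sketch of a numeric sort in which non-numeric strings count as zero. It only loosely mimics Perl's `<=>` coercion (Perl also accepts a leading numeric prefix such as "22abc"); the helper name `numeric_key` is hypothetical.

```python
def numeric_key(value):
    """Sort key: numeric strings by value, anything else as zero (an approximation)."""
    try:
        return float(value)
    except ValueError:
        return 0.0


ports = ["3389", "etc", "22", "45001", "110", "25"]
ordered = sorted(ports, key=numeric_key)
print(ordered)  # ['etc', '22', '25', '110', '3389', '45001']
```

Because "etc" maps to zero and every real port number is positive, it sorts ahead of all of them.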