awk -F "\",\"" 'NR==1 { hdr=$0; next } $10 != prev { prev=text=$10; gsub(/[^[:alnum:]_]/,"",text); $0 = hdr "\n" $0 } { print > ("test."text".batch.csv") }' test.batch1.csv
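I have an awk command that is not working properly. It splits a file (based on the value in column $10) and puts the header at the top of each output file. I have tried to understand the command line, but I cannot follow it well. Could someone explain what each part is doing?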
Answer 0: (score: 0)
Since you did not provide an input sample, here is a simplified version.
Suppose you want to split the file by key value:
$ cat file
header
1
2
2
3
3
3
$ awk 'NR==1{header=$0; next}     # save header
prev!=$1{fn=$1;                   # when the key changes, set the new file name suffix,
prev=$1;                          # save the current key value,
$0=header RS $0}                  # and insert the header before the first record
{print > (FILENAME"."fn)}' file   # print records to the file (parenthesized for portability)
$ head file.{1..3}
==> file.1 <==
header
1
==> file.2 <==
header
2
2
==> file.3 <==
header
3
3
3
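The script above assumes that rows with the same key are adjacent. If that is not guaranteed, one possible variant is sketched below. It is not part of the original answer: the array name `seen`, the sample data, and the scratch directory are assumptions made for illustration. It writes the header only the first time a key is encountered.

```shell
workdir=$(mktemp -d)                      # scratch directory, illustration only

# sample input whose keys are NOT grouped
printf '%s\n' header 1 2 1 3 2 > "$workdir/file"

awk 'NR==1 { header = $0; next }          # save header
     { out = FILENAME "." $1 }            # output file for this key
     !($1 in seen) { seen[$1] = 1         # first time this key appears:
                     print header > out } # write the header once
     { print > out }' "$workdir/file"     # then write the record

cat "$workdir/file.1"
```

Every output file stays open until awk exits, so an input with a very large number of distinct keys may run into the open-file limit of some awk implementations.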
Answer 1: (score: 0)
awk -F "\",\"" ' # set field separator to "," (quote, comma, quote)
NR==1 { # pick the header from the first record
hdr=$0; next # and skip to next record
}
$10 != prev { # if the 10th field differs from the previous one
prev=text=$10 # prev and text are both set to the 10th field
gsub(/[^[:alnum:]_]/,"",text) # remove everything except letters, digits and _ from text
$0 = hdr "\n" $0 # header precedes data
}
{ # e.g. ...,"foo/bar_123",... would be written
print > ("test."text".batch.csv") # to file test.foobar_123.batch.csv
}
' test.batch1.csv # input file
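To see what the `","` separator and the `gsub` call do to a single row, here is a small sketch. The sample row is hypothetical and not taken from the asker's data.

```shell
# hypothetical sample row with ten quoted fields
line='"a","b","c","d","e","f","g","h","i","foo/bar_123"'

result=$(printf '%s\n' "$line" | awk -F '","' '{
    print $1                          # first field keeps its leading quote
    text = $10
    gsub(/[^[:alnum:]_]/, "", text)   # same clean-up as in the script
    print text
}')
printf '%s\n' "$result"
```

This prints `"a` followed by `foobar_123`. Because the separator is the three-character string `","`, the outermost quotes of the row stay attached to the first and last fields. Here the `gsub` also strips the trailing quote, since the 10th field happens to be the last one in the sample.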
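If it no longer works the way it used to, I would first check whether the data file is still sorted on the 10th field. The script writes the header every time $10 changes, so input that is not grouped by that field ends up with repeated headers in the middle of the output files.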
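One way to run that check is sketched below. The sample data, the `col` variable, and the message text are assumptions for illustration. It reports every line where a key starts a new run after it has already appeared earlier.

```shell
# hypothetical sample: header plus rows whose key (column 2 here) is not grouped
data=$(printf '%s\n' '"id","key","val"' '"1","x","a"' '"2","y","b"' '"3","x","c"')

report=$(printf '%s\n' "$data" | awk -F '","' -v col=2 '
    NR == 1 { next }                      # skip the header
    $col != prev {                        # a new run of keys starts here
        if ($col in seen)                 # key already had an earlier run
            printf "line %d: key %s reappears\n", NR, $col
        seen[$col] = 1
        prev = $col
    }')
printf '%s\n' "${report:-keys are grouped}"
```

For the sample above this prints `line 4: key x reappears`. Against the real file, the same awk program would be run with `-v col=10` and `test.batch1.csv` as its input.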