I'm trying to find a way to grep all the names out of 100 files. All the names found in each file must appear on the same line.
FILE1
"company":"COMPANY1","companyDisplayName":"CM1","company":"COMPANY2","companyDisplayName":"CM2","company":"COMPANY3","companyDisplayName":"CM3",
FILE2
"company":"COMPANY99","companyDisplayName":"CM99"
The output I actually want is (including the file name as a prefix):
FILE1:COMPANY1,COMPANY2,COMPANY3
FILE2:COMPANY99
I tried grep -oP '(?<="company":")[^"]*' *
but got this result:
FILE1:COMPANY1
FILE1:COMPANY2
FILE1:COMPANY3
FILE2:COMPANY99
Answer 0 (score: 0)
Could you please try the following.
awk -F'[,:]' '
BEGIN{
OFS=","
}
{
for(i=1;i<=NF;i++){
if($i=="\"company\""){
val=(val?val OFS:"")$(i+1)
}
}
gsub(/\"/,"",val)
print FILENAME":"val
val=""
}
' Input_file1 Input_file2
Explanation: adding an explanation of the code above.
awk -F'[,:]' ' ##Starting awk program here and setting field separator as colon OR comma here for all lines of Input_file(s).
BEGIN{ ##Starting BEGIN section of awk here.
OFS="," ##Setting OFS as comma here.
} ##Closing BEGIN BLOCK here.
{ ##Starting main BLOCK here.
for(i=1;i<=NF;i++){ ##Starting a for loop which runs from i=1 up to the value of NF (the number of fields).
if($i=="\"company\""){ ##Checking whether the current field is exactly "company" (including its quotes); if so, do the following.
val=(val?val OFS:"")$(i+1) ##Appending the next field (the company name) to variable val, preceded by OFS when val already holds a value.
} ##Closing BLOCK for if condition here.
} ##Closing BLOCK for, for loop here.
gsub(/\"/,"",val) ##Using gsub to globally remove all " characters from variable val here.
print FILENAME":"val ##Printing filename colon and variable val here.
val="" ##Nullifying variable val here.
} ##Closing main BLOCK here.
' Input_file1 Input_file2 ##Mentioning Input_file names here.
The output will be as follows.
Input_file1:COMPANY1,COMPANY2,COMPANY3
Input_file2:COMPANY99
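Note that the code above prints one output line per input line. If a file spreads its records over several lines (an assumption, since the samples in the question show one line per file), a variant along these lines might collect everything per file instead. This is only a sketch: the temporary directory, the demo files, and the names emit, prev, name and per_file are made up for illustration.

```shell
# Hypothetical demo data: FILE1 holds its records on two lines.
tmpdir=$(mktemp -d)
printf '%s\n' '"company":"COMPANY1","companyDisplayName":"CM1",' \
              '"company":"COMPANY2","companyDisplayName":"CM2",' > "$tmpdir/FILE1"
printf '%s\n' '"company":"COMPANY99","companyDisplayName":"CM99"' > "$tmpdir/FILE2"

per_file=$(
  cd "$tmpdir" &&
  awk -F'[,:]' '
    # Print what was collected for the previous file, then reset the list.
    function emit() { if (prev != "") print prev ":" val; val = "" }
    FNR == 1 { emit(); prev = FILENAME }      # first line of a new file
    {
      for (i = 1; i < NF; i++)
        if ($i == "\"company\"") {            # exact match, so companyDisplayName is skipped
          name = $(i + 1)
          gsub(/"/, "", name)                 # strip the surrounding quotes
          val = (val == "" ? name : val "," name)
        }
    }
    END { emit() }                            # print the last file
  ' FILE1 FILE2
)
printf '%s\n' "$per_file"
rm -rf "$tmpdir"
```

Like the code above, this splits on commas and colons, so it assumes company names contain neither character.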
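EDIT: Adding a solution for the case where the OP needs to use grep and wants to produce the final output from grep's output (though I recommend the awk solution itself, since it does not use multiple commands or subshells).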
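One possible way to do this, as a sketch rather than the answerer's original command, is to pipe grep's FILE:NAME lines into awk and join the names per file. The temporary directory, the demo files, and the names result, names and order are made up for illustration, and the sketch assumes that neither file names nor company names contain a colon.

```shell
# Hypothetical demo files mirroring FILE1 and FILE2 from the question.
tmpdir=$(mktemp -d)
printf '%s\n' '"company":"COMPANY1","companyDisplayName":"CM1","company":"COMPANY2","companyDisplayName":"CM2","company":"COMPANY3","companyDisplayName":"CM3",' > "$tmpdir/FILE1"
printf '%s\n' '"company":"COMPANY99","companyDisplayName":"CM99"' > "$tmpdir/FILE2"

# grep -o prints one FILE:NAME pair per match (GNU grep is needed for -P);
# awk then groups the names by file, keeping first-seen order.
result=$(
  cd "$tmpdir" &&
  grep -oP '(?<="company":")[^"]*' FILE1 FILE2 |
  awk -F':' '
    {
      if ($1 in names) {
        names[$1] = names[$1] "," $2          # file already seen: append the name
      } else {
        order[++n] = $1                       # remember the order files appear in
        names[$1] = $2
      }
    }
    END { for (i = 1; i <= n; i++) print order[i] ":" names[order[i]] }
  '
)
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With a single input file grep omits the file name prefix unless -H is given, so the grouping step relies on grep being run against more than one file (or with -H).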
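Answer 1 (score: 0)
There are two tools that can take the output of the grep command and reformat it the way you want. The first is GNU datamash. The second is tsv-summarize from eBay's tsv-utils package (disclaimer: I am the author). Both tools solve this problem in a similar way: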
$ # The grep output
$ echo $'FILE1:COMPANY1\nFILE1:COMPANY2\nFILE1:COMPANY3\nFILE2:COMPANY99' > grep-output.txt
$ cat grep-output.txt
FILE1:COMPANY1
FILE1:COMPANY2
FILE1:COMPANY3
FILE2:COMPANY99
$ # Using GNU datamash
$ cat grep-output.txt | datamash --field-separator : --group 1 unique 2
FILE1:COMPANY1,COMPANY2,COMPANY3
FILE2:COMPANY99
$ # Using tsv-summarize
$ cat grep-output.txt | tsv-summarize --delimiter : --group-by 1 --unique-values 2 --values-delimiter ,
FILE1:COMPANY1,COMPANY2,COMPANY3
FILE2:COMPANY99