I have an input file like this:
COL1: VALUE1 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyyy23, NAME=AUDIT
COL1: VALUE2 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyy23, NAME=generic
XYZ:2, COL1: 289 , TREK:MRP, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, NAME=Oil, trial=TREE
I want output like this:
COL1: VALUE1 , NAME=AUDIT
COL1: VALUE2 , NAME=generic
COL1: 289 , NAME=Oil
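How can I achieve this on the command line using awk/grep/sed, without relying on advanced awk variants such as nawk or gawk?
Basically, I want the values of COL1 and NAME (that is, the text after the : and the =), regardless of where they appear in the line. Note that the position of the NAME column changes slightly.
This is what I have come up with so far: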
awk -F"," '{print $1, $6}' file.txt
COL1: VALUE1 NAME=AUDIT
COL1: VALUE2 NAME=generic
XYZ:2 NAME=Oil
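The attempt above depends on fixed field positions, which is why the third line prints XYZ:2 instead of COL1. One possible approach, sketched here using only POSIX awk features (no gawk/nawk extensions), is to scan every comma-separated field and keep the ones that start with COL1: or NAME=. The function name extract_fields and the temporary sample file are illustrative assumptions, not part of the original question.

```shell
# Hypothetical helper: print "COL1: ... , NAME=..." for each input line.
extract_fields() {
  awk -F',' '{
    col = ""; name = ""
    for (i = 1; i <= NF; i++) {
      f = $i
      gsub(/^[ \t]+|[ \t]+$/, "", f)     # trim blanks around the field
      if (f ~ /^COL1:/)      col = f     # remember the COL1 field
      else if (f ~ /^NAME=/) name = f    # remember the NAME field
    }
    # Print only when both fields were found on this line.
    if (col != "" && name != "") print col " , " name
  }' "$@"
}

# Sample data copied from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
COL1: VALUE1 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyyy23, NAME=AUDIT
COL1: VALUE2 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyy23, NAME=generic
XYZ:2, COL1: 289 , TREK:MRP, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, NAME=Oil, trial=TREE
EOF

extract_fields "$sample"
```

Because the loop checks each field by its prefix, the order of the fields within a line does not matter. The sketch does assume that neither value contains a comma.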
Answer 0 (score: 4)
You can try this Perl one-liner:
perl -lne ' /(COL1:\s*\S+).+(NAME=\w+)/ and print "$1,\t$2" ' input_file
With your input:
$ cat sach.txt
COL1: VALUE1 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyyy23, NAME=AUDIT
COL1: VALUE2 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyy23, NAME=generic
XYZ:2, COL1: 289 , TREK:MRP, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, NAME=Oil, trial=TREE
$ perl -lne ' /(COL1:\s*\S+).+(NAME=\w+)/ and print "$1,\t$2" ' sach.txt
COL1: VALUE1, NAME=AUDIT
COL1: VALUE2, NAME=generic
COL1: 289, NAME=Oil
$
Explanation:
perl -lne # -n loops over the input lines without printing them by default; -l strips the newline on input and adds it back on print
' /(COL1:\s*\S+).+(NAME=\w+)/ # Match the pattern; the first () is captured in $1 and the second () in $2
# First () matches COL1:\s*\S+ => COL1: followed by zero or more whitespace characters (\s*) and then one or more non-space characters (\S+)
# .+ => matches everything between the first () and the second ()
# Second () matches NAME= followed by a word (\w+)
and # run the print only if the preceding match /..../ succeeded
print "$1,\t$2" # print the captured $1 and $2, separated by a comma and a tab
' input_file
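For readers who cannot use Perl, the same capture-group idea can be sketched with plain sed. This is an illustrative variant, not the answer author's command: it uses basic regular expressions only, separates the two values with a comma and a space rather than a tab, and, like the Perl pattern, assumes COL1 appears before NAME on the line. The variable names line and result are made up for the example.

```shell
# One sample line from the question (COL1 is not the first field here).
line='XYZ:2, COL1: 289 , TREK:MRP, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, NAME=Oil, trial=TREE'

# \( ... \) are capture groups; \1 and \2 play the role of $1 and $2 in Perl.
# -n plus the p flag prints only lines where the substitution succeeded.
result=$(printf '%s\n' "$line" |
  sed -n 's/.*\(COL1: *[^ ,]*\).*\(NAME=[A-Za-z0-9_]*\).*/\1, \2/p')

printf '%s\n' "$result"   # prints: COL1: 289, NAME=Oil
```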
Answer 1 (score: 1)
Could you please try the following (written and tested with GNU awk):
awk '
BEGIN{
OFS=" , "
}
match($0,/COL[0-9]+: [^,]*/){
val=substr($0,RSTART,RLENGTH)
match($0,/NAME[^,]*/)
print val OFS substr($0,RSTART,RLENGTH)
val=""
}
' Input_file
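The NAME match and the print both sit inside the block that runs only when the COL string matches, so a line without a COL string prints nothing.
If the COL string is not found in a line and you still want to print the NAME match, try the following: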
awk '
BEGIN{
OFS=" , "
}
{
val="" ##Reset val so a COL match from an earlier line is not reused.
}
match($0,/COL[0-9]+: [^,]*/){
val=substr($0,RSTART,RLENGTH)
}
match($0,/NAME[^,]*/){
if(val){
printf "%s%s",val,OFS
}
print substr($0,RSTART,RLENGTH)
}
' Input_file
Explanation of the first command:
awk ' ##Start the awk program here.
BEGIN{ ##Start the BEGIN section.
OFS=" , " ##Set OFS (output field separator) to space, comma, space.
} ##Close the BEGIN section.
match($0,/COL[0-9]+: [^,]*/){ ##Use the built-in match function to match the regex from COL up to the next comma.
val=substr($0,RSTART,RLENGTH) ##On a match, store the matched text in val: the substring starting at RSTART with length RLENGTH.
match($0,/NAME[^,]*/) ##Use match again to match from NAME up to the next comma.
print val OFS substr($0,RSTART,RLENGTH) ##Print val, OFS, and the substring of the current line that starts at RSTART and is RLENGTH characters long.
val="" ##Reset val.
}
' Input_file ##Input file name.
Reference: the awk man page (man awk).
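To see what match() leaves in RSTART and RLENGTH, here is a minimal sketch on a single shortened sample line. The variable names line and result are illustrative; match(), RSTART, RLENGTH and substr() are standard awk, so no GNU extension is needed for this part.

```shell
# Shortened sample line: COL1 starts at character 8.
line='XYZ:2, COL1: 289 , TREK:MRP'

# match() returns the start position (also stored in RSTART) and sets
# RLENGTH to the length of the matched text; substr() then extracts it.
result=$(printf '%s\n' "$line" | awk '{
  if (match($0, /COL1: [^,]*/))
    print RSTART, RLENGTH, "[" substr($0, RSTART, RLENGTH) "]"
}')

printf '%s\n' "$result"   # prints: 8 10 [COL1: 289 ]
```

The brackets make the trailing space visible: [^,]* runs up to the comma, so the blank before it is part of the match.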
Answer 2 (score: 0)
With grep, you could try something like this:
while IFS= read -r line; do COL=$(printf '%s\n' "$line" | grep -o 'COL1:[^,]*'); NAME=$(printf '%s\n' "$line" | grep -o 'NAME=[a-zA-Z]*'); echo "$COL, $NAME" >> new_file.txt; done < your_file.txt
The regexps in this example assume that the value after COL1 is always followed by a "," (the pattern takes every character from COL1: up to, but not including, that comma) and that the NAME value contains only letters, so you might have to adapt them to fit your file.
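When matching "up to a comma" with grep -o, it is worth checking how far the pattern really reaches: .* is greedy and runs to the last comma on the line, while [^,]* stops before the first one. The sketch below compares the two on a shortened sample line. The variable names are illustrative, and -o is assumed to be available (it is in GNU and BSD grep, but it is not required by POSIX).

```shell
# Shortened sample line with more than one comma.
line='COL1: VALUE1 , XYZ: 2, NAME=AUDIT'

# Greedy: .* extends to the LAST comma on the line.
greedy=$(printf '%s\n' "$line" | grep -o 'COL1:.*,')

# Bounded: [^,]* stops just before the FIRST comma.
bounded=$(printf '%s\n' "$line" | grep -o 'COL1:[^,]*')

printf 'greedy:  [%s]\n' "$greedy"    # prints: greedy:  [COL1: VALUE1 , XYZ: 2,]
printf 'bounded: [%s]\n' "$bounded"   # prints: bounded: [COL1: VALUE1 ]
```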
Answer 3 (score: 0)
Try this:
$ sed 'H;s/.*NAME=/NAME=/;s/ *,.*//;x;s/^.*COL1/COL1/;s/ *,.*//;G;s/\n/\t, /;' file
COL1: VALUE1 , NAME=AUDIT
COL1: VALUE2 , NAME=generic
COL1: 289 , NAME=Oil
This uses the hold space, and \t for alignment.
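The hold-space technique can be easier to follow on a tiny input. The sketch below is a simplified illustration, not the command above: it uses h (copy) instead of H (append), made-up keys a= and b=, and a " | " separator, but the same save, trim, swap, trim, join sequence.

```shell
# h : copy the line into the hold space
# s : trim the pattern space down to the second value
# x : swap, so the full line is back and the second value is held
# s : trim the pattern space down to the first value
# G : append the held second value after a newline
# s : replace that newline with a separator
result=$(printf 'a=1, b=2\n' |
  sed -e 'h' -e 's/.*b=/b=/' -e 'x' -e 's/,.*//' -e 'G' -e 's/\n/ | /')

printf '%s\n' "$result"   # prints: a=1 | b=2
```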
Answer 4 (score: 0)
With GNU sed:
$ sed -E 's/^([^,]+,\s*)?(col1:[^,]+).+(,\s*name=\w+).*/\2\3/i' file.txt