How do I replace the pattern ",," with <return> in awk?

Asked: 2017-11-17 01:36:09

Tags: bash awk sed gawk tr

I am running an ldapsearch query that returns results like the following:

John Joe jjoe@company.com +1 916 662-4727  Ann Tylor Atylor@company.com (987) 654-3210  Steve Harvey sharvey@company.com 4567893210  (321) 956-3344  ...

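As you can see, there is a single blank between the fields of each person's record, phone numbers may start with +1, there may be blanks between the digits or parentheses, and finally there are two blanks between the individual records.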

I want to convert these entries into the following format:

John,Joe,jjoe@company.com,(916) 662-4727
Ann,Tylor,Atylor@company.com,(987) 654-3210
Steve,Harvey,sharvey@company.com,(456) 789-3210,(321) 956-3344
...

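So basically, replace a single blank with a comma "," and two blanks with a <RETURN>, so that in the end I have one person's record per line, with comma-separated fields.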

I am trying awk and have managed to replace each blank with ",", which turns

<blank><blank> into a double comma ",,".
But I can't figure out how to turn ",," into <RETURN>.

11/22/2017 ---- ******UPDATE****** --------- 11/22/2017

I have made this thread too crowded. I will post a new, more detailed question.

3 answers:

Answer 0 (score: 2)

What you are asking for takes quite a few substitutions in sed.

$ cat sed-script
s/\ \ ([A-Za-z])/\n\1/g;        # a double space followed by a letter starts a new record: replace it with '\n'
s/\ \ /,/g;                     # replace the remaining double spaces with ','
s/([A-Za-z]) /\1,/g;            # replace a space that follows a letter with ','
s/\+1//g;                       # remove every +1 prefix (without the g flag only the first one on the line is removed)
s/[ ()-]//g;                    # remove spaces, parentheses, and dashes
s/([^0-9])([0-9]{3})/\1(\2) /g; # wrap the first 3 digits of each number in parentheses
s/([0-9]{4}[^0-9])/-\1/g;       # prepend a '-' to the last 4 digits

$ sed -r -f sed-script file 
John,Joe,jjoe@company.com,(916) 662-4727
Ann,Tylor,Atylor@company.com,(987) 654-3210
Steve,Harvey,sharvey@company.com,(456) 789-3210,(321) 956-3344,...
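
The first two rules of the script do the record splitting: a double space followed by a letter starts a new person, while any other double space only separates two phone numbers of the same person. A minimal sketch of just that step, assuming GNU sed (needed for `\n` in the replacement); the function name `split_records` and the shortened sample line are made up for illustration:

```shell
# Hypothetical helper: split a one-line ldapsearch-style dump into one record per line.
# Assumes GNU sed, where \n in the replacement inserts a newline.
split_records() {
    # Rule 1: two spaces followed by a letter mark the start of a new record.
    # Rule 2: any double space still left sits between two phone numbers.
    sed -E 's/  ([A-Za-z])/\n\1/g; s/  /,/g'
}

printf '%s\n' 'Steve Harvey sharvey@company.com 4567893210  (321) 956-3344  Ann Tylor Atylor@company.com (987) 654-3210' |
    split_records
```

This prints Steve's record on one line, with a comma between his two numbers, and Ann's record on the next. The single spaces and the phone formatting are left for the remaining rules.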

Answer 1 (score: 1)

If your Input_file is the same as the sample shown, then the following awk may help you.

awk --re-interval '{gsub(/[0-9]{3}-[0-9]{4} +/,"&\n");print}'  Input_file

I am using an old version of awk, so I passed --re-interval; with newer versions of awk there is no need to specify it.

Explanation: adding an explanation of the solution here as well.

awk --re-interval '{               ##--re-interval enables interval expressions such as {3}, needed here because I have an old version of awk.
gsub(/[0-9]{3}-[0-9]{4} +/,"&\n"); ##gsub (global substitute) matches 3 consecutive digits, a dash (-), 4 consecutive digits and the spaces that follow, then replaces the match with itself (&) plus a NEW LINE.
print                              ##printing the edited line of Input_file
}'  Input_file                     ##Mentioning the Input_file here.
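
The question literally asks how to turn ",," into a newline once every blank has become a comma. A minimal sketch of that one step, assuming nothing beyond POSIX awk; the function name `to_lines` is hypothetical and the sample line is shortened from the question:

```shell
# Hypothetical helper: commas for single blanks, newlines for double blanks.
to_lines() {
    awk '{
        gsub(/ /, ",")      # every blank becomes a comma, so a record gap becomes ",,"
        gsub(/,,/, "\n")    # each ",," is then replaced by a newline
        print
    }'
}

printf '%s\n' 'John Joe jjoe@company.com +1 916 662-4727  Ann Tylor Atylor@company.com (987) 654-3210' |
    to_lines
```

Note that this treats every double blank as a record boundary, so a person with two phone numbers separated by two blanks (like Steve in the sample) would be split across two lines. The blanks inside phone numbers also become commas, so the numbers still need reformatting afterwards.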

Answer 2 (score: 0)

For your interest, you could also do it in Perl:

perl -e '
while (<>) {
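    s/  (?=[A-Za-z])/\n/g;    # two spaces before a letter separate records
    s/ +/,/g;                 # any remaining run of spaces becomes one comma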
    s/(\+1,)?\(?(\d{3})\)?[-,]?(\d{3})[-,]?(\d{4})/($2) $3-$4/g;
    print;
}' file