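How to find lines with a specific pattern in Unix and remove the newline character from them

Time: 2017-08-25 17:20:58

Tags: shell unix awk sed scripting

How can I find lines with a specific pattern in Unix and remove the newline characters from them? Suppose I have a comma-separated file: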

100,"John","Clerk",,,,  
101,"Dannis","Manager",,,,  
102,"Michael","Senior  

Manager",,,,  

103,"Donald","President of 

united states",,,,  

The output I want is:

100,"John","Clerk",,,,  
101,"Dannis","Manager",,,,  
102,"Michael","Senior Manager",,,,  
103,"Donald","President of united states",,,,  

6 answers:

Answer 0: (score: 2)

sed solution:

sed -z 's/\n*//g; s/,,,,/&\n/g' file

Output:

100,"John","Clerk",,,,
101,"Dannis","Manager",,,,
102,"Michael","Senior Manager",,,,
103,"Donald","President of united states",,,,

awk solution:

awk 'BEGIN{ RS=ORS="" }{ gsub(/\n+/," ",$0); gsub(/,,,, */,"&\n",$0); print }' file

Answer 1: (score: 0)

Try the following awk.

awk '/^$/{next} {val=$0 ~ /^[0-9]/?(val?val ORS $0:$0):(val?val OFS $0:$0)} END{print val}' Input_file

EDIT: Adding a non-one-liner form of the solution, with an explanation.

awk '
/^$/{   ## If the line is empty, do the following.
   next ## Skip the remaining rules and move on to the next line.
}
{
## Build up val: a line that starts with a digit begins a new record, so it is
## appended after a newline (ORS); any other line continues the current record,
## so it is appended after a space (OFS).
val=$0 ~ /^[0-9]/?(val?val ORS $0:$0):(val?val OFS $0:$0)
}
END{         ## END block, run after all input has been read.
   print val ## Print the accumulated records.
}
' Input_file ## Name of the input file.

Answer 2: (score: 0)


Answer 3: (score: 0)

This might work for you (GNU sed):

sed -r ':a;N;/^([^\n,]*,){6}/!s/\n//;ta;P;D' file

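Append another line to the pattern space (PS). If the pattern space does not begin with six comma-terminated fields, delete the newline and repeat; otherwise print and delete the first line, then repeat.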
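
For systems without GNU sed, the same accumulate-until-complete idea can be sketched in awk. This is only an illustration, not the answer's method: the function name join_by_commas is hypothetical, and the sketch assumes that a complete record contains at least six commas (as in the sample data), that blank lines can be dropped, and that fragments should be joined with a single space after trailing whitespace is trimmed.

```shell
# join_by_commas is a hypothetical helper name; it reads the data on stdin.
join_by_commas() {
    awk '
        /^[ \t\r]*$/ { next }                    # skip blank lines between fragments
        {
            sub(/[ \t\r]+$/, "")                 # trim trailing whitespace
            buf = (buf == "" ? $0 : buf " " $0)  # append the fragment to the pending record
            tmp = buf
            if (gsub(/,/, ",", tmp) >= 6) {      # gsub returns how many commas it replaced
                print buf
                buf = ""
            }
        }
        END { if (buf != "") print buf }         # flush an incomplete final record
    '
}
```

It would be called as join_by_commas < file. Commas inside quoted fields are counted too, so this only suits data shaped like the sample.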

Answer 4: (score: 0)

If you don't mind using Perl:

First remove the extra newlines:

perl -pe 's/^\n//;' file 

Output:

100,"John","Clerk",,,,
101,"Dannis","Manager",,,,
102,"Michael","Senior
Manager",,,,
103,"Donald","President of
united states",,,,

Then you can add a second substitution that removes the newline after the last word of a line. For that you can use:

s/(\w+)\s+\n$/$1 /;

Here \w+ matches Senior and of and saves the match in $1, which is then used in the replacement /$1 /. The important part is the single space after $1.

Finally we have:

perl -pe 's/^\n//;s/(\w+)\s+\n$/==>$1<== /;' file

Output:

100,"John","Clerk",,,,
101,"Dannis","Manager",,,,
102,"Michael","==>Senior<== Manager",,,,
103,"Donald","President ==>of<== united states",,,,

Note: remove the ==> and <== markers, and add -i.bak to edit the file in place while keeping a backup.
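
Put together, the note might look like the sketch below. The wrapper name fix_in_place is hypothetical; the two substitutions are the ones shown above with the markers removed, and -i.bak is Perl's switch for in-place editing with a backup copy saved as file.bak. As with the command above, the second substitution only fires when the broken line ends in whitespace before its newline, which is the case in the sample data.

```shell
# fix_in_place is a hypothetical wrapper name; $1 is the file to edit.
fix_in_place() {
    # First substitution: delete empty lines.
    # Second substitution: replace trailing whitespace plus newline after a word with one space.
    perl -i.bak -pe 's/^\n//; s/(\w+)\s+\n$/$1 /;' "$1"
}
```

After fix_in_place file, the original content remains available in file.bak.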

Or even in a single substitution:

perl -lpe '$/=undef; s/(\w+)\s+\n\n^([^\n]+)\n/$1 $2/gm;'  file

Answer 5: (score: 0)

Copy the code from https://stackoverflow.com/a/45420607/1745001 and change this:

{
    printf "Record %d:\n", ++recNr
    for (i=1;i<=NF;i++) {
        printf "    $%d=<%s>\n", i, $i
    }
    print "----"
}

to this:

/your regexp/ {
    printf "Record %d:\n", ++recNr
    for (i=1;i<=NF;i++) {
        gsub(/\n/," ",$i)
        printf "    $%d=<%s>\n", i, $i
    }
    print "----"
}

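where your regexp is the regular expression you are trying to find in your data (the "specific pattern" you mention in your question).

Unlike most (all?) of the other current answers, the above does not rely on input lines ending with ,,,, and does not read the whole file into memory. It also does not rely on the field after the newline starting with any particular value, on there being at most one blank line within a field, or on any particular version of a tool.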
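
The linked script is not reproduced here. As a separate and much simpler illustration of joining records without relying on ,,,, the sketch below treats a record as complete once its double quotes are balanced. The function name join_by_quotes is hypothetical, and the sketch assumes that every field containing a newline is enclosed in double quotes, that blank lines can be dropped, and that fragments should be joined with a single space. It does not apply the /your regexp/ filter.

```shell
# join_by_quotes is a hypothetical helper name; it reads the data on stdin.
join_by_quotes() {
    awk '
        {
            line = $0
            sub(/[ \t\r]+$/, "", line)               # trim trailing whitespace
            if (line == "") next                     # drop blank lines
            buf = (buf == "" ? line : buf " " line)  # append to the pending record
            tmp = buf
            if (gsub(/"/, "", tmp) % 2 == 0) {       # even number of quotes: record is complete
                print buf
                buf = ""
            }
        }
        END { if (buf != "") print buf }             # flush a record whose quotes never closed
    '
}
```

It would be called as join_by_quotes < file and holds only the current record in memory, not the whole file.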