Question

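How can I find lines that match a specific pattern in Unix and remove the newline characters from them? Suppose I have a comma-separated file: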

100,"John","Clerk",,,,  
101,"Dannis","Manager",,,,  
102,"Michael","Senior  

Manager",,,,  

103,"Donald","President of 

united states",,,,

The output I want is:

100,"John","Clerk",,,,  
101,"Dannis","Manager",,,,  
102,"Michael","Senior Manager",,,,  
103,"Donald","President of united states",,,,

Answer 1

A short sed solution (GNU sed, since -z is a GNU extension):

sed -z 's/\n*//g; s/,,,,/&\n/g' file

Output:

100,"John","Clerk",,,,
101,"Dannis","Manager",,,,
102,"Michael","Senior Manager",,,,
103,"Donald","President of united states",,,,

Or with awk:

awk 'BEGIN{ RS=ORS="" }{ gsub(/\n+/," ",$0); gsub(/,,,, */,"&\n",$0); print }' file

Answer 2

Could you try the following awk:

awk '/^$/{next} {val=$0 ~ /^[0-9]/?(val?val ORS $0:$0):(val?val OFS $0:$0)} END{print val}' Input_file

EDIT: Adding a non-one-liner form of the solution, with an explanation.

awk '
/^$/{   ## Check whether the line is empty; if so, do the following action.
   next ## next keyword will skip all further actions here.
}
{
val=$0 ~ /^[0-9]/?(val?val ORS $0:$0):(val?val OFS $0:$0) ## Build the variable val: if the line starts with a digit, append it to val after a newline (ORS); if it starts with a non-digit, append it to val after a space (OFS).
}
END{         ##END block of awk code here.
   print val ##printing the value of variable named val here
}
' Input_file ## Mentioning Input_file here.

Answer 4

This might work for you (GNU sed):

sed -r ':a;N;/^([^\n,]*,){6}/!s/\n//;ta;P;D' file

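Append the next line to the pattern space (PS). If the first line in the PS does not yet contain 6 commas, remove the newline and repeat; otherwise print and delete the first line, then repeat.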
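
For readers who find the sed branching hard to follow, the same idea can be sketched in awk. This is a rough equivalent rather than the answer's own code: the function name join_until_six_commas is made up for illustration, and the sketch assumes, as the sed version does, that a complete record contains six commas and that no field contains a comma of its own.

```shell
# Hypothetical helper: join physical lines until the buffer holds 6 commas.
join_until_six_commas() {
    awk '
    {
        buf = buf $0                        # append the line, dropping the newline
        # gsub returns the number of matches, so this counts the commas in buf
        if (gsub(/,/, ",", buf) >= 6) {
            print buf                       # record is complete
            buf = ""
        }
    }
    END { if (buf != "") print buf }        # flush an incomplete trailing record
    ' "$@"
}
```

It can be called as join_until_six_commas file, or used at the end of a pipeline. Like the sed command, it inserts nothing at the join, so the words stay separated only if the broken line ends in a space.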

Answer 5

If you don't mind using Perl:

First, remove the extra blank lines:

perl -pe 's/^\n//;' file

Output:

100,"John","Clerk",,,,
101,"Dannis","Manager",,,,
102,"Michael","Senior
Manager",,,,
103,"Donald","President of
united states",,,,

Then you can add a second substitution that removes the newline after the last word of each broken line. For that you can use:

s/(\w+)\s+\n$/$1 /;

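Here \w+ matches Senior and of and saves the match in $1, which you can then use in the replacement part, /$1 /. The important detail is the single space after $1.

Finally we have: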

perl -pe 's/^\n//;s/(\w+)\s+\n$/==>$1<== /;' file

Output:

100,"John","Clerk",,,,
101,"Dannis","Manager",,,,
102,"Michael","==>Senior<== Manager",,,,
103,"Donald","President ==>of<== united states",,,,

Note:

Remove the ==> and <== markers and add -i.bak to edit the file in place while keeping a backup.

Or even in a single substitution:

perl -lpe '$/=undef; s/(\w+)\s+\n\n^([^\n]+)\n/$1 $2/gm;'  file

Answer 6

Copy the code from https://stackoverflow.com/a/45420607/1745001 and change this:

{
    printf "Record %d:\n", ++recNr
    for (i=1;i<=NF;i++) {
        printf "    $%d=<%s>\n", i, $i
    }
    print "----"
}

to this:

/your regexp/ {
    printf "Record %d:\n", ++recNr
    for (i=1;i<=NF;i++) {
        gsub(/\n/," ",$i)
        printf "    $%d=<%s>\n", i, $i
    }
    print "----"
}

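where your regexp is the regular expression you are trying to find in your data (the "specific pattern" you mention in your question).

Unlike most (all?) of the other current answers, the above does not rely on the input lines ending in ,,,, does not read the whole file into memory, does not rely on the field after a newline starting with any particular value, does not rely on there being at most one blank line within a field, and does not require any particular version of a tool.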
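
As a minimal illustration of joining records without relying on the ,,,, suffix, the sketch below tracks whether a double quote is still open. It is not the code from the linked answer. The function name join_quoted_records is hypothetical, and the sketch assumes that a line break inside a quoted field should become a single space, that blank lines carry no data, and that quotes appear only as field delimiters or as doubled "" escapes.

```shell
# Hypothetical helper: a record is complete once its double quotes are balanced.
join_quoted_records() {
    awk '
    /^[ \t\r]*$/ { next }                   # skip blank lines (assumed to carry no data)
    {
        sub(/[ \t\r]+$/, "")                # trim trailing whitespace
        buf = (buf == "" ? $0 : buf " " $0) # join continuation lines with one space
        # gsub returns the number of matches, so this counts the quotes in buf
        if (gsub(/"/, "\"", buf) % 2 == 0) {
            print buf                       # every opened quote has been closed
            buf = ""
        }
    }
    END { if (buf != "") print buf }        # flush an unterminated record
    ' "$@"
}
```

Run it as join_quoted_records file. Only one record is held in memory at a time, and a field may span any number of lines.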
