Conditionally joining lines in awk

Time: 2013-08-07 07:48:41

Tags: awk

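I am looking for awk code to join lines pasted from a PDF. The join should follow this rule: if the last character on a line is not a period (.), a space should be appended to the line and the next line joined to it.

Sample input text (in a file):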

In a perfect school, students would treat each other with affection and
respect. Differences would be tolerated, and even welcomed. Kids would
become more popular by being kind and supportive. Students would go out
of their way to make sure one another felt happy and comfortable. But most
schools are not perfect. Instead of being places of respect and tolerance,
they are places where the hateful act of bullying is widespread.

Students have to deal with all kinds of problems in schools. There are
the problems created by difficult classes, by too much homework, or by
personality conflicts with teachers. There are problems with scheduling
the classes you need and still getting some of the ones you want. There
are problems with bad cafeteria food, grouchy principals, or overcrowded
classrooms. But one of the most difficult problems of all has to do with a
terrible situation that exists in most schools: bullying.

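Expected output:

In a perfect school, students would treat each other with affection and respect. Differences would be tolerated, and even welcomed. Kids would become more popular by being kind and supportive. Students would go out of their way to make sure one another felt happy and comfortable. But most schools are not perfect. Instead of being places of respect and tolerance, they are places where the hateful act of bullying is widespread.

Students have to deal with all kinds of problems in schools. There are the problems created by difficult classes, by too much homework, or by personality conflicts with teachers. There are problems with scheduling the classes you need and still getting some of the ones you want. There are problems with bad cafeteria food, grouchy principals, or overcrowded classrooms. But one of the most difficult problems of all has to do with a terrible situation that exists in most schools: bullying.

(Each paragraph of the expected output is on a single line. Presumably, paragraphs are separated by blank lines.)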

2 answers:

Answer 0 (score: 0)

This might be enough:

awk -v ORS= '!NF{$NF="\n"} NF{ $NF = $NF ($NF~/\.$/?"\n":" ")} 1' input
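For readers who find the one-liner dense, here is a longer sketch of the same rule from the question: accumulate lines in a buffer and emit the buffer when a line's last field ends in a period. It is a variant for illustration, not the answer's code. The function name join_lines and the variable buf are hypothetical, and unlike the one-liner it does not collapse runs of whitespace inside a line.

```shell
# join_lines is a hypothetical helper: reads stdin, writes joined lines to stdout.
join_lines() {
  awk '
    # Blank line: emit any pending text, then keep the blank line itself.
    !NF { if (buf != "") print buf; buf = ""; print ""; next }
    # Non-blank line: append it to the pending text with a single space.
    { buf = (buf == "" ? $0 : buf " " $0) }
    # The last field ends in a period: the output line is complete.
    $NF ~ /\.$/ { print buf; buf = "" }
    # Flush whatever is left if the input does not end with a period.
    END { if (buf != "") print buf }
  '
}

printf '%s\n' 'First line and' 'second line. Third' 'line ends here.' '' 'Next paragraph' 'ends too.' | join_lines
```

Note that under this rule a line ending in a period in the middle of a paragraph also ends the output line; the sample input in the question happens to have periods only at the ends of paragraphs.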

Answer 1 (score: 0)

If the paragraphs in your input file really are separated by blank lines, then all you need is:

awk -v RS= -v ORS='\n\n' '{$1=$1}1' file
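A small demonstration of why this works, under the assumption of a POSIX awk: an empty RS switches awk to paragraph mode, so each blank-line-separated block becomes one record and newlines count as field separators. Assigning $1=$1 rebuilds the record with single spaces between fields, and ORS='\n\n' puts a blank line after each paragraph. The wrapper name join_paragraphs is hypothetical.

```shell
# join_paragraphs is a hypothetical wrapper around the command above;
# it reads the named files, or stdin when no file is given.
join_paragraphs() {
  awk -v RS= -v ORS='\n\n' '{ $1 = $1 } 1' "$@"
}

printf '%s\n' 'one two' 'three.' '' 'four' 'five.' | join_paragraphs
```

This approach joins every line of a paragraph regardless of where periods fall, so it matches the expected output only when blank lines are the real paragraph boundaries. It also writes a blank line after the last paragraph.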