Split a line when an uppercase letter is found in it

Time: 2016-06-15 09:20:00

Tags: linux bash shell

I need a regular expression that can split a line when an uppercase letter is found.

Example:

SELECT

The output should look like the following:

GROUP BY

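5 answers:

Answer 0: (score: 1)

This command splits the line before every uppercase letter that is preceded by whitespace, starting from the second occurrence (as in the example):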

sed 's/\(\s\)\([A-Z]\)/\1\n\2/g; s/\n//'

Example:

$ echo 'line1 = JOHN levin have fun RAJESH is a good person SAM was ok'|sed 's/\(\s\)\([A-Z]\)/\1\n\2/g; s/\n//'
line1 = JOHN levin have fun 
RAJESH is a good person 
SAM was ok
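
To make the two substitutions easier to follow, the same command can be wrapped in a small function with one expression per step. This is only a sketch: it assumes GNU sed (`\s` and `\n` in the replacement are GNU extensions), and the name `split_caps` and the sample input are made up for illustration.

```shell
# Hypothetical helper; assumes GNU sed.
split_caps() {
    # Expression 1: insert a newline between a whitespace character
    #               and the uppercase letter that follows it.
    # Expression 2: delete the first newline again, so the first
    #               uppercase word stays on the first line.
    sed -e 's/\(\s\)\([A-Z]\)/\1\n\2/g' -e 's/\n//'
}

printf '%s\n' 'id = JOHN likes McDonald SAM left' | split_caps
```

With this input the line should break before McDonald and before SAM, but not before the D inside McDonald, because no whitespace precedes it.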

Answer 1: (score: 0)

Is this what you want?

$ line1='JOHN levin have fun RAJESH is a good person SAM was ok'
$ sed 's/[A-Z]\+/\n&/g' <<< "$line1"

JOHN levin have fun
RAJESH is a good person
SAM was ok

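Note that a newline is also added before JOHN, because that is what your requirement asks for. Avoiding it is a separate problem. Your requirement also reads:

  "I need a regular expression that can split a line when an uppercase letter is found."

Taken literally, the expected output would be: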

$ sed 's/\([A-Z]\)/\n\1/g' <<< "$line1"

J
O
H
N levin have fun
R
A
J
E
S
H is a good person
S
A
M was ok
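
As for the newline added before JOHN by the first command in this answer, one possible way to avoid it is to delete a newline that ends up at the very start of the line. A minimal sketch, assuming GNU sed:

```shell
line1='JOHN levin have fun RAJESH is a good person SAM was ok'

# Expression 1: put a newline before every run of uppercase letters.
# Expression 2: drop that newline only if it is the first character,
#               which happens when the line starts with an uppercase word.
printf '%s\n' "$line1" | sed -e 's/[A-Z]\+/\n&/g' -e 's/^\n//'
```

Because the second expression is anchored with `^`, a line that starts with lowercase text keeps its split before the first uppercase word.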

Answer 2: (score: 0)

Try the following:

echo "<your string>" | awk '{once_found = 0; for (i = 1; i <= NF; i++) {if ($i ~ /[A-Z]/) {if (once_found) {print "";} once_found++;} printf("%s ", $i);} print "";}'

I added once_found to avoid a line break between line1 = and JOHN. I am not sure you really want that. If not, just remove once_found and everything related to it.
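
For comparison, this is roughly what the variant without once_found could look like. It is a sketch rather than the answerer's exact code: the function name `split_fields` is hypothetical, and the loop runs up to `i <= NF` so that the last field is printed too.

```shell
# Hypothetical helper: the awk approach with once_found removed.
split_fields() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /[A-Z]/) print ""   # field has an uppercase letter: end the current line
            printf("%s ", $i)            # print the field followed by a space
        }
        print ""                         # end the last line
    }'
}

printf '%s\n' 'line1 = JOHN levin have fun RAJESH is a good person SAM was ok' | split_fields
```

Here `line1 = ` ends up on a line of its own, and every output line keeps a trailing space. A field is treated as a split point if it contains an uppercase letter anywhere, not only at its start.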

Answer 3: (score: 0)

Another approach, based on gawk:

$ a='line1 = JOHN levin have fun RAJESH is a good person SAM was ok'

$ awk '{ORS=((NR==1)?"":"\n")RT}1' RS='[A-Z]+' <<< "$a"
line1 = JOHN levin have fun 
RAJESH is a good person 
SAM was ok
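  1. Use RS='[A-Z]+' as the record separator.
  2. The input is split at every run of uppercase letters.
  3. For record 1, use ORS=RT; for every other record, use ORS="\n"RT. RT holds the text that matched RS.
  4. Print the record.

Note that sed is the right tool for what you are trying to do. This answer is for illustration only. If you need a more complex algorithm, you can use awk like this.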
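
To see what the gawk approach works with, it can help to print each record next to its RT value. The sketch below assumes gawk is installed (RT is a gawk extension); `show_records` is a hypothetical name, and the input is passed without a trailing newline so that the last record stays on one line.

```shell
# Hypothetical helper; requires gawk because RT is a gawk extension.
show_records() {
    # RS is a regex: every run of uppercase letters ends a record.
    # RT is the text that actually matched RS for the current record.
    gawk -v RS='[A-Z]+' '{ printf "record %d: [%s] RT=[%s]\n", NR, $0, RT }'
}

printf '%s' 'line1 = JOHN levin have fun RAJESH is a good person SAM was ok' | show_records
```

Each record holds the text between two uppercase words, and RT holds the uppercase word that ended it (empty for the last record). That is why the one-liner rebuilds the line by putting RT into ORS.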

Answer 4: (score: 0)

Use grep with -E (extended regular expressions) and -o (print only the matching parts):

$ line="JOHN levin have fun RAJESH is a good person SAM was ok"
$ grep -oE '[A-Z]+[^A-Z]*' <<< "$line"
JOHN levin have fun 
RAJESH is a good person 
SAM was ok
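
One thing to keep in mind with the grep approach: -o prints only what the pattern matches, so any text before the first uppercase word is dropped. A small sketch of that behaviour, with the pattern written here as `[A-Z]+[^A-Z]*` and the input taken from the other answers:

```shell
line='line1 = JOHN levin have fun RAJESH is a good person SAM was ok'

# Each match starts at a run of uppercase letters and extends up to the next one.
# The leading "line1 = " contains no uppercase letter, so it is never part of a match.
printf '%s\n' "$line" | grep -oE '[A-Z]+[^A-Z]*'
```

If the leading text matters, the sed or awk answers are probably a better fit, since they keep the whole line.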