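I want to write a regular expression, to be used from a unix command, that identifies all strings which do not conform to the following format: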
First letter is uppercase
Followed by any number of letters
Underscore
Followed by an uppercase letter
Followed by any number of letters
Underscore
and so on ...
The number of underscores is variable.
So the valid ones are:       The invalid ones are:
Alpha_Beta_Gamma             alph_Beta_Gamma
Alpha_Beta_Gamma_Delta       Alpha_beta_Gamma
Alppha_Beta                  Alpha_beta
Aliph_Theta_Pi_Chi_Ming      Alpha_theta_Pi_Chi_Ming
Answer 0 (score: 4)
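grep has a -v option that inverts the match (i.e. it returns the lines that do not match). The -E option puts grep in extended-regexp mode (which lets + and parentheses be used unescaped in the pattern).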
The pattern you can use is (broken up for clarity):
^ # beginning of string
[A-Z] # a single uppercase letter
[a-z]* # zero or more lowercase letters
( # start a group
_ # an underscore
[A-Z] # a single uppercase letter
[a-z]* # zero or more lowercase letters
)+ # close the group and it can appear one or more times
$ # end of string
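As a quick sanity check, the pieces above can be joined into a single pattern and tried against individual strings. This is only a minimal sketch: the function name is_valid is made up for illustration, and LC_ALL=C is an added assumption so that [A-Z] and [a-z] stay plain ASCII ranges regardless of the locale.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): exits with status 0
# when the string follows the Upper_Upper_... format, non-zero otherwise.
is_valid() {
    # LC_ALL=C is an assumption: it keeps the bracket ranges ASCII-only.
    # -q suppresses output, so only grep's exit status is used.
    printf '%s\n' "$1" | LC_ALL=C grep -E -q '^[A-Z][a-z]*(_[A-Z][a-z]*)+$'
}

is_valid 'Alpha_Beta_Gamma' && echo 'Alpha_Beta_Gamma: valid'
is_valid 'Alpha_beta'       || echo 'Alpha_beta: invalid'
```

Because the group is followed by +, a string without any underscore (for example Alpha) is reported as invalid; replacing + with * would accept it.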
Suppose you have a file test.dat containing the 8 strings from your question:
grep -E -v "^[A-Z][a-z]*(_[A-Z][a-z]*)+$" test.dat
This returns:
alph_Beta_Gamma
Alpha_beta_Gamma
Alpha_beta
Alpha_theta_Pi_Chi_Ming
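To reproduce this end to end, something like the following sketch should work. The order of the lines in test.dat is an assumption (the four valid strings first, then the four invalid ones), as is the use of LC_ALL=C to keep the bracket ranges ASCII-only. Single quotes are used around the pattern so the shell leaves $ alone.

```shell
#!/bin/sh
# Assumption: force the C locale so [A-Z] and [a-z] are plain ASCII ranges.
LC_ALL=C
export LC_ALL

# Recreate test.dat with the eight strings from the question
# (line order is assumed, not given in the original).
cat > test.dat <<'EOF'
Alpha_Beta_Gamma
Alpha_Beta_Gamma_Delta
Alppha_Beta
Aliph_Theta_Pi_Chi_Ming
alph_Beta_Gamma
Alpha_beta_Gamma
Alpha_beta
Alpha_theta_Pi_Chi_Ming
EOF

# -v prints only the lines that do NOT match the pattern.
invalid=$(grep -E -v '^[A-Z][a-z]*(_[A-Z][a-z]*)+$' test.dat)
printf '%s\n' "$invalid"

# -c counts the lines that DO match, i.e. the valid strings.
valid_count=$(grep -E -c '^[A-Z][a-z]*(_[A-Z][a-z]*)+$' test.dat)
echo "valid: $valid_count"
```

Dropping -v (or swapping it for -c as in the last step) is a simple way to confirm that the remaining four strings are the ones being accepted.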