Regex to find strings after ":" that are longer than 20 or shorter than 5 characters

Time: 2018-01-23 13:31:36

Tags: regex

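I have a large dataset that I need to clean up with a regex in the Sublime Text editor.

I am trying to remove anything after the colon (:) that is fewer than 5 characters long, counting spaces. I am also trying to remove anything that is more than 20 characters long.

Examples: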

  • jshfdgl:JSS
  • oiadfgopiafdg:
  • ofdijgdf:2)
  • ogijdfogis:_ge
  • iognhif:gojdf sdofig peoji-009
  • ogijdfs:_ge 2

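All of these should be matched by the regex...

In other words, I am trying to use the text after the colon to find the entries that are shorter than 5 or longer than 20 characters.

I have tried a lot of things, but I keep coming up empty...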

2 answers:

Answer 0: (score: 0)

Try this regex:

(?<=:)(?:.{0,5}|.{20,})$


Replace each match with an empty string.

Explanation:

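  • (?<=:) - finds a position that is immediately preceded by a :
  • (?:.{0,5}|.{20,})
    • .{0,5} - matches 0 to 5 occurrences of any character except a newline
    • | - or
    • .{20,} - matches 20 or more occurrences of any character except a newline
  • $ - asserts the end of the string (or of each line in multiline mode)

Note that both bounds are inclusive, so values of exactly 5 or exactly 20 characters are matched as well. To match strictly fewer than 5 or strictly more than 20 characters, as the question asks, use .{0,4} and .{21,} instead.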
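To see what the pattern does outside the editor, here is a minimal Python sketch that applies it to the sample lines from the question. The names POSTED, STRICT, clean_line and SAMPLES are made up for this illustration; STRICT is an assumed variant with exclusive bounds and is not part of the answer above.

```python
import re

# The regex as posted in the answer: bounds are inclusive (0-5 or 20+ characters).
POSTED = re.compile(r"(?<=:)(?:.{0,5}|.{20,})$")

# Hypothetical stricter variant: fewer than 5 or more than 20 characters.
STRICT = re.compile(r"(?<=:)(?:.{0,4}|.{21,})$")

# Sample lines taken from the question.
SAMPLES = [
    "jshfdgl:JSS",
    "oiadfgopiafdg:",
    "ofdijgdf:2)",
    "ogijdfogis:_ge",
    "iognhif:gojdf sdofig peoji-009",
    "ogijdfs:_ge 2",
]


def clean_line(line: str, pattern: re.Pattern = POSTED) -> str:
    """Replace the part after ':' with an empty string when the pattern matches."""
    return pattern.sub("", line)


if __name__ == "__main__":
    for sample in SAMPLES:
        # Left: result with the posted regex; right: result with the strict variant.
        print(f"{clean_line(sample)!r:20} {clean_line(sample, STRICT)!r}")
```

The two patterns only disagree on values of exactly 5 or exactly 20 characters, such as "_ge 2" in the last sample.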

Answer 1: (score: 0)

Following the advice of @Andy G (which I support), I prepared a solution that, instead of a regex, uses the following Perl one-liner (to be run from the command prompt):

perl -lan -F: -e "$len = length($F[1]); printf(qq(%s:%s\n), $F[0], ($len > 5 && $len <= 20)?$F[1]:'')" inp.txt >out.txt

Explanation:

  • -lan - Perl options: -l strips the input line terminator, -a turns on auto-split mode, -n wraps the script in an implicit loop over the input lines.
  • -F: - Another Perl option: it defines the auto-split separator (:). With it, each input line is split on ":" and the result is stored in the predefined array @F.
  • -e "..." - The program (one-liner script) to execute.
  • inp.txt - Input file name.
  • >out.txt - Output redirection.

Now for the script itself:

  • $len = length($F[1]); - Saves the length of the second "input segment" (after ":").
  • printf( ... ) - Formatted print of the output line; its arguments are described below.
  • qq(%s:%s\n) - The format string. The qq operator double-quotes the format string without clashing with the "plain" double quotes that surround the script.
  • $F[0] - The first string to print: the first "input segment" (before ":").
  • ($len > 5 && $len <= 20)?$F[1]:'' - The second string to print. This is a ternary operator deciding which string to print: if the saved length is within the allowed limits (more than 5 and at most 20), it prints the second "input segment" (after ":"); otherwise it prints an empty string.

Due to the -n option, this program is repeated for each input line.

Of course, you must have Perl installed on your computer.

If you need further explanation, read about Perl one-liners and maybe also about Perl itself.
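If Perl is not available, the same split-and-measure logic can be sketched in Python. This is only an approximation of the one-liner above, not the answer author's code: the names filter_line and filter_lines and the min_len/max_len parameters are invented for this example, and the defaults mirror the one-liner's condition ($len > 5 && $len <= 20).

```python
from typing import Iterable, Iterator


def filter_line(line: str, min_len: int = 6, max_len: int = 20) -> str:
    """Keep the segment after the first ':' only if its length is within limits."""
    fields = line.rstrip("\n").split(":")
    key = fields[0]
    # Like $F[1] in the one-liner: only the second segment is considered.
    value = fields[1] if len(fields) > 1 else ""
    kept = value if min_len <= len(value) <= max_len else ""
    return f"{key}:{kept}"


def filter_lines(lines: Iterable[str]) -> Iterator[str]:
    """Apply filter_line to every input line, like Perl's -n loop."""
    for line in lines:
        yield filter_line(line)


if __name__ == "__main__":
    # Sample lines from the question, plus one made-up line with a mid-length value.
    samples = [
        "jshfdgl:JSS",
        "oiadfgopiafdg:",
        "iognhif:gojdf sdofig peoji-009",
        "key:abcdefgh",
    ]
    for result in filter_lines(samples):
        print(result)
```

As in the one-liner, a value that itself contains a colon is cut at that colon, because only the second segment is examined.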