How to select words with a specific number of characters using a regular expression

Date: 2013-03-22 11:02:26

Tags: regex text

My text is as follows.

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum 
has been the industry's standard dummy text ever since the fivec harword 1500s, when an unknown printer 
took a galley of type and scrambled it to make a type specimen fivec harword book. It has survived not
only five centuries, but also the leap into electronic typesetting, remaining essentially 
unchanged. It was popularised in the 1960s with the release of fivec harword Letraset sheets containing 
Lorem Ipsum passages, and more recently with desktop publishing software like Aldus 
PageMaker including versions of Lorem Ipsum.

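Here is what I need the regular expression to do:

1- Select five characters.

2- Select one space after the first step.

3- Select seven characters after the second step.

It should capture every "fivec harword" string. How can I do this?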

3 answers:

Answer 0: (score: 2)

Use this:

\b\w{5}\s\w{7}\b

Explanation

The regular expression:

(?-imsx:\b\w{5}\s\w{7}\b)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
  \w{5}                    word characters (a-z, A-Z, 0-9, _) (5
                           times)
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  \w{7}                    word characters (a-z, A-Z, 0-9, _) (7
                           times)
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
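
As a quick check of the pattern above, here is a minimal Python sketch. The choice of Python's `re` module is an assumption (the question does not name a regex engine), and the text is a shortened excerpt of the sample from the question.

```python
import re

# Shortened excerpt of the sample text from the question.
text = (
    "Lorem Ipsum has been the industry's standard dummy text ever since "
    "the fivec harword 1500s, when an unknown printer took a galley of type "
    "and scrambled it to make a type specimen fivec harword book."
)

# \b      word boundary
# \w{5}   exactly five word characters
# \s      one whitespace character
# \w{7}   exactly seven word characters
# \b      word boundary
pattern = re.compile(r"\b\w{5}\s\w{7}\b")

matches = pattern.findall(text)
print(matches)
```

The two `\b` anchors are what keep the pattern from matching inside longer words: "unknown printer" is skipped because "unkno" is followed by "w", not by whitespace.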

Answer 1: (score: 1)

This should do the trick:

(^|\W)\w{5}\s\w{7}($|\W)

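(^|\W) the start of the string or a non-word character

\w{5} a run of 5 word characters

\s a whitespace character

\w{7} a run of 7 word characters

($|\W) the end of the string or a non-word character

If you specifically want whitespace around the string (rather than punctuation, etc.), replace \W with \s.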
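
One thing to be aware of with this pattern is that `(^|\W)` and `($|\W)` consume the neighbouring character, so it becomes part of the match and is no longer available to start the next one. The sketch below illustrates this in Python (the language and the lookaround variant are assumptions added for illustration, not part of the original answer).

```python
import re

# Two target strings separated by a single space.
text = "fivec harword fivec harword"

# Pattern from the answer: the surrounding non-word character is consumed.
consuming = re.compile(r"(^|\W)\w{5}\s\w{7}($|\W)")
consumed = [m.group(0) for m in consuming.finditer(text)]

# Hypothetical variant using lookarounds, which assert "no word character
# here" without consuming anything.
lookaround = re.compile(r"(?<!\w)\w{5}\s\w{7}(?!\w)")
clean = [m.group(0) for m in lookaround.finditer(text)]

print(consumed)  # the first match includes the trailing space; the second is missed
print(clean)     # both occurrences, without surrounding characters
```

If the surrounding characters must be excluded but lookarounds are not available in your engine, wrapping the middle part in its own capturing group is another option.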

Answer 2: (score: 0)

Try this:

\b[a-zA-Z]{5}\s[a-zA-Z]{7}\b

\b denotes a word boundary

[a-zA-Z] any alphabetic letter

{5} exactly 5 repetitions of the preceding expression

\s a single whitespace character
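
The practical difference between a letters-only class and `\w` is that `\w` also accepts digits and underscores. A small Python sketch of that difference follows; the test string is made up for illustration and Python is an assumed engine.

```python
import re

# Made-up input: the first pair contains digits and an underscore.
text = "abc12 de_4567 fivec harword"

# \w accepts letters, digits, and underscore.
word_chars = re.compile(r"\b\w{5}\s\w{7}\b")

# [a-zA-Z] accepts ASCII letters only.
letters_only = re.compile(r"\b[a-zA-Z]{5}\s[a-zA-Z]{7}\b")

with_w = word_chars.findall(text)
with_letters = letters_only.findall(text)

print(with_w)
print(with_letters)
```

Which one is appropriate depends on whether strings such as "abc12 de_4567" should count as a five-character word followed by a seven-character word.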