Regular expression for substrings that must not start with a digit

Asked: 2014-12-29 07:04:23

Tags: c# regex

I am using "[a-z][a-z0-9]*" to find substrings:

" as4s" - finds as4s

" s + sd4" - finds s, sd4

"(4asd sad)" - finds asd, sad

" 10asd" - finds asd

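I need to change this expression so that a match is rejected when a digit comes directly before it. The results should be:

" as4s" - finds as4s

" s + sd4" - finds s, sd4

"(4asd sad)" - finds sad

" 10asd" - finds nothing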

You can test the expression with this code:

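using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "A*10+5.01E+10";
// Verbatim string (@"...") so that \d reaches the regex engine unchanged.
Regex r = new Regex(@"[a-zA-Z][a-zA-Z\d]*");
// Maps each original identifier to its generated replacement name.
var identifiers = new Dictionary<string, string>();
// Counter used to generate the replacement names i1, i2, ...
int i = 0;

MatchEvaluator me = delegate(Match m)
{
    Console.WriteLine(m);
    var variableName = m.ToString();

    if (identifiers.ContainsKey(variableName))
    {
        // Already seen: reuse the name assigned earlier.
        return identifiers[variableName];
    }
    else
    {
        i++;
        var newVariableName = "i" + i.ToString();
        identifiers[variableName] = newVariableName;
        return newVariableName;
    }
};

input = r.Replace(input, me);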

2 answers:

Answer 0 (score: 2)

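You can use word boundaries to avoid matching unwanted text, such as letters that directly follow a digit. Note the verbatim string (@"..."): in a regular C# string literal, "\b" is a backspace character and "\d" does not compile.

Regex r = new Regex(@"\b[a-zA-Z][a-zA-Z\d]*\b");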

RegEx Demo
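
To check the word-boundary pattern against the four inputs from the question, a minimal sketch could look like the following. The class name WordBoundaryDemo and the helper FindIdentifiers are hypothetical names introduced here for illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Verbatim string keeps \b and \d as regex escapes
    // rather than C# string escapes.
    private static readonly Regex Identifier =
        new Regex(@"\b[a-zA-Z][a-zA-Z\d]*\b");

    // Hypothetical helper: returns every identifier found in the input.
    public static string[] FindIdentifiers(string input)
    {
        return Identifier.Matches(input)
                         .Cast<Match>()
                         .Select(m => m.Value)
                         .ToArray();
    }

    public static void Main()
    {
        string[] samples = { " as4s", " s + sd4", "(4asd sad)", " 10asd" };
        foreach (string s in samples)
        {
            Console.WriteLine("\"{0}\" -> [{1}]", s,
                string.Join(", ", FindIdentifiers(s)));
        }
    }
}
```

With these inputs, "4asd" and "10asd" yield no match because there is no word boundary between a digit and a letter. One side effect worth knowing: the underscore also counts as a word character, so a name such as "asd_x" is skipped entirely by this pattern.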

Answer 1 (score: 1)

(?<!\d)(\b[a-z][a-z0-9]*)

Try this. Take the result from capture group 1. See the demo.

https://regex101.com/r/gX5qF3/7

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \d                       digits (0-9)
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                         and something that is not a word char
--------------------------------------------------------------------------------
    [a-z]                    any character of: 'a' to 'z'
--------------------------------------------------------------------------------
    [a-z0-9]*                any character of: 'a' to 'z', '0' to '9'
                         (0 or more times (matching the most
                         amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
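
As a rough illustration of why the \b inside the group matters, the sketch below compares the pattern from this answer with a variant that keeps only the look-behind. The variant, the class name LookbehindDemo and both helper methods are assumptions made for this comparison and do not appear in the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Pattern from the answer: group 1 holds the identifier.
    private static readonly Regex WithBoundary =
        new Regex(@"(?<!\d)(\b[a-z][a-z0-9]*)");

    // Comparison variant (an assumption, not from the answer):
    // the same look-behind with \b removed.
    private static readonly Regex WithoutBoundary =
        new Regex(@"(?<!\d)([a-z][a-z0-9]*)");

    // Collects the text of capture group 1 for every match.
    private static List<string> Group1(Regex regex, string input)
    {
        var result = new List<string>();
        foreach (Match m in regex.Matches(input))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static List<string> CapturesWithBoundary(string input)
    {
        return Group1(WithBoundary, input);
    }

    public static List<string> CapturesWithoutBoundary(string input)
    {
        return Group1(WithoutBoundary, input);
    }

    public static void Main()
    {
        foreach (string s in new[] { "(4asd sad)", " 10asd" })
        {
            Console.WriteLine("\"{0}\": with \\b [{1}], without \\b [{2}]", s,
                string.Join(", ", CapturesWithBoundary(s)),
                string.Join(", ", CapturesWithoutBoundary(s)));
        }
    }
}
```

Without \b, the look-behind only rejects the letter that directly follows the digit, so the engine moves one character forward and matches "sd" inside "4asd" and "10asd". With \b in place, a match can only begin at the start of a word, which gives the results the question asks for.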