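Regex pattern for formulas containing numbers, letters, operators, and brackets

Date: 2019-06-28 10:38:54

Tags: c# regex parsing

I am trying to write a regular expression for formulas like the following examples.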

  1. C=A+B => Output for match will be {A, +, B}
  2. D= C+50 => Output for match will be {C, +, 50}
  3. E = (A+B)*C -100 => Output for match will be {(, A, +, B, ), *, C, -, 100}

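I tried the following regular expression:

[A-Z(\d*)*+/-]

However, it does not produce the correct output for A+50: the matches are {A, +, 5, 0} instead of {A, +, 50}.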
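
The behaviour can be reproduced with a short sketch (the class and method names here are hypothetical, chosen only for illustration). A character class matches exactly one character, and inside the brackets the `*` is a literal asterisk rather than a quantifier, so each digit of 50 comes back as its own match.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CharClassRepro {
  // Hypothetical helper: returns every match of the pattern from the question.
  public static List<string> Tokens(string input) {
    // Inside [...] the characters ( ) * + / - are literals and \d matches one digit,
    // so every match is exactly one character long.
    return Regex.Matches(input, @"[A-Z(\d*)*+/-]")
      .Cast<Match>()
      .Select(m => m.Value)
      .ToList();
  }
}
```

Calling `string.Join(", ", CharClassRepro.Tokens("A+50"))` gives `A, +, 5, 0`, which matches the output reported above.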

2 answers:

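Answer 0 (score: 4)

I suggest using an FSM (finite state machine) rather than a regular expression. There are 3 states here:

  1. Neither variable nor number (state 0)
  2. Within a variable (state 1)
  3. Within a number (state 2)

Code: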

private static IEnumerable<string> Parse(string formula) {
  int state = 0;

  StringBuilder buffer = new StringBuilder();

  foreach (var c in formula) {
    if (state == 0) { // neither var nor number
      if (char.IsWhiteSpace(c))
        continue;

      if (char.IsDigit(c)) {
        buffer.Append(c);
        state = 2;
      }
      else if (char.IsLetter(c)) {
        buffer.Append(c);
        state = 1;
      } 
      else 
        yield return c.ToString();
    }
    else if (state == 1) { // within variable
      if (char.IsDigit(c) || char.IsLetter(c))
        buffer.Append(c);
      else {
        yield return buffer.ToString();
        buffer.Clear(); 

        state = 0;

        if (!char.IsWhiteSpace(c))
          yield return c.ToString();
      }
    }
    else if (state == 2) { // within number
      if (char.IsDigit(c))
        buffer.Append(c);
      else if (char.IsLetter(c)) {
        // 123abc we turn into 123 * abc
        yield return buffer.ToString();
        buffer.Clear();

        state = 1; 

        yield return "*";

        buffer.Append(c);
      }
      else {
        yield return buffer.ToString();
        buffer.Clear();

        state = 0;

        if (!char.IsWhiteSpace(c))
          yield return c.ToString();
      } 
    }
  } 

  if (buffer.Length > 0)
    yield return buffer.ToString();
}

Demo:

  string[] tests = new string[] {
    "C=A+B",
    "D= C+50",
    "E = (A+B)*C -100",
  };

  string result = string.Join(Environment.NewLine, tests
    .Select(test => new {
      formula = test,
      parsed = Parse(test)
        .SkipWhile(term => term != "=") // we don't want "C = " or alike part
        .Skip(1)
    })
    .Select(test => $"{test.formula,-20} => {string.Join(", ", test.parsed)}"));

  Console.Write(result);

Result:

C=A+B                => A, +, B
D= C+50              => C, +, 50
E = (A+B)*C -100     => (, A, +, B, ), *, C, -, 100

Answer 1 (score: 1)

Use | (or) to combine a pattern for each kind of item, e.g.

\d+|\W|\w

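This reads as: a run of one or more digits (\d+), or any single non-word character (\W), or any single word character (\w).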
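
A minimal sketch of how this pattern might be applied with `Regex.Matches` (the `RegexTokenizer` class and its method names are hypothetical). Two caveats follow from the pattern itself: `\W` also matches whitespace, so blank matches have to be filtered out, and `\w` matches a single character, so a multi-letter name such as abc is split into a, b, c. The second method shows one possible variant, `\d+|[A-Za-z]\w*|\S`, which is an assumption of this sketch and not part of the answer; it keeps multi-character identifiers together and never matches whitespace.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexTokenizer {
  // Pattern from the answer: a run of digits, or one non-word char, or one word char.
  public static List<string> Tokenize(string formula) {
    return Regex.Matches(formula, @"\d+|\W|\w")
      .Cast<Match>()
      .Select(m => m.Value)
      .Where(t => !string.IsNullOrWhiteSpace(t)) // \W also matches spaces; drop them
      .ToList();
  }

  // Variant (an assumption, not from the answer): a run of digits, or an identifier
  // starting with a letter, or any single non-whitespace character.
  public static List<string> TokenizeIdentifiers(string formula) {
    return Regex.Matches(formula, @"\d+|[A-Za-z]\w*|\S")
      .Cast<Match>()
      .Select(m => m.Value)
      .ToList();
  }
}
```

Both methods return the left-hand side as well, so `Tokenize("D= C+50")` yields D, =, C, +, 50. The part up to and including "=" can be dropped with the same `SkipWhile(term => term != "=").Skip(1)` used in the demo of the first answer.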