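Splitting a math formula into an array using a regular expression

Time: 2019-04-02 16:58:43

Tags: regex ruby

I want to take a flat string formula and split it into an array, dividing it according to a few rules. I am stuck on the parentheses and would appreciate some help.

I have been using a regex scan plus a few filters to try to get the resulting array.

My current tests look like this: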

describe 'split algorithm' do
  it 'can split a flat algorithm' do
    algo = 'ABC * DEF * GHI Round(3) = JKL * MNO * PQR Round(0) = SAVE'
    actual = split_algo(algo)
    expected = ['ABC', '* DEF', '* GHI', 'Round(3)', '= JKL', '* MNO', '* PQR', 'Round(0)', '= SAVE']
    expect(actual).to eq expected
  end

  it 'can split an algorithm with parenthesis' do
    algo = '(ABC + DEF + (GHI * JKL)) - ((MNO + PQR + (STU * VWX)) * YZ) Round(0) + SUM(AAA) = SAVE'
    actual = split_algo(algo)
    expected = ['(', 'ABC', '+ DEF', '+', '(', 'GHI', '* JKL', ')', ')', '-', '(', '(', 'MNO', '+ PQR', '+', '(', 'STU', '* VWX', ')', ')', '* YZ', ')', 'Round(0)', '+ SUM', '(', 'AAA', ')', '= SAVE']
    expect(actual).to eq expected
  end
end

With the following code, I can get the first test to pass:

def split_algo(algorithm)
   pattern = /(?:(\ (\*\ |\+\ |\-\ |\\\ |\=\ )\S*))|(\S*)/
   matches = algorithm.scan(pattern)
   matches.each_with_index { |match, index| matches[index]=match.compact }
   arr = []
   matches.each do |match|
     arr << match.max_by(&:length).strip
   end
   arr.delete('')
   arr
end

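I tried modifying the pattern to accept a parenthesis matcher like this:

pattern = /(\(|\))|(?:(\ (\*\ |\+\ |\-\ |\\\ |\=\ )\S*))|(\S*)/

But this only captures the parenthesis at the start of the formula.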
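To see why only the leading parenthesis is captured, it can help to run the modified pattern over the second test formula. The sketch below is an illustration rather than the asker's code: `PAREN_PATTERN` and `tokens_for` are hypothetical names, and the helper simply repeats the post-processing from split_algo. Because scan works left to right, the `\S*` at the end of the operator branch swallows any parenthesis attached to the following word before the parenthesis branch gets a chance to match it.

```ruby
# Modified pattern from the question, with the paren alternative listed first.
PAREN_PATTERN = /(\(|\))|(?:(\ (\*\ |\+\ |\-\ |\\\ |\=\ )\S*))|(\S*)/

# Hypothetical helper: same post-processing as split_algo in the question
# (keep the longest captured group of each match, strip it, drop empties).
def tokens_for(pattern, algorithm)
  algorithm.scan(pattern)
           .map { |groups| groups.compact.max_by(&:length).strip }
           .reject(&:empty?)
end

algo = '(ABC + DEF + (GHI * JKL)) - ((MNO + PQR + (STU * VWX)) * YZ) Round(0) + SUM(AAA) = SAVE'
tokens = tokens_for(PAREN_PATTERN, algo)

# The first paren stands alone, but the next one stays glued to its operator token.
puts tokens.first(4).inspect
# prints ["(", "ABC", "+ DEF", "+ (GHI"]
```

Only a parenthesis that happens to sit exactly where a new match begins is picked up by the first alternative, which in this formula is the opening one.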

2 answers:

Answer 0: (score: 0)

I ended up with the following, which seems to work:

I added a call to a new method, split_paren(arr), at the end of split_algo.

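def split_paren(algo_arr)
  pattern = /Round\(\d*\)/ # Round(n) steps must stay in one piece
  arr = []
  algo_arr.each do |step|
    # Split around each paren; the capture group keeps the parens in the result
    f = step.split(/(\(|\))/) unless step =~ pattern
    f.delete('') if f.class == Array # drop empty pieces left by adjacent parens
    f.nil? ? arr << step : f.each{|s| arr << s.strip}
  end
  arr
end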
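The same post-processing step can also be sketched with flat_map, which avoids the temporary array and the nil check. This is only an illustrative variant: `ROUND_CALL` and `split_paren_compact` are hypothetical names, and unlike the method above it strips each piece before discarding empty ones, so whitespace-only pieces are dropped as well.

```ruby
# Illustrative variant of split_paren; the names are not from the answer.
ROUND_CALL = /Round\(\d*\)/

def split_paren_compact(steps)
  steps.flat_map do |step|
    # Leave Round(n) steps untouched so their parens are not split off.
    next [step] if step.match?(ROUND_CALL)

    # The capture group makes split return the parens as separate pieces.
    step.split(/([()])/).map(&:strip).reject(&:empty?)
  end
end

steps = ['(ABC', '+ (GHI', '* JKL))', 'Round(0)', '+ SUM(AAA)']
result = split_paren_compact(steps)
p result
```

As with the original method, a step that contains both a Round(n) call and another parenthesis would be passed through unsplit.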

If anyone wants to respond with a better way, please feel free. Otherwise, I will accept my own answer and close this out shortly.

Answer 1: (score: 0)

We can define the following regular expression.

R = /
    # split after an open paren if not followed by a digit
    (?<=\()      # match is preceded by an open paren, pos lookbehind
    (?!\d)       # match is not followed by a digit, neg lookahead
    [ ]*         # match >= 0 spaces
    |            # or
    # split before an open paren if paren not followed by a digit
    (?=          # begin pos lookahead
      \(         # match a left paren...
      (?!\d)     # ...not followed by a digit, neg lookahead
    )            # end pos lookahead
    [ ]*         # match >= 0 spaces
    |            # or
    # split before a closed paren if paren not preceded by a digit
    (?<!\d)      # do not follow a digit, neg lookbehind
    (?=\))       # match a closed paren, pos lookahead
    [ ]*         # match >= 0 spaces
    |            # or
    # split after a closed paren
    (?<=\))      # match a preceding closed paren, pos lookbehind
    [ ]*         # match >= 0 spaces
    |            # or
    # match spaces not preceded by *, =, +, / or -
    (?<![*=+\/-]) # match is not preceded by one of '*=+\/-', neg lookbehind
    [ ]+         # match one or more spaces
    |            # or
    # match spaces followed by a left paren
    [ ]+         # match one or more spaces
    (?=\()       # match a left paren, pos lookahead
    /x           # free-spacing regex definition mode

In the first example, we have the following.

algo1 = 'ABC * DEF * GHI Round(3) = JKL * MNO * PQR Round(0) = SAVE'
expected1 = ['ABC', '* DEF', '* GHI', 'Round(3)', '= JKL', '* MNO',
             '* PQR', 'Round(0)', '= SAVE']
algo1.split(R) == expected1
  #=> true

In the second example, we have the following.

algo2 = '(ABC + DEF + (GHI * JKL)) - ((MNO + PQR + (STU * VWX)) * YZ) Round(0) + SUM(AAA) = SAVE'
expected2 = ['(', 'ABC', '+ DEF', '+', '(', 'GHI', '* JKL', ')', ')', '-',
             '(', '(', 'MNO', '+ PQR', '+', '(', 'STU', '* VWX', ')', ')',
             '* YZ', ')', 'Round(0)', '+ SUM', '(', 'AAA', ')', '= SAVE']
algo2.split(R) == expected2
  #=> true

In free-spacing mode I put each space in a character class ([ ]); otherwise the spaces would be removed before the expression is evaluated. That is not necessary when the regular expression is written conventionally, without free-spacing mode.
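The conventional, single-line form of the expression is not shown above, so the following is a reconstruction under stated assumptions: it is obtained by removing the comments and layout from R and writing each `[ ]` as a plain space. The constant name `R_ONE_LINE` is illustrative and not part of the answer.

```ruby
# One-line equivalent of R: comments and line breaks removed,
# and [ ] written as a plain space (safe outside free-spacing mode).
R_ONE_LINE = /(?<=\()(?!\d) *|(?=\((?!\d)) *|(?<!\d)(?=\)) *|(?<=\)) *|(?<![*=+\/-]) +| +(?=\()/

algo1 = 'ABC * DEF * GHI Round(3) = JKL * MNO * PQR Round(0) = SAVE'
expected1 = ['ABC', '* DEF', '* GHI', 'Round(3)', '= JKL', '* MNO',
             '* PQR', 'Round(0)', '= SAVE']

algo2 = '(ABC + DEF + (GHI * JKL)) - ((MNO + PQR + (STU * VWX)) * YZ) Round(0) + SUM(AAA) = SAVE'
expected2 = ['(', 'ABC', '+ DEF', '+', '(', 'GHI', '* JKL', ')', ')', '-',
             '(', '(', 'MNO', '+ PQR', '+', '(', 'STU', '* VWX', ')', ')',
             '* YZ', ')', 'Round(0)', '+ SUM', '(', 'AAA', ')', '= SAVE']

# Both comparisons are expected to hold, matching the free-spacing version.
puts algo1.split(R_ONE_LINE) == expected1
puts algo2.split(R_ONE_LINE) == expected2
```

The free-spacing version is easier to read and maintain; the one-line form is mainly useful for pasting into tools that do not support the x flag.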