How to parse a line into a nested array - stuck

Time: 2012-05-16 18:21:05

Tags: php regex arrays preg-match

Here is what I am trying to do:

isNumeric(right(trim(contract_id),1))

isNumeric
    right
        trim
            contract_id
        1

isNumeric(right(trim(contract_id),1), bob, george(five(four, two)))

isNumeric
    right
        trim
            contract_id
        1
    bob
    george
        five
            four
            two

So basically it needs to take a line (let's say trim(var)) and turn it into an array (array(trim => array(var))).

I tried using regex and strpos, but with no result... I need help. Thanks.

1 answer:

Answer 0: (score: 1)

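First off, a recursive descent parser will always give you better control. A pure regex solution will probably have you skipping over errors, or at best letting you carry on when the input should have been rejected.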
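To make the "better control" point concrete, here is a minimal recursive descent sketch. It is written in Python purely for illustration (the question is about PHP, and this is not the code used in this answer). The grammar, the function name `parse`, and the set of characters allowed in a name are my assumptions, and quoted strings and empty parameters are not handled.

```python
def parse(text):
    """Parse 'f(a, g(b))' into nested dicts and lists (hypothetical helper)."""
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_expr():
        # expr := NAME [ '(' [ expr { ',' expr } ] ')' ]
        nonlocal pos
        skip_ws()
        start = pos
        # Assumed name characters: letters, digits, '_' and '-' (so "-1" works).
        while pos < len(text) and (text[pos].isalnum() or text[pos] in "_-"):
            pos += 1
        name = text[start:pos]
        if not name:
            raise ValueError(f"expected a name at position {pos}")
        skip_ws()
        if pos >= len(text) or text[pos] != "(":
            return name                      # plain constant or variable
        pos += 1                             # consume '('
        args = []
        skip_ws()
        if pos < len(text) and text[pos] == ")":
            pos += 1                         # empty argument list
            return {name: args}
        while True:
            args.append(parse_expr())        # recurse for each argument
            skip_ws()
            if pos < len(text) and text[pos] == ",":
                pos += 1
                continue
            if pos < len(text) and text[pos] == ")":
                pos += 1
                return {name: args}
            raise ValueError(f"expected ',' or ')' at position {pos}")

    result = parse_expr()
    skip_ws()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    return result


print(parse("isNumeric(right(trim(contract_id),1))"))
```

Unlike a regex that simply fails to match, the parser knows where it is when something goes wrong, so malformed input such as `trim(var` raises an error that names the position.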

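Your format is simple enough that a regex engine that supports internal recursion can at least get the outermost match. Then, using recursion at the language level, you can re-enter that regex on the matched text, which lets you parse the core (the part between the parentheses).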
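As a rough illustration of the "get the outermost match first" step: Python's standard `re` module has no recursive patterns, so this sketch swaps in a plain depth counter to find each balanced `name(...)` span. The name `find_calls` is hypothetical, and parentheses inside quoted strings would confuse the counter.

```python
import re

# A function name followed by its opening parenthesis.
NAME_OPEN = re.compile(r"\w+\s*\(")


def find_calls(text):
    """Return every outermost balanced 'name(...)' span found in text."""
    calls = []
    pos = 0
    while True:
        m = NAME_OPEN.search(text, pos)
        if m is None:
            return calls
        depth = 0
        end = None
        # Start counting at the '(' that ended the match.
        for i in range(m.end() - 1, len(text)):
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
                if depth == 0:
                    end = i + 1
                    break
        if end is None:
            pos = m.end()          # unbalanced: skip this opener, keep scanning
        else:
            calls.append(text[m.start():end])
            pos = end              # resume after the whole call


source = """
asdf("asg")
isNumeric(right(trim(contract_id),1))
var = 'aqfbasdn'
"""
for call in find_calls(source):
    print(call)
```

Each span returned this way can then be handed to whatever routine parses the inside of the call, which is where the language-level recursion comes in.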

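I'm no PHP expert, but if it supports regex recursion and a language-level eval(), you will be able to inject array constructs into the source text and then eval that string to create a nested array image, parameters included.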
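The "inject array constructs, then eval" idea can be sketched as follows, again in Python for illustration only. Two things here are my assumptions rather than part of this answer: the token pattern, which only covers bare names and numbers, and the use of `ast.literal_eval` in place of a full `eval()`, since it accepts only literals. Quoted strings and empty parameters are not handled.

```python
import ast
import re

# Either "name(" (a call opener), ")" (a call closer), or a bare constant.
TOKEN = re.compile(r"(?P<func>\w+)\s*\(|(?P<close>\))|(?P<const>[\w-]+)")


def to_literal(src):
    """Rewrite 'f(a, g(b))' into the text of a Python dict/list literal."""
    def repl(m):
        if m.group("func"):
            return "{%r: [" % m.group("func")   # name(  ->  {'name': [
        if m.group("close"):
            return "]}"                         # )      ->  ]}
        return repr(m.group("const"))           # bob    ->  'bob'
    return TOKEN.sub(repl, src)


def parse_via_eval(src):
    """Evaluate the rewritten text into nested dicts and lists."""
    return ast.literal_eval(to_literal(src))


print(to_literal("isNumeric(right(trim(contract_id),1))"))
print(parse_via_eval("isNumeric(right(trim(contract_id),1))"))
```

Commas are left untouched by the substitution, so they end up separating the list elements in the generated literal.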

I actually converted your text into an array in about 12 lines of Perl, but added to it as gaps turned up.

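Here is a Perl sample. It is dumbed down to be readable. It might give you some inspiration to try it in PHP (if it can do these things). Like I said, I'm no PHP expert.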

  use Data::Dumper;

  my $str = '
    asdf("asg")
    isNumeric(right(trim(contract_id),1))
    var = \'aqfbasdn\'
    isNumeric(right(trim ( ,contract_id,),-1, j( ) ,"  ", bob, george(five(four, two))))
  ';

  my $func      = '\w+';           # Allowed characters (very watered down)
  my $const     = '[\w*&^+-]+';
  my $wspconst  = '[\w*&^+\s-]+';

  my $GetRx = qr~
    \s*
    (                       # 1 Recursion group
       (?:
           \s* ($func) \s* 
           [(]
              (?:  (?> (?: (?!\s*$func\s*[(] | [)] ) . )+ ) 
                 | (?1)                                         
              )*                                               
           [)]
       )
     )                                                 
  ~xs;

  my $ParseRx = qr~
    (                        # 1 Recursion group
       (?:
           \s* ($func) \s*                                    # 2 Function name
           [(]
           (                                                  # 3 Function core
              (?:  (?> (?: (?!\s*$func\s*[(] | [)] ) . )+ ) 
                 | (?1)                                         
              )*                                               
           )                                                   
           [)]
                                         # OR..other stuff
                                         # Note that this block of |'s is where               
                                         # to put code to parse constants, strings,
                                         # delimiters, etc ... Not much done, but
                                         # here is where that goes.
                                         # -----------------------------------------
         |  \s*["'] ($wspconst) ["']\s*      # 4,5 Variable constants
         | \s* ($const) \s* 
                                         # Lastly, accept empty parameters, if
         | (?<=,)                        # a comma behind us,
         | (?<=^)(?!\s*$)                # or beginning of a new 'core' if actually a parameter.
       )       
     )                                                 
  ~xs;

##
  print "Source string:\n$str\n";
  print "=======================================\n";
  print "Searching string for functions ...\n";
  print "=======================================\n\n";


  while ($str =~ /$GetRx/g) {
      print "------------------\nParsing:\n$1\n\n";
      my $res = parse_func($1);
      print "String to be eval()'ed:\n$res\n\n";

      my $hashref = eval $res.";";
      print "Hash from eval()'ed string:\n", Dumper( $hashref ), "\n\n";
  }

###
  sub parse_func
  {
      my ($core) = @_;
      $core =~ s/$ParseRx/ parse_callback($2, $3, "$4$5") /eg;
      return $core;
  }

  sub parse_callback
  {
      my ($fname, $fbody, $fconst) = @_;
      if (defined $fbody) {
          return "{'$fname'=>[" . (parse_func( $fbody )) . "]}";
      }
      return "'$fconst'"
  }

Output:

Source string:

    asdf("asg")
    isNumeric(right(trim(contract_id),1))
    var = 'aqfbasdn'
    isNumeric(right(trim ( ,contract_id,),-1, j( ) ,"  ", bob, george(five(four, two))))

=======================================
Searching string for functions ...
=======================================

------------------
Parsing:
asdf("asg")

String to be eval()'ed:
{'asdf'=>['asg']}

Hash from eval()'ed string:
$VAR1 = {
          'asdf' => [
                      'asg'
                    ]
        };


------------------
Parsing:
isNumeric(right(trim(contract_id),1))

String to be eval()'ed:
{'isNumeric'=>[{'right'=>[{'trim'=>['contract_id']},'1']}]}

Hash from eval()'ed string:
$VAR1 = {
          'isNumeric' => [
                           {
                             'right' => [
                                          {
                                            'trim' => [
                                                        'contract_id'
                                                      ]
                                          },
                                          '1'
                                        ]
                           }
                         ]
        };


------------------
Parsing:
isNumeric(right(trim ( ,contract_id,),-1, j( ) ,"  ", bob, george(five(four, two))))

String to be eval()'ed:
{'isNumeric'=>[{'right'=>[{'trim'=>['' ,'contract_id','']},'-1',{'j'=>[ ]} ,'  ','bob',{'george'=>[{'five'=>['four','two']}]}]}]}

Hash from eval()'ed string:
$VAR1 = {
          'isNumeric' => [
                           {
                             'right' => [
                                          {
                                            'trim' => [
                                                        '',
                                                        'contract_id',
                                                        ''
                                                      ]
                                          },
                                          '-1',
                                          {
                                            'j' => []
                                          },
                                          '  ',
                                          'bob',
                                          {
                                            'george' => [
                                                          {
                                                            'five' => [
                                                                        'four',
                                                                        'two'
                                                                      ]
                                                          }
                                                        ]
                                          }
                                        ]
                           }
                         ]
        };
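
To get from a nested structure like the ones dumped above to the indented listing shown in the question, a short recursive walk is enough. This is a Python sketch for illustration; the function name `render` and the four-space indent are my choices, and the `tree` literal is typed in by hand to mirror the second dump.

```python
def render(node, depth=0, indent="    "):
    """Return the lines of an indented listing for a nested dict/list tree."""
    lines = []
    if isinstance(node, dict):
        # A function call: print its name, then each argument one level deeper.
        for name, args in node.items():
            lines.append(indent * depth + name)
            for arg in args:
                lines.extend(render(arg, depth + 1, indent))
    else:
        # A plain constant or variable name.
        lines.append(indent * depth + str(node))
    return lines


tree = {"isNumeric": [{"right": [{"trim": ["contract_id"]}, "1"]}]}
print("\n".join(render(tree)))
```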