How do I split a list of SQL queries from a file?

Asked: 2010-08-27 23:39:07

Tags: php mysql regex

If every query ends with ;, I can explode on that character, but what should I do when ; appears inside a field?

For example:

[...]Select * From my_data where idk=';';\nSelect [...]  

[...]Select * From my_data where idk=';\n';Select [...]

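My file contains all kinds of queries, including INSERTs, and may have syntax variations like the ones shown above, where ; is followed by a newline, sometimes inside a field.

How do I handle this?

PHP functions such as explode will fail; would eregi or preg_match work?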

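3 answers:

Answer 0 (score: 2)

I suggest writing a very simple parser. It works like a state machine that operates on one character at a time: the state machine below consumes characters until it finds a ; that is not inside a single-quoted field.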

// no guarantees that this is fast or efficient
// PHP isn't the greatest language for this sort of thing
$chars = str_split(file_get_contents('filename.txt'));
$state = 0; // 0 = not in field, 1 = in field, 2 = in field, escaped char
$query = "";
// loop over all characters in the file
foreach ($chars as $c) {
    // no matter what, append character to current query
    $query .= $c;

    // now for the state machine
    switch( $state ){
        case 0:
            if( $c == "'" ){
               $state = 1;
            }else if( $c == ";" ){
               // have a full query, do something with it
               // say, write $query to file
               // now reset $query
               $query = "";
            }
            break;
        case 1:
            if( $c == "'" ){
                // if the current character is an unescaped single quote
                // we have exited this field (so back to state 0)
                $state = 0;
            }else if( $c == "\\" ){
                // we found a backslash and so must temporarily
                // sit in a different state (avoids the sequence \')
                // and deals appropriately with \\'
                $state = 2;
            }
            break;
        case 2:
            // we can escape any char, to get here we were in a field
            // so to a field we must return
            $state = 1;
    }
}
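
For reference, the same state machine can be wrapped in a function and run against the two examples from the question. This is only a sketch: the name split_on_unquoted_semicolons is made up here, and it handles nothing beyond single-quoted fields with backslash escapes, just like the loop above.

```php
<?php
// Hypothetical helper name (not from the answer above).
// Splits $sql on every ";" that is outside a single-quoted field.
function split_on_unquoted_semicolons(string $sql): array
{
    $statements = [];
    $query = '';
    $state = 0; // 0 = not in field, 1 = in field, 2 = in field after a backslash

    foreach (str_split($sql) as $c) {
        $query .= $c;
        if ($state === 0) {
            if ($c === "'") {
                $state = 1;
            } elseif ($c === ';') {
                // a full statement has been collected
                $statements[] = trim($query);
                $query = '';
            }
        } elseif ($state === 1) {
            if ($c === "'") {
                $state = 0;
            } elseif ($c === '\\') {
                $state = 2;
            }
        } else {
            // the escaped character has been consumed, back into the field
            $state = 1;
        }
    }

    // keep a trailing statement that has no terminating ";"
    if (trim($query) !== '') {
        $statements[] = trim($query);
    }
    return $statements;
}

$sql = "Select * From my_data where idk=';';\nSelect * From my_data where idk=';\n';Select 1";
print_r(split_on_unquoted_semicolons($sql));
```

Each ; inside the quotes stays within its statement, and the last statement is kept even though it has no terminating ;.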

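Answer 1 (score: 0)

Reliably and in general? You would have to write a parser that understands not only SQL but also the intricacies of MySQL's non-standard SQL dialect. For example, it has to cope with:

  • single-quoted strings with \-escapes, so that \' does not end the string (but \\' does), unless the NO_BACKSLASH_ESCAPES sql_mode option is in use;
  • double-quoted strings, which are string literals with similar escaping rules, unless ANSI_QUOTES is turned on;
  • backtick-quoted schema names (which is what double quotes are supposed to be for);
  • comments such as /* ... */, # and --, where -- only starts a comment when it is followed by whitespace.

This is a huge pain, which is why it is usually best to avoid gluing multiple SQL statements together.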
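
To give an idea of what coping with the cases above looks like, here is a rough sketch of a scanner. It is not a complete MySQL parser. It assumes the default sql_mode (backslash escapes on, ANSI_QUOTES off), knows nothing about the DELIMITER command used by the mysql client, and the function name split_mysql_statements is made up for this example.

```php
<?php
// Hypothetical helper; assumes the default sql_mode
// (backslash escapes enabled, ANSI_QUOTES disabled).
function split_mysql_statements(string $sql): array
{
    $statements = [];
    $len = strlen($sql);
    $start = 0; // offset where the current statement begins
    $i = 0;

    while ($i < $len) {
        $c = $sql[$i];
        if ($c === "'" || $c === '"' || $c === '`') {
            // Quoted string or identifier: skip to the matching quote.
            // A doubled quote ('it''s') is simply seen as two adjacent
            // quoted runs, which is good enough for splitting.
            $i++;
            while ($i < $len && $sql[$i] !== $c) {
                // A backslash escapes the next character, except inside backticks.
                if ($sql[$i] === '\\' && $c !== '`') {
                    $i++;
                }
                $i++;
            }
            $i++; // step past the closing quote
        } elseif ($c === '#'
            || (substr($sql, $i, 2) === '--'
                && ($i + 2 >= $len || ctype_space($sql[$i + 2])))) {
            // Line comment: "#", or "--" followed by whitespace or end of input.
            $nl = strpos($sql, "\n", $i);
            $i = ($nl === false) ? $len : $nl + 1;
        } elseif (substr($sql, $i, 2) === '/*') {
            // Block comment: skip to the closing "*/".
            $end = strpos($sql, '*/', $i + 2);
            $i = ($end === false) ? $len : $end + 2;
        } elseif ($c === ';') {
            // Unquoted, uncommented ";" ends the current statement.
            $statements[] = trim(substr($sql, $start, $i - $start + 1));
            $i++;
            $start = $i;
        } else {
            $i++;
        }
    }

    // Keep a trailing statement that has no terminating ";".
    $tail = trim(substr($sql, $start));
    if ($tail !== '') {
        $statements[] = $tail;
    }
    return $statements;
}

$sql = "SELECT ';' AS a, \";\" AS b FROM `odd;name`; -- trailing; comment\n"
     . "INSERT INTO t VALUES ('it''s', 'a\\';b');";
print_r(split_mysql_statements($sql));
```

Comments are skipped rather than stripped, so a comment that sits between two statements stays attached to the statement that follows it.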

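Answer 2 (score: 0)

There aren't that many distinct statement types, so what if you split on ; followed by optional whitespace and a statement keyword (;\s*statement), listing every statement keyword you need?

Example: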

$queries = <<<END
select * from table where text=";";


insert into table(adsf,asdf,fff) values(null,'text;','adfasdf');
update table set this="adasdf;";

alter table add column;
select * from table where text=";";


insert into table(adsf,asdf,fff) values(null,'text;','adfasdf');
update table set this="adasdf;";

alter table add column;
END;

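// split on ";" + optional whitespace + a statement keyword, keeping the keyword as a delimiter
$statements = preg_split( '~;\s*(select.*?)|;\s*(insert.*?)|;\s*(update.*?)|;\s*(alter.*?)~i', $queries, -1, PREG_SPLIT_DELIM_CAPTURE );

// the first chunk is already a complete statement
$good_statements = array( array_shift( $statements ) );
$i = 0; // even: chunk is a captured keyword, odd: chunk is the rest of that statement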

foreach( $statements as $statement ) {

    if ( $statement == '' ) continue;
    if ( !($i % 2) ) {
        $statement_action = $statement;
        echo $statement;
        $i++;
    }
    else {
        $good_statements[] = sprintf( "%s %s", trim( $statement_action ), trim( $statement ) );
        $i++;
    }

}


print_r($good_statements);
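
A possible simplification of the same idea, offered as a sketch rather than a drop-in replacement: putting the keyword in a lookahead keeps it in the following chunk, so the keyword and the rest of the statement never have to be glued back together. The shortened sample data and the $keywords variable are made up for this example.

```php
<?php
// Shortened, hypothetical sample data in the spirit of the example above.
$queries = "select * from t where text=\";\";\n\n"
         . "insert into t(a,b) values(null,'text;');\n"
         . "update t set a=\"x;\";\n"
         . "alter table t add column c int;";

// Hypothetical keyword list; extend it with every statement type the file uses.
$keywords = 'select|insert|update|alter';

// Split at ";" + optional whitespace only when a statement keyword follows.
// The lookahead (?=...) does not consume the keyword, so it stays in the next chunk.
$statements = preg_split('~;\s*(?=(?:' . $keywords . ')\b)~i', $queries);
$statements = array_map('trim', $statements);

print_r($statements);
```

The separating ; is consumed by the split, so only the last statement keeps its semicolon. Like the original pattern, this still breaks if a string literal happens to contain a ; followed by one of the keywords.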