Replacing whitespace with PHP's preg_replace function while ignoring quoted strings

Asked: 2010-05-24 09:00:32

Tags: php string whitespace pattern-matching preg-replace

Please look at the following string:

SELECT
    column1 ,
    column2, column3
FROM
    table1
WHERE
    column1 = 'text, "FROM" \'from\\\' x' AND
    column2 = "sample text 'where' \"where\\\" " AND
    ( column3 = 5 )

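I need to remove unnecessary whitespace characters from the string, for example:

  • remove spaces before and after punctuation such as , ( ) etc.
  • remove line breaks (\r\n) and tabs (\t)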

There is one catch, though: the removal must not touch whitespace inside quoted strings, such as:

  • 'text, "FROM" \'from\\\' x'
  • "sample text 'where' \"where\\\" "

I need to use the PHP function preg_replace($pattern, $replacement, $string);

So what should the values of $pattern and $replacement be, given that $string holds the SQL above?

1 Answer:

Answer 0 (score: 1)

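A single regex pattern and replacement string won't do the job. Your first step could be to tokenize the input string: try to match comments and string literals first, then runs of whitespace characters, and finally runs of non-whitespace characters. A quick demo: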

$text = <<<BLOCK
SELECT
    column1 ,
    column2, column3
FROM
    table1
-- a comment line ' " ...
WHERE
    column1 = 'text, "FROM" \\'from\\\\\\' x' AND
    column2 = "sample text 'where' \\"where\\\\\\" " AND
    ( column3 = 5 )
BLOCK;

echo $text . "\n\n";

preg_match_all('/
    --[^\r\n]*                # a comment line
    |                         # OR
    \'(?:\\\\.|[^\'\\\\])*\'  # a single quoted string
    |                         # OR
    "(?:\\\\.|[^"\\\\])*"     # a double quoted string
    |                         # OR
    `[^`]*`                   # a string surrounded by backticks
    |                         # OR
    \s+                       # one or more space chars
    |                         # OR
    \S+                       # one or more non-space chars
/x', $text, $matches);

print_r($matches);

which produces:

SELECT
    column1 ,
    column2, column3
FROM
    table1
-- a comment line ' " ...
WHERE
    column1 = 'text, "FROM" \'from\\\' x' AND
    column2 = "sample text 'where' \"where\\\" " AND
    ( column3 = 5 )

Array
(
    [0] => Array
        (
            [0] => SELECT
            [1] => 

            [2] => column1
            [3] =>  
            [4] => ,
            [5] => 

            [6] => column2,
            [7] =>  
            [8] => column3
            [9] => 

            [10] => FROM
            [11] => 

            [12] => table1
            [13] => 

            [14] => -- a comment line ' " ...
            [15] => 

            [16] => WHERE
            [17] => 

            [18] => column1
            [19] =>  
            [20] => =
            [21] =>  
            [22] => 'text, "FROM" \'from\\\' x'
            [23] =>  
            [24] => AND
            [25] => 

            [26] => column2
            [27] =>  
            [28] => =
            [29] =>  
            [30] => "sample text 'where' \"where\\\" "
            [31] =>  
            [32] => AND
            [33] => 

            [34] => (
            [35] =>  
            [36] => column3
            [37] =>  
            [38] => =
            [39] =>  
            [40] => 5
            [41] =>  
            [42] => )
        )

)

You can then iterate over the tokenized $matches array and replace the whitespace matches as you see fit.
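One possible way to do that iteration is sketched here. The function name squeeze_sql_whitespace is made up for illustration and is not part of the original answer. It reuses the demo's pattern unchanged, collapses every whitespace token to a single space, and leaves comments and quoted strings as they are. It does not tighten spaces around `,` or parentheses, because the `\S+` branch does not split punctuation into its own token.

```php
<?php
// Hypothetical helper (an assumption, not the answer's code): tokenize with the
// demo's pattern, then rebuild the string with each whitespace run collapsed.
function squeeze_sql_whitespace(string $sql): string
{
    $pattern = '/
        --[^\r\n]*                # a comment line
        |                         # OR
        \'(?:\\\\.|[^\'\\\\])*\'  # a single quoted string
        |                         # OR
        "(?:\\\\.|[^"\\\\])*"     # a double quoted string
        |                         # OR
        `[^`]*`                   # a string surrounded by backticks
        |                         # OR
        \s+                       # one or more space chars
        |                         # OR
        \S+                       # one or more non-space chars
    /x';
    preg_match_all($pattern, $sql, $matches);

    $out = '';
    foreach ($matches[0] as $token) {
        if (preg_match('/\A\s+\z/', $token)) {
            // Whitespace token: emit one space, but nothing at the very start
            // or directly after the newline that terminates a comment.
            if ($out !== '' && substr($out, -1) !== "\n") {
                $out .= ' ';
            }
        } elseif (strncmp($token, '--', 2) === 0) {
            // Comment token: keep it and end the line so it cannot swallow
            // the SQL that follows.
            $out .= $token . "\n";
        } else {
            // Quoted strings and everything else are copied verbatim.
            $out .= $token;
        }
    }
    return trim($out);
}
```

Calling squeeze_sql_whitespace("SELECT\n    a ,\n\tb\nFROM t") returns "SELECT a , b FROM t"; whitespace inside a matched quoted string is left untouched.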

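But as you may have read in my comment (which I have since deleted), a better option would be to use a dedicated SQL parser to perform this kind of tokenization: I'm not fluent in SQL, but I believe my demo above is easily broken.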
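To make that fragility concrete, here is one input that trips the demo pattern. This is a small sketch, and demo_tokenize is a hypothetical wrapper name. When a quoted string is not preceded by whitespace, the `\S+` branch starts matching before the quote and runs straight through it, so the spaces inside the literal come out as an ordinary whitespace token.

```php
<?php
// Hypothetical wrapper around the demo's pattern, used only to show a failure case.
function demo_tokenize(string $sql): array
{
    $pattern = '/
        --[^\r\n]*                # a comment line
        |                         # OR
        \'(?:\\\\.|[^\'\\\\])*\'  # a single quoted string
        |                         # OR
        "(?:\\\\.|[^"\\\\])*"     # a double quoted string
        |                         # OR
        `[^`]*`                   # a string surrounded by backticks
        |                         # OR
        \s+                       # one or more space chars
        |                         # OR
        \S+                       # one or more non-space chars
    /x';
    preg_match_all($pattern, $sql, $matches);
    return $matches[0];
}

// With spaces around "=", the literal is matched as one token:
// tokens: c | " " | = | " " | 'x  y'
print_r(demo_tokenize("c = 'x  y'"));

// Without them, \S+ consumes the opening quote and the literal is split:
// tokens: c='x | "  " | y'
print_r(demo_tokenize("c='x  y'"));
```

Any whitespace cleanup built on these tokens would then alter the contents of the second string, which is one reason a real SQL tokenizer is the safer route.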