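I'm looking for a regular expression, for use with preg_match_all in PHP 5, that lets me split a string on commas as long as the comma is not inside single quotes, while still allowing escaped single quotes. Example data would be: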
(some_array, 'some, string goes here','another_string','this string may contain "double quotes" but, it can\'t split, on escaped single quotes', anonquotedstring, 83448545, 1210597346 + '000', 1241722133 + '000')
This should produce matches like the following:
(some_array
'some, string goes here'
'another_string'
'this string may contain "double quotes" but, it can\'t split, on escaped single quotes'
anonquotedstring
83448545
1210597346 + '000'
1241722133 + '000')
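I've tried many, many regular expressions. My current one looks like this, although it doesn't match correctly 100% of the time. (It still splits on some commas inside single quotes.)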
"/'(.*?)(?<!(?<!\\\)\\\)'|[^,]+/"
Answer 0 (score: 7)
Have you tried str_getcsv? It does exactly what you need, without a regular expression.
$result = str_getcsv($str, ",", "'");
You can even implement this function in PHP versions older than 5.3 by mapping it onto fgetcsv, using this snippet from a comment in the documentation:
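if (!function_exists('str_getcsv')) {
    // Fallback for PHP < 5.3: write the string to an in-memory stream
    // and let fgetcsv parse it. $escape and $eol are accepted only for
    // signature compatibility and are not used.
    function str_getcsv($input, $delimiter = ',', $enclosure = '"', $escape = null, $eol = null) {
        $temp = fopen("php://memory", "r+"); // read/write mode
        fwrite($temp, $input);
        fseek($temp, 0); // rewind before reading
        $r = fgetcsv($temp, 4096, $delimiter, $enclosure);
        fclose($temp);
        return $r;
    }
}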
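As a quick check of the call above, here is a minimal sketch on a shortened, made-up input ($line is not the question's full data). The escape character is passed explicitly, which is an assumption on my part to keep newer PHP versions quiet; note that str_getcsv strips the enclosing quotes, unlike the matches the question asks for.

```php
<?php
// Hypothetical sample line: one field contains a comma inside single quotes.
$line = "some_array,'some, string goes here','another_string',83448545";

// Delimiter ",", enclosure "'", escape character "\" given explicitly.
$fields = str_getcsv($line, ",", "'", "\\");

print_r($fields);
// Fields: some_array | some, string goes here | another_string | 83448545
```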
Answer 1 (score: 2)
As of PHP 5.3 you can use str_getcsv to spare yourself the pain:
$data = str_getcsv($input, ",", "'");
For example...
$input=<<<STR
(some_array, 'some, string goes here','another_string','this string may contain "double quotes" but it can\'t split on escaped single quotes', anonquotedstring, 83448545, 1210597346 + '000', 1241722133 + '000')
STR;
$data=str_getcsv($input, ",", "'");
print_r($data);
outputs this:
Array
(
[0] => (some_array
[1] => some, string goes here
[2] => another_string
[3] => this string may contain "double quotes" but it can\'t split on escaped single quotes
[4] => anonquotedstring
[5] => 83448545
[6] => 1210597346 + '000'
[7] => 1241722133 + '000')
)
Answer 2 (score: 2)
With a bit of lookbehind, you can get something close to what you want:
$test = "(some_array, 'some, string goes here','another_string','this string may contain \"double quotes\" but, it can\'t split, on escaped single quotes', anonquotedstring, 83448545, 1210597346 + '000', 1241722133 + '000')";
preg_match_all('`
(?:[^,\']|
\'((?<=\\\\)\'|[^\'])*\')*
`x', $test, $result);
print_r($result);
This gives you the following result:
Array
(
[0] => Array
(
[0] => (some_array
[1] =>
[2] => 'some, string goes here'
[3] =>
[4] => 'another_string'
[5] =>
[6] => 'this string may contain "double quotes" but, it can\'t split, on escaped single quotes'
[7] =>
[8] => anonquotedstring
[9] =>
[10] => 83448545
[11] =>
[12] => 1210597346 + '000'
[13] =>
[14] => 1241722133 + '000')
[15] =>
)
[1] => Array
(
[0] =>
[1] =>
[2] => e
[3] =>
[4] => g
[5] =>
[6] => s
[7] =>
[8] =>
[9] =>
[10] =>
[11] =>
[12] => 0
[13] =>
[14] => 0
[15] =>
)
)
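The empty entries in that result come from the outer `*` quantifier, which is allowed to match nothing at each comma, and the second array is just the last character seen by the capturing group. A minimal sketch of one possible variant (my own tweak, not the pattern above): use `+` so every match is non-empty, make the inner group non-capturing, and consume escapes with `\\.` instead of a lookbehind. $test here is shortened, made-up data.

```php
<?php
// Shortened sample: the third field contains an escaped single quote.
$test = "(x, 'a, b','it\\'s, ok', 42 + '000')";

// One or more of: a character that is neither comma nor quote,
// or a whole single-quoted string in which \x escapes any character x.
$pattern = '/(?:[^,\']|\'(?:\\\\.|[^\'\\\\])*\')+/';

preg_match_all($pattern, $test, $result);

// Matches keep the whitespace that follows each comma, so trim it.
$fields = array_map('trim', $result[0]);

print_r($fields);
// Fields: (x | 'a, b' | 'it\'s, ok' | 42 + '000')
```

Because nothing is captured, $result holds only the full matches in $result[0].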
Answer 3 (score: 0)
I'd use a CSV parser here; this is what they're for.
If you insist on using a regular expression, you could use
preg_match_all(
'/\s*" # either match " (optional preceding whitespace),
(?:\\\\. # followed either by an escaped character
| # or
[^"] # any character except "
)* # any number of times,
"\s* # followed by " (and optional whitespace).
| # Or: do the same thing for single-quoted strings.
\s*\'(?:\\\\.|[^\'])*\'\s*
| # Or:
[^,]* # match anything except commas (i.e. any remaining unquoted strings)
/x',
$subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
But as you can see, this is ugly and hard to maintain. Use the right tool for the job.
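If you do go the regex route anyway, the raw matches still carry the whitespace around each field, and because `[^,]*` can match nothing you also get an empty string at every comma and at the end. A minimal sketch of cleaning that up, assuming a one-line equivalent of the pattern above and a made-up $subject:

```php
<?php
// Hypothetical sample with one single-quoted and one double-quoted field.
$subject = "a, 'b, c', \"d, e\", 7";

// Same three alternatives as the commented pattern, written on one line.
$pattern = '/\s*"(?:\\\\.|[^"])*"\s*|\s*\'(?:\\\\.|[^\'])*\'\s*|[^,]*/';

preg_match_all($pattern, $subject, $result, PREG_PATTERN_ORDER);

// Trim each match, drop the empty ones, then reindex from 0.
// 'strlen' is used as the filter so that a field such as "0" is kept.
$fields = array_values(array_filter(array_map('trim', $result[0]), 'strlen'));

print_r($fields);
// Fields: a | 'b, c' | "d, e" | 7
```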