Split a string on a delimiter that is not inside specific characters

Time: 2012-03-30 17:51:57

Tags: php regex

I have a string in the following format:

,"value","value2","3",("this is, a test"), "3"

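How can I split on the commas, but only those that are not inside parentheses?

Edit: sorry, a slight correction. Inside the parentheses the format is actually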

 ,"value","value2","3",(THIS IS THE FORMAT "AND QUOTES, INSIDE"), "3"

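5 answers:

Answer 0: (score: 2)

The quotes are already enough to protect the commas, so you don't need the parens at all. If you take the parens out, str_getcsv() handles the string fine. If you have no control over the source, you can strip them yourself: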

$str = str_replace('",("', '","', $str);
$str = str_replace('"), "', '", "', $str);
print_r(str_getcsv($str));
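
As a rough, self-contained sketch of the steps above, here is the same approach applied to the sample string from the question. The variable `$fields` and the explicitly passed delimiter, enclosure and escape arguments are my own additions, not part of the answer:

```php
<?php
// Sample input in the original format from the question.
$str = ',"value","value2","3",("this is, a test"), "3"';

// Strip the parentheses that wrap the quoted field.
$str = str_replace('",("', '","', $str);
$str = str_replace('"), "', '", "', $str);

// Pass delimiter, enclosure and escape explicitly; recent PHP versions
// complain when the escape argument is left to its default.
$fields = str_getcsv($str, ',', '"', '\\');

print_r($fields);
```

Note that the leading comma in the input produces an empty first field, which you may want to drop afterwards.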

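Edit for the updated question:

As long as there are no unescaped parens in the data, you can still do it. Just convert the closing parens to opening parens (str_getcsv() only accepts a single character as the enclosure), then use the opening paren as the enclosure character: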

$str = str_replace(')', '(', $str);
print_r(str_getcsv($str, ',', '('));

Result:

Array
(
    [0] =>  
    [1] => "value"
    [2] => "value2"
    [3] => "3"
    [4] => THIS IS THE FORMAT "AND QUOTES, INSIDE"
    [5] =>  "3"
)

Answer 1: (score: 2)

The solutions above work fine, but here is another one:

preg_match_all('@(,)?("|(\())(.+?)((?(3)\)|"))(,)?@',$str,$arr);

Its output is

Array
(
[0] => Array
    (
        [0] => ,"value",
        [1] => "value2",
        [2] => "3",
        [3] => ("this is, a test"),
        [4] => "3"
    )

[1] => Array
    (
        [0] => ,
        [1] => 
        [2] => 
        [3] => 
        [4] => 
    )

[2] => Array
    (
        [0] => "
        [1] => "
        [2] => "
        [3] => (
        [4] => "
    )

[3] => Array
    (
        [0] => 
        [1] => 
        [2] => 
        [3] => (
        [4] => 
    )

[4] => Array
    (
        [0] => value
        [1] => value2
        [2] => 3
        [3] => "this is, a test"
        [4] => 3
    )

[5] => Array
    (
        [0] => "
        [1] => "
        [2] => "
        [3] => )
        [4] => "
    )

[6] => Array
    (
        [0] => ,
        [1] => ,
        [2] => ,
        [3] => ,
        [4] => 
    )

)

So $arr[4] contains the matched values.
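
To make that concrete, here is a minimal sketch that runs the pattern against the original sample string from the question and pulls out group 4. The variable names `$pattern` and `$values` are mine, chosen only for illustration:

```php
<?php
// Sample input in the original format from the question.
$str = ',"value","value2","3",("this is, a test"), "3"';

// Group 3 is set only when the field opened with "(", and the
// conditional (?(3)\)|") then requires the matching closer.
$pattern = '@(,)?("|(\())(.+?)((?(3)\)|"))(,)?@';
preg_match_all($pattern, $str, $arr);

// Group 4 is the text between the opening and closing delimiter.
$values = $arr[4];
print_r($values);
```

The parenthesised field keeps its inner quotes, so strip them separately if you only want the bare text.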

Answer 2: (score: 2)

Consider the following code:

$str = ',"value","value2","3",(THIS IS THE FORMAT \) "AND QUOTES, INSIDE"), "3"';
$regex = '#(\(.*?(?<!\\\)\))\s*,|,#';
$arr = preg_split( $regex, $str, 0, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY );
print_r($arr);

Output:

Array
(
    [0] => "value"
    [1] => "value2"
    [2] => "3"
    [3] => (THIS IS THE FORMAT \) "AND QUOTES, INSIDE")
    [4] =>  "3"
)

Answer 3: (score: 2)

Here is a simple tokenizer that splits the input into quoted strings and other single characters:

preg_match_all('/"(?:[^\\\\"]|\\\\.)*"|[^"]/', $input, $tokens);

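If you want to parse the input, just iterate over the tokens and perform whatever syntax checks you need. You can recognize a string token by the quotes at its start and end.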
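
One possible way to iterate over the tokens is sketched below. The helper `splitTopLevel()` and its depth counter are hypothetical, not something the answer provides. It assumes that a comma only counts as a separator when it is outside both a quoted string and any parentheses:

```php
<?php
// Hypothetical helper: split on commas that are outside both
// double-quoted strings and parentheses.
function splitTopLevel(string $input): array
{
    // Each token is either a whole quoted string or one other character.
    // An unterminated quote matches neither branch and is skipped.
    preg_match_all('/"(?:[^\\\\"]|\\\\.)*"|[^"]/', $input, $m);

    $fields = [];
    $current = '';
    $depth = 0; // current parenthesis nesting level

    foreach ($m[0] as $token) {
        if ($token === '(') {
            $depth++;
        } elseif ($token === ')' && $depth > 0) {
            $depth--;
        }

        if ($token === ',' && $depth === 0) {
            // A comma outside quotes and parentheses ends the field.
            $fields[] = $current;
            $current = '';
        } else {
            $current .= $token;
        }
    }
    $fields[] = $current;

    return $fields;
}

$str = ',"value","value2","3",(THIS IS THE FORMAT "AND QUOTES, INSIDE"), "3"';
print_r(splitTopLevel($str));
```

Because a quoted string arrives as a single token, a comma or parenthesis inside quotes never reaches the separator check. The fields come back untrimmed and with their quotes intact.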

Answer 4: (score: 1)

preg_match("/,?\"(.*?)\",?/", $myString, $result);

You can check the regex here.

Edit: the only solution for escaped quotes that I can think of quickly is to replace them and add them back later:

preg_match("/,?\"(.*?)\",?/", str_replace('\"', "'", $myString), $result);
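
A slightly fuller sketch of the replace-and-restore idea follows. The sample string, the NUL-byte placeholder (used here instead of the single quote above, on the assumption that it never occurs in real input) and the switch to preg_match_all() are my own choices, made so that every field is collected and the escaped quotes can be put back unambiguously:

```php
<?php
// Hypothetical sample with an escaped quote inside the first field.
$myString = ',"va\"lue","value2","3"';

// Placeholder assumed never to occur in real input.
$placeholder = "\x00";

// 1. Hide escaped quotes so the lazy (.*?) cannot stop at them.
$masked = str_replace('\"', $placeholder, $myString);

// 2. preg_match_all collects every field; preg_match stops at the first.
preg_match_all('/,?"(.*?)",?/', $masked, $result);

// 3. Put the escaped quotes back into the captured values.
$values = str_replace($placeholder, '\"', $result[1]);
print_r($values);
```

This only deals with escaped quotes; the parenthesised field from the question still needs one of the approaches from the other answers.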