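Regex to replace double quotes while leaving nested quotes unchanged

Date: 2011-11-10 10:30:58

Tags: php regex

I want to replace each double quote that delimits a field in a CSV record with some other characters, say @#@, while leaving the inner double quotes unchanged.

For example, consider the following record: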

123453,"The NFL is promoting the importance of annual mammogram screenings for women over 40 in the prevention of breast cancer through their "A Crucial Catch" campaign.","Pittsburgh Steelers","NFL"

In this record I want to replace the double quotes at the start and end of each field with @#@, so that it becomes:

123453,@#@The NFL is promoting the importance of annual mammogram screenings for women over 40 in the prevention of breast cancer through their "A Crucial Catch" campaign.@#@,@#@Pittsburgh Steelers@#@,@#@NFL@#@

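Note that "A Crucial Catch" is unchanged, because it sits inside a field that is already enclosed in double quotes.

3 answers:

Answer 0 (score: 1)

I recently upvoted the comment saying you should accept answers on your earlier questions that have good answers (I saw a couple there)... but here is a possible solution: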

<?php

$orig = '123453,"The NFL is promoting the importance of annual mammogram screenings for women over 40 in the prevention of breast cancer through their "A Crucial Catch" campaign.","Pittsburgh Steelers","NFL"';

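// Split the record on every comma, including any comma inside a quoted field.
$cols = explode(',', $orig);

// Replace a double quote only at the very start or very end of a piece.
function replace_end_quotes($val) {
    return preg_replace('#(^"|"$)#', "@#@", $val);
}

echo implode(",", array_map("replace_end_quotes", $cols));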

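As noted in @socha23's comment, my solution will not work if one of the fields contains a comma. However, if the line above were actually formatted as valid CSV data, using something like str_getcsv instead of explode would work.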
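The str_getcsv suggestion could be sketched roughly as follows. This is only an illustration under stated assumptions: the input is valid CSV (inner quotes doubled as `""`), every non-numeric field was quoted in the input, and `wrap_fields` is a hypothetical helper name that does not appear in the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): parse one valid CSV
// line with str_getcsv() and wrap every non-numeric field in a marker.
function wrap_fields(string $line, string $marker = '@#@'): string
{
    // Delimiter, enclosure and escape character are passed explicitly.
    $fields = str_getcsv($line, ',', '"', '\\');

    $wrapped = array_map(function ($field) use ($marker) {
        // Assumption: purely numeric fields were unquoted in the input.
        return ctype_digit((string) $field) ? $field : $marker . $field . $marker;
    }, $fields);

    return implode(',', $wrapped);
}

// Valid CSV: the inner quotes are doubled and one field contains a comma.
$line = '123453,"The NFL is promoting ""A Crucial Catch"", their campaign.","Pittsburgh Steelers","NFL"';
echo wrap_fields($line), "\n";
// 123453,@#@The NFL is promoting "A Crucial Catch", their campaign.@#@,@#@Pittsburgh Steelers@#@,@#@NFL@#@
```

Because the parser handles the quoting, a comma inside a quoted field no longer splits it. The trade-off is that str_getcsv does not report which fields were quoted, hence the numeric-field assumption.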

Answer 1 (score: 0)

Why not loop through the file and build a string that reconstructs it?

It's not very efficient, but you could try...

$out = "";

if (($handle = fopen("test.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $arr = array();
        for ($i = 0; $i < count($data); $i++) {
            if (!ctype_digit($data[$i])) {
                $data[$i] = '@#@' . $data[$i] . '@#@';
            }
            $arr[] = $data[$i];
        }
        // Join the fields with commas again so the line stays a CSV record.
        $out .= implode(",", $arr) . "\n";
    }
    fclose($handle);
}

// Write $out to file or whatever
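If accumulating everything in `$out` is a concern, one possible variation is to write each rebuilt line straight to an output handle. The sketch below assumes well-formed CSV input and uses in-memory streams for the demo; `rewrite_csv` is a hypothetical name, not something from the answer above.

```php
<?php
// Hypothetical helper: copy CSV rows from $in to $out, wrapping every
// non-numeric field in a marker, one row at a time.
function rewrite_csv($in, $out, string $marker = '@#@'): int
{
    $rows = 0;
    // A length of 0 means "no line length limit".
    while (($data = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if ($data === [null]) {
            continue; // fgetcsv() returns [null] for a blank line
        }
        foreach ($data as $i => $field) {
            if (!ctype_digit((string) $field)) {
                $data[$i] = $marker . $field . $marker;
            }
        }
        fwrite($out, implode(',', $data) . "\n");
        $rows++;
    }
    return $rows;
}

// Demo with in-memory streams instead of real files.
$in = fopen('php://memory', 'w+');
fwrite($in, "1,\"Pittsburgh Steelers\",\"NFL\"\n2,\"Green Bay Packers\",\"NFL\"\n");
rewind($in);

$out = fopen('php://memory', 'w+');
rewrite_csv($in, $out);
rewind($out);
echo stream_get_contents($out);

fclose($in);
fclose($out);
```

For real files, the two handles would come from `fopen()` on the source and destination paths.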

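Answer 2 (score: 0)

You can search for

"(?=,|$)|(?<=^|,)"

and replace it with @#@. This regex matches a quote that is either followed by a comma or the end of the string, or preceded by a comma or the start of the string.

So, in PHP: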

$result = preg_replace('/"(?=,|$)|(?<=^|,)"/', '@#@', $subject);

This changes

123453,"The NFL is promoting "A Crucial Catch".","Pittsburgh Steelers","NFL"

to

123453,@#@The NFL is promoting "A Crucial Catch".@#@,@#@Pittsburgh Steelers@#@,@#@NFL@#@
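As a quick check, the regex can be run against the full record from the question. The wrapper name `replace_field_quotes` and the second test string are illustrative assumptions, not part of the answer. The second string shows a case the pattern cannot distinguish: an inner quote that happens to sit next to a comma.

```php
<?php
// Hypothetical wrapper around the regex from this answer.
function replace_field_quotes(string $subject, string $marker = '@#@'): string
{
    // A quote followed by a comma or end of string,
    // or a quote preceded by a comma or start of string.
    return preg_replace('/"(?=,|$)|(?<=^|,)"/', $marker, $subject);
}

$record = '123453,"The NFL is promoting the importance of annual mammogram screenings '
        . 'for women over 40 in the prevention of breast cancer through their '
        . '"A Crucial Catch" campaign.","Pittsburgh Steelers","NFL"';
echo replace_field_quotes($record), "\n";

// Limitation: the inner quote after "yes" is followed by a comma,
// so it is replaced as if it closed the field.
$tricky = '1,"He said "yes", then left","NFL"';
echo replace_field_quotes($tricky), "\n";
// 1,@#@He said "yes@#@, then left@#@,@#@NFL@#@
```

So the pattern is reliable only when no inner quote is directly adjacent to a comma or to the start or end of the line.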