This is the current format of the CSV file I am parsing:
"Street","City","Country"
"House # 3, Street "23, H, Block". Building 32", "CityName", "Country"
Here you can see that 23, H, Block is surrounded by double quotes and contains commas. I parse this file with the code below:
// Passing '"' as the escape character (last parameter) keeps a backslash
// in a field from breaking the data.
// Looping on the return value avoids calling count() on false at end of file.
while (($row = fgetcsv($file, null, ",", '"', '"')) !== false) {
    // fgetcsv() returns [null] for a blank line; skip those rows
    if ($row !== [null]) {
        // $rows is assumed to be defined earlier (not shown here)
        $sorted[] = array_merge($rows, $row);
    }
}
The parser splits 23, H and Block into separate elements, although they should be a single one.
This is what happens:
array:2 [▼
0 => array:3 [▼
0 => "Street"
1 => "City"
2 => "Country"
]
1 => array:5 [▼
0 => "House # 3, Street 23"
1 => " H"
2 => " Block". Building 32""
3 => "CityName"
4 => "Country"
]
]
whereas I want this:
array:2 [▼
0 => array:3 [▼
0 => "Street"
1 => "City"
2 => "Country"
]
1 => array:3 [▼
0 => "House # 3, Street 23, H, Block. Building 32"
1 => "CityName"
2 => "Country"
]
]
It would be helpful if I could use some regex pattern to remove the unwanted quotes from the whole CSV file.
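Answer 0: (score: 1)
I believe you should focus on how to split each line into tokens correctly, rather than on removing the unwanted double-quote characters from the line.
The field separator has the form "," or ", ", so the regex for splitting the line would be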
(?<="),\s*(?=")
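See the DEMO.
Regex explanation: (?<=") is a lookbehind requiring a double quote immediately before the comma, ,\s* matches the comma plus any following whitespace, and (?=") is a lookahead requiring a double quote immediately after. The quotes themselves are not consumed, so they stay attached to the tokens.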
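As a rough illustration of how that pattern could be applied in PHP, here is a minimal sketch using preg_split. The helper name splitQuotedLine is hypothetical (it is not from the original post), and the sketch assumes every field is wrapped in double quotes and that all double quotes can be dropped from the final values, which is what the desired output in the question suggests.

```php
<?php
// Hypothetical helper (not from the original post): splits one raw CSV line
// on commas that sit between a closing and an opening double quote.
function splitQuotedLine(string $line): array
{
    // The lookbehind/lookahead keep the quotes out of the delimiter match,
    // so commas inside a field (not flanked by quotes) are left alone.
    $tokens = preg_split('/(?<="),\s*(?=")/', rtrim($line, "\r\n"));

    // Assumption: every double quote is unwanted in the final value,
    // matching the desired output shown in the question.
    return array_map(
        static fn(string $token): string => str_replace('"', '', $token),
        $tokens
    );
}

$line = '"House # 3, Street "23, H, Block". Building 32", "CityName", "Country"';
$fields = splitQuotedLine($line);
print_r($fields);
```

To process a whole file, each line could be read with fgets() and passed to this helper instead of calling fgetcsv(). Note that this approach would mis-split a field that legitimately contains the sequence ", " followed by a quote, and it does not handle unquoted fields, so it only fits files shaped like the sample above.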