I am reading a CSV file, but some of the values are not escaped, so PHP reads them incorrectly. Here is an example of a bad line:
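" 635"," ","AUBREY R. PHILLIPS (1920- ) - Pastel depicting cottages in a steep sided river valley, possibly North Wales, signed and dated 2000, framed, 66cm by 48cm. another of a rural landscape, titled verso "Harvest Time, Somerset" signed and dated '87, framed, 69cm by 49cm. (2) NB - Aubrey Phillips is a Worcestershire artist who studied at the Stourbridge School of Art.","40","60","WAT","Paintings, prints and watercolours",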
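You can see that Harvest Time, Somerset has quotes around it, which makes PHP think it is a new value.
When I call print_r() on each row, the broken row ends up looking like this: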
Array
(
[0] => 635
[1] =>
[2] => AUBREY R. PHILLIPS (1920- ) - Pastel depicting cottages in a steep sided river valley, possibly North Wales, signed and dated 2000, framed, 66cm by 48cm. another of a rural landscape, titled verso Harvest Time
[3] => Somerset" signed and dated '87
[4] => framed
[5] => 69cm by 49cm. (2) NB - Aubrey Phillips is a Worcestershire artist who studied at the Stourbridge School of Art."
[6] => 40
[7] => 60
[8] => WAT
[9] => Paintings, prints and watercolours
[10] =>
)
This is obviously wrong, because it now contains more array elements than the other, correct rows.
Here is the PHP I am using:
$i = 1;
if (($file = fopen($this->request->data['file']['tmp_name'], "r")) !== FALSE) {
    while (($row = fgetcsv($file, 0, ',', '"')) !== FALSE) {
        if ($i == 1) {
            $header = $row;
        } else {
            if (count($header) == count($row)) {
                $lots[] = array_combine($header, $row);
            } else {
                $error_rows[] = $row;
            }
        }
        $i++;
    }
    fclose($file);
}
Rows with the wrong number of elements go into $error_rows, and the rest go into the big $lots array.
What can I do to fix this? Thanks.
Answer 0: (score: 1)
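If you know that you will always get entries 0 and 1, and that the last 5 entries in the array are always correct, so that only the descriptive entry is "corrupted" by unescaped enclosure characters, then you can extract the first 2 and the last 5 entries with array_slice(), implode() the remainder back into a single string (restoring the lost quotes), and rebuild the array correctly.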
$testData = '" 635"," ","AUBREY R. PHILLIPS (1920- ) - Pastel depicting cottages in a steep sided river valley, possibly North Wales, signed and dated 2000, framed, 66cm by 48cm. another of a rural landscape, titled verso "Harvest Time, Somerset" signed and dated \'87, framed, 69cm by 49cm. (2) NB - Aubrey Phillips is a Worcestershire artist who studied at the Stourbridge School of Art.","40","60","WAT","Paintings, prints and watercolours",';
$result = str_getcsv($testData, ',', '"');
$hdr = array_slice($result,0,2);
$bdy = array_slice($result,2,-5);
$bdy = trim(implode('"',$bdy),'"');
$ftr = array_slice($result,-5);
$fixedResult = array_merge($hdr,array($bdy),$ftr);
var_dump($fixedResult);
The result is:
array
0 => string ' 635' (length=4)
1 => string ' ' (length=1)
2 => string 'AUBREY R. PHILLIPS (1920- ) - Pastel depicting cottages in a steep sided river valley, possibly North Wales, signed and dated 2000, framed, 66cm by 48cm. another of a rural landscape, titled verso Harvest Time" Somerset" signed and dated '87" framed" 69cm by 49cm. (2) NB - Aubrey Phillips is a Worcestershire artist who studied at the Stourbridge School of Art.' (length=362)
3 => string '40' (length=2)
4 => string '60' (length=2)
5 => string 'WAT' (length=3)
6 => string 'Paintings, prints and watercolours' (length=34)
7 => string '' (length=0)
Not perfect, but possibly good enough.
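The same slice-and-rejoin idea can be wrapped in a small helper so it can be applied to every row that lands in $error_rows. This is only a sketch: repairRow(), $headCount and $tailCount are names made up for illustration, and it assumes exactly one field in the middle is broken and that $tailCount is greater than zero.

```php
<?php
// Hypothetical helper: rebuild one broken field from the pieces it was split into.
// Assumes the first $headCount and last $tailCount entries parsed correctly,
// and that $tailCount > 0 (array_slice($row, -0) would return the whole row).
function repairRow(array $row, int $headCount, int $tailCount): array
{
    $head   = array_slice($row, 0, $headCount);
    $tail   = array_slice($row, -$tailCount);
    $middle = array_slice($row, $headCount, -$tailCount);

    // Glue the pieces back together with the enclosure character,
    // then drop any stray quote left at either end.
    $body = trim(implode('"', $middle), '"');

    return array_merge($head, [$body], $tail);
}

// Shortened, made-up row in the same shape as the broken one above.
$broken = ['635', '', 'titled verso Harvest Time', ' Somerset" signed', ' framed."', '40', '60'];
$fixed  = repairRow($broken, 2, 2);
print_r($fixed);
```

As with the original snippet, the commas inside the description come back as quote characters, so the text is not restored exactly; only the number of columns is.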
The alternative is to get whoever is generating the CSV to escape their enclosures properly.
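If the file is produced by PHP, fputcsv() already handles this by doubling any enclosure character inside a field. The following is a minimal sketch with a made-up row, assuming the producer can be changed at all; it writes to an in-memory stream and reads the line back with str_getcsv().

```php
<?php
// Hypothetical row: the description contains both quotes and commas.
$row = ['635', 'titled verso "Harvest Time, Somerset" signed and dated \'87', '40'];

// Write the row to an in-memory stream; fputcsv() doubles each embedded quote.
$stream = fopen('php://memory', 'w+');
fputcsv($stream, $row, ',', '"', '\\');
rewind($stream);
$line = stream_get_contents($stream);
fclose($stream);

echo $line;

// Reading the line back yields the original three values.
$parsed = str_getcsv(rtrim($line, "\n"), ',', '"', '\\');
print_r($parsed);
```

A file written this way should go through the fgetcsv() loop from the question without ending up in $error_rows.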
Answer 1: (score: 1)
If you can escape the " in the text like this: \"
then pass the escape character to fgetcsv explicitly:
fgetcsv($file, 0, ',', '"', '\\');
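Here is a small sketch of what that looks like on a single line, using str_getcsv() and a made-up sample in which the inner quotes are already backslash-escaped. One thing to be aware of: as far as I know, PHP keeps the backslash in the parsed value, so the sketch strips it afterwards.

```php
<?php
// Hypothetical line in which the producer escaped the inner quotes as \".
// (Inside a single-quoted PHP string, \" is a literal backslash plus quote.)
$line = '"635","titled verso \"Harvest Time, Somerset\" signed","40"';

// The last argument names the backslash as the escape character.
$fields = str_getcsv($line, ',', '"', '\\');

// Remove the escape character that is left in front of each quote.
$cleaned = array_map(
    fn ($field) => str_replace('\\"', '"', (string) $field),
    $fields
);
print_r($cleaned);
```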
Answer 2: (score: 0)
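This is a long shot, so don't take me too seriously.
I see a pattern in the text: every ',' that you want to ignore has a space after it. Search for ', ' and replace it with 'FUU' or something else unique.
Now parse the CSV file. It will probably get the format right. You then only need to replace 'FUU' back with ', '.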
:)
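A rough sketch of that idea, assuming the placeholder never occurs in the real data and that genuine field separators are never followed by a space. PLACEHOLDER and parseWithPlaceholder() are made-up names, and the sample line is a shortened version of the one in the question.

```php
<?php
// Hypothetical placeholder; it must never occur in the real data.
const PLACEHOLDER = '@@FUU@@';

function parseWithPlaceholder(string $line): array
{
    // Hide every comma that is followed by a space; those sit inside prose.
    $masked = str_replace(', ', PLACEHOLDER, $line);

    $fields = str_getcsv($masked, ',', '"', '\\');

    // Put the hidden commas back into each parsed field.
    return array_map(
        fn ($field) => str_replace(PLACEHOLDER, ', ', (string) $field),
        $fields
    );
}

$line   = '"1","titled "Harvest Time, Somerset" signed, framed.","40"';
$fields = parseWithPlaceholder($line);
print_r($fields);
```

This gets the column count right, but the unescaped quote characters inside the description may still come out mangled, so check a few repaired rows by eye.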
Answer 3: (score: 0)
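You are probably reading the contents of the CSV file as an array of lines and then splitting each line on the comma. This fails because some fields also contain commas. One trick that might help is to look for "," instead, which indicates a field separator and is unlikely (but not impossible) to occur inside a field.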
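<?php
$csv = file_get_contents("yourfile.csv");
// split() was removed in PHP 7; explode() does the same job for a literal separator
$lines = explode("\r\n", $csv);
echo "<pre>";
foreach ($lines as $line)
{
    // Mark each "," field boundary with a placeholder, then split on it
    $line = str_replace("\",\"", "\"@@@\"", $line);
    $fields = explode("@@@", $line);
    print_r($fields);
}
echo "</pre>";
?>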
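A variation on the same trick that also removes the surrounding quotes might look like the sketch below. It assumes every field is wrapped in double quotes and that a trailing comma means one extra empty field, as in the sample line from the question; splitQuotedLine() is a made-up name.

```php
<?php
// Hypothetical helper: split a fully quoted CSV line on the "," boundary.
function splitQuotedLine(string $line): array
{
    $line = rtrim($line, "\r\n");

    // A trailing comma stands for one extra, empty field at the end.
    $trailingEmpty = substr($line, -1) === ',';
    if ($trailingEmpty) {
        $line = substr($line, 0, -1);
    }

    // Drop the opening quote of the first field and the closing quote of the last.
    if (strlen($line) >= 2 && $line[0] === '"' && substr($line, -1) === '"') {
        $line = substr($line, 1, -1);
    }

    $fields = explode('","', $line);
    if ($trailingEmpty) {
        $fields[] = '';
    }
    return $fields;
}

$line   = '"1","titled "Harvest Time, Somerset" signed, framed.","40",';
$fields = splitQuotedLine($line);
print_r($fields);
```

Because nothing is parsed as CSV here, the inner quotes and commas survive untouched, but a description that itself contains "," would still be split in the wrong place.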
Answer 4: (score: 0)
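$csv = explode(' ', $csv);
foreach ($csv as $k => $v) {
    // Skip empty tokens (e.g. from double spaces) so $v[0] is always defined
    if ($v !== '' && $v[0] == '"' && substr($v, -1) == '"') {
        // Swap the straight quotes around a single word for curly quotes
        $csv[$k] = mb_convert_encoding('“' . substr($v, 1, -1) . '”', 'UTF-8', 'HTML-ENTITIES');
    }
}
$csv = implode(' ', $csv);
$csv = str_getcsv($csv);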