I want to read a CSV file and parse it into an array, but it always fails

Time: 2017-07-14 04:48:40

Tags: php arrays csv

Link to my CSV file: https://drive.google.com/file/d/0B-Z58iD3By5wb2R2TnV0Rjc3Zzg/view

I have read a lot of references, but I cannot split my CSV on "," (the delimiter does not work properly). Is there a way to get an array like this from the CSV file:

`Array[0] =>
(
    ['username'] => Lexsa,
    ['date'] => 12/07/2017,
    ['retweet'] => null,
)`

`Array[1] =>
(
    ['username'] => any,
    ['date'] => 12/07/2017,
    ['retweet'] => null
)`



function csv_to_array($filename='', $delimiter=',')
{
    // Bail out early if the file is missing or unreadable.
    if (!file_exists($filename) || !is_readable($filename))
        return FALSE;

    $header = NULL;
    $data = array();
    if (($handle = fopen($filename, 'r')) !== FALSE)
    {
        while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE)
        {
            // The first row supplies the keys for every following row.
            if (!$header)
                $header = $row;
            else
                $data[] = array_combine($header, $row);
        }
        fclose($handle);
    }
    return $data;
}

I have tried many references, but the result is always like this; the code does not split the row on ",":

  

Array ( [0] => Array ( ["username","date","retweets","favorites","text","geo","mentions","hashtags","id","permalink"] => "Lexsa911","01/12/2016 0:05",0.0,0.0,"Kecelakaan - Kecelakaan Pesawat yang Melibatkan Klub-Klub Sepakbola http:// ht.ly/1IdL306EzDH",,,,"8,04E+17","https://twitter.com/Lexsa911/status/804008435020865536" )

2 answers:

Answer 0: (score: 2)

This is what I get when I open your tes.csv with less or gedit:

"""username"",""date"",""retweets"",""favorites"",""text"",""geo"",""mentions"",""hashtags"",""id"",""permalink"""
"""Lexsa911"",""01/12/2016 0:05"",0.0,0.0,""Kecelakaan - Kecelakaan Pesawat yang Melibatkan Klub-Klub Sepakbola http:// ht.ly/1IdL306EzDH"",,,,""8,04E+17"",""https://twitter.com/Lexsa911/status/804008435020865536"""
"""Widya_Davy"",""01/12/2016 0:05"",0.0,0.0,""Kecelakaan - Kecelakaan Pesawat yang Melibatkan Klub-Klub Sepakbola http:// ow.ly/h1Eh306EzHk"",,,,""8,04E+17"",""https://twitter.com/Widya_Davy/status/804008434588876803"""
"""redaksi18"",""01/12/2016 0:05"",0.0,0.0,""Klub Brasil Korban Kecelakaan Pesawat Didaulat Jadi Juara http:// beritanusa.com/index.php?opti on=com_content&view=article&id=39769:klub-brasil-korban-kecelakaan-pesawat-didaulat-jadi-juara&catid=43:liga-lain&Itemid=112 … pic.twitter.com/1K7OlZSX83"",,,,""8,04E+17"",""https://twitter.com/redaksi18/status/804008416188338176"""
"""JustinBiermen"",""01/12/2016 0:06"",0.0,0.0,""Video LUCU Kecelakaan Yg Sangat Koplak http://www. youtube.com/watch?v=pQFOY7 AdXck …"",,,,""8,04E+17"",""https://twitter.com/JustinBiermen/status/804008714738880512"""

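So the problem is not the delimiter but the enclosure. As you can see, every line is wrapped in quotes as a whole, so the parser treats the entire line as a single column.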
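To see this effect in isolation, here is a minimal sketch. The sample string is a shortened, made-up line in the same format as tes.csv; the only library call is PHP's built-in str_getcsv(), with the delimiter, enclosure and escape arguments passed explicitly.

```php
<?php
// Shortened sample in the same format as the file: the whole line is wrapped
// in one pair of quotes, and every inner quote is doubled.
$line = '"""username"",""date"",""retweets"""';

// Parse it the usual way: comma delimiter, double-quote enclosure.
$fields = str_getcsv($line, ',', '"', '\\');

// The parser sees a single quoted field that happens to contain commas.
var_dump(count($fields));
echo $fields[0], PHP_EOL;
```

The count is 1 rather than 3, and the only field is the inner line with its doubled quotes collapsed, which is exactly the string that ends up as the array key in the question.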

I suggest fixing the CSV, for example by removing quotes until a line looks like this:

"username","date","retweets","favorites","text","geo","mentions","hashtags","id","permalink"

If that is not possible for some reason, preprocess the CSV to clean it up:

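print_r(
    array_map(
        function ($line) {
            // Collapse the tripled quotes at the line ends and the doubled
            // quotes around each field into single quotes.
            $single_quoted_line = str_replace(['"""', '""'], '"', $line);
            // The cleaned line is ordinary CSV and can be parsed as usual.
            return str_getcsv($single_quoted_line);
        },
        // Read the file into an array of lines, without trailing newlines.
        file("tes.csv", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
    )
);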
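Building on that cleanup, one possible way to get the associative rows the question asks for is to treat the first cleaned line as the header and combine it with each following row. This is only a sketch: `quoted_lines_to_assoc()` is a hypothetical helper name, the sample lines are shortened stand-ins for tes.csv, and the approach assumes that no field contains a genuine doubled quote and that every line ends with a quoted field, as in the file shown above.

```php
<?php
// Hypothetical helper: turn over-quoted CSV lines into associative rows.
// Assumes no field contains a genuine doubled quote, because the cleanup
// collapses every run of two or three quotes into one.
function quoted_lines_to_assoc(array $lines): array
{
    $header = null;
    $rows = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        // Same cleanup as in the answer, then a normal CSV parse.
        $clean = str_replace(['"""', '""'], '"', $line);
        $fields = str_getcsv($clean, ',', '"', '\\');
        if ($header === null) {
            $header = $fields; // first line holds the column names
        } elseif (count($fields) === count($header)) {
            $rows[] = array_combine($header, $fields);
        }
        // Rows whose field count differs from the header are skipped.
    }
    return $rows;
}

// Usage with two shortened lines in the same format as tes.csv.
$rows = quoted_lines_to_assoc([
    '"""username"",""date"",""retweets"",""id"""',
    '"""Lexsa911"",""01/12/2016 0:05"",0.0,""8,04E+17"""',
]);
print_r($rows);
```

With a real file, the lines could come from `file("tes.csv")` as in the answer's snippet.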

Answer 1: (score: 1)

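Your CSV is formatted with two " characters on each side of every field, plus one more " character at the start and end of each line. As a result, each line of the CSV is read in as a single string. That is why your result is an associative array whose key is the entire header as one string, and whose value is likewise the entire row as a single string.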

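Try reformatting the CSV so that each field is enclosed in a single " character on each side, and remove the extra " characters from the start and end of each line. Your code should then produce the expected result.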

If you have no control over the CSV's format, you will need to sanitize the data before parsing it with fgetcsv().
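One way to approach that sanitizing without string replacement is to parse each line twice: fgetcsv() removes the outer quotes and un-doubles the inner ones, and a second pass with str_getcsv() splits the real fields. The following is a sketch under the assumption that every line of the file is wrapped this way. `double_encoded_csv_to_array()` is a hypothetical name, and the in-memory stream with shortened sample lines stands in for `fopen('tes.csv', 'r')`.

```php
<?php
// Hypothetical variant of the question's loop for a file in which every
// line was CSV-encoded a second time.
function double_encoded_csv_to_array($handle): array
{
    $header = null;
    $data = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        // fgetcsv() has unwrapped the outer quotes, so $row[0] now holds
        // the real CSV line; parse it a second time to get the fields.
        $fields = str_getcsv($row[0], ',', '"', '\\');
        if ($header === null) {
            $header = $fields; // first line holds the column names
        } elseif (count($fields) === count($header)) {
            $data[] = array_combine($header, $fields);
        }
    }
    return $data;
}

// Usage with an in-memory stream standing in for the real file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, '"""username"",""date"",""id"""' . "\n");
fwrite($handle, '"""Lexsa911"",""01/12/2016 0:05"",""8,04E+17"""' . "\n");
rewind($handle);

$data = double_encoded_csv_to_array($handle);
fclose($handle);
print_r($data);
```

Unlike a plain string replacement, this keeps any quote characters that legitimately belong inside a field, at the cost of assuming the double encoding is applied consistently to every line.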