PHP str_getcsv breaks up a tab-delimited list with no enclosure and a single double quote

Time: 2013-11-16 18:34:12

Tags: php csv nosql tsv handlersocket

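I am using str_getcsv to parse tab-separated values returned from a NoSQL query, but I have run into a problem, and the only solution I have found makes no logical sense.

Here is some sample code to demonstrate (FYI, the tabs do not seem to be preserved when displayed here)...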

$data = '0  16  Gruesome Public Executions In North Korea - 80 Killed       http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata        "North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou...    1384357511  http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw   0   The Young Turks                 1   2013-11-13 12:53:31 9ab8f5607183ed258f4f98bb80f947b4    35afc4001e1a50fb463dac32de1d19e7';

$data = str_getcsv($data,"\t",NULL);

echo '<pre>'.print_r($data,TRUE).'</pre>';

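Note in particular that one column (the one beginning with "North Korea...) actually starts with a double quote " but never closes it. That is why I pass NULL as the third parameter (enclosure): to override the default " enclosure value.

Here is the result: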

Array
(
[0] => 0
[1] => 16
[2] => Gruesome Public Executions In North Korea - 80 Killed
[3] => http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata
[4] => 
[5] => North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou...  1384357511  http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw   0   The Young Turks                 1   2013-11-13 12:53:31 9ab8f5607183ed258f4f98bb80f947b4    35afc4001e1a50fb463dac32de1d19e7
)

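As you can see, the quote breaks the function. Logically, I thought I could pass NULL or an empty string '' as the third parameter (enclosure) of str_getcsv, but neither works.

The only enclosure that makes str_getcsv behave correctly is the space character ' '. That makes no sense to me, because none of the columns begin or end with a space.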

$data = '0  16  Gruesome Public Executions In North Korea - 80 Killed       http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata        "North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou...    1384357511  http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw   0   The Young Turks                 1   2013-11-13 12:53:31 9ab8f5607183ed258f4f98bb80f947b4    35afc4001e1a50fb463dac32de1d19e7';

$data = str_getcsv($data,"\t",' ');

echo '<pre>'.print_r($data,TRUE).'</pre>';

The result is now:

Array
(
[0] => 0
[1] => 16
[2] => Gruesome Public Executions In North Korea - 80 Killed
[3] => http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata
[4] => 
[5] => "North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou...
[6] => 1384357511
[7] => http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw
[8] => 0
[9] => The Young Turks
[10] => 
[11] => 
[12] => 
[13] => 
[14] => 1
[15] => 2013-11-13 12:53:31
[16] => 9ab8f5607183ed258f4f98bb80f947b4
[17] => 35afc4001e1a50fb463dac32de1d19e7
)

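So my question is: why does it work with a space as the enclosure, but not with NULL or an empty string? And are there any repercussions to doing this?

Update 1: This seems to have reduced the number of errors I get in my logs, but it has not eliminated them, so I am guessing that the space I used as the enclosure has caused unintended side effects, though they are less troublesome than the previous problem. My question remains the same: why can't I use NULL or an empty string as the enclosure, and second, is there a better way to handle this?

2 answers: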

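Answer 0: (score: 3)

Just to give you a starting point...

In your case, you might want to consider working with the string itself rather than using a function like str_getcsv.

But be aware that if you go this route (which may well be your only option), there are at least a few pitfalls:

  • Handling escaped characters
  • Line breaks inside the data (not as a delimiter)

If you know that your strings contain no tabs other than the field separators, and no line breaks other than the ones that separate rows, you should be fine: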

$data = explode("\n", $the_whole_csv_string_block);

foreach ($data as $line)
{
    $arr = explode("\t", $line);

    // $arr[0] will have every first field of every row, $arr[1] the 2nd, ...
    // Usually this is what I want when working with a csv file

    // But if you rather want a multidimensional array, you can simply add 
    // $arr to a different array and after this loop you are good to go.
}

Otherwise, treat this only as a starting point and adapt it to your own situation. Hope it helps.
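The loop above can be turned into a small self-contained sketch that collects every row into a multidimensional array. The function name `parse_tsv_block` and the sample data are hypothetical, chosen only for illustration, and the sketch assumes the same thing as the text above: tabs appear only between fields and line breaks only between rows.

```php
<?php
// Hypothetical helper (not part of the original answer): split a
// tab-separated block into rows of fields without any quote handling.
function parse_tsv_block(string $block): array
{
    $rows = [];
    // Split on \r\n, \n or \r so Windows line endings leave no stray "\r".
    foreach (preg_split('/\r\n|\n|\r/', $block) as $line) {
        if ($line === '') {
            continue; // skip blank lines, e.g. after a trailing newline
        }
        // explode() treats a double quote as ordinary data.
        $rows[] = explode("\t", $line);
    }
    return $rows;
}

// Made-up sample: the third field of the first row opens with an
// unclosed double quote, like the data in the question.
$block = "0\t16\t\"North Korea staged...\t1384357511\n"
       . "1\t17\tSecond title\t1384357512\n";

$rows = parse_tsv_block($block);
print_r($rows);
```

Because explode() has no notion of enclosures or escapes, an unmatched quote cannot swallow the rest of the line. The trade-off is that fields which legitimately contain tabs or line breaks cannot be represented at all.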

Answer 1: (score: 1)

Just use chr(0) as both the enclosure and the escape:

$data = str_getcsv($data, "\t", chr(0), chr(0));
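As a quick check of this approach, here is a minimal sketch on a shortened, made-up line whose third field opens with an unclosed double quote. It assumes the data never contains a NUL byte, since that byte is now the enclosure and escape character.

```php
<?php
// Illustrative line (not the full row from the question): the third
// field starts with a double quote that is never closed.
$line = "0\t16\t\"North Korea staged...\t1384357511\tThe Young Turks";

// NUL byte as both enclosure and escape, so the double quote is no
// longer special. Assumption: no field ever contains a NUL byte.
$fields = str_getcsv($line, "\t", chr(0), chr(0));

print_r($fields);
```

With the double quote no longer acting as the enclosure, the line should split on every tab and the quote stays in the field as literal data.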