I am using fgetcsv to parse a CSV file. The file is a CSV export from a Magento installation, but it fails to parse. Here is one of the problematic export lines:
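200000,Samsung Galaxy S2,$399.00,8806085359376,null,Free ground shipping,New,In Stock,Samsung,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh",Mobile > Manufacturer > Samsung,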
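The problem is that the file uses the double quote character (") as shorthand for inch, and in other places as well.
I am looking for a RegEx to use with preg_replace that matches every double quote which is neither preceded nor followed by a comma. However, my RegEx knowledge is poor and I have not been able to write a working expression.
This is what I believe is very close to the solution, but I cannot make it work: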
private static function _fixQuotesInString($string)
{
return preg_replace('/(?<!,)"|"(?!,)/', '"', $string);
}
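With my limited knowledge, I would read it as: if you find a double quote, check whether it is not preceded by a comma or not followed by one, and if so, replace it. However, experience shows that this is not what happens.
When you post a solution, it would be great if you could add a "verbal explanation" of the RegEx so that I can understand it.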
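Answer 0 (score: 4)
Your regex will also replace the quotes in ," and ", because a match only has to satisfy one of the two alternatives, not both. Instead, you can use
(?<!,)"(?!,)
which requires that the quote has no comma on either side.
Note that this solution still has a potential problem if a " is legitimately used next to a comma, so you should consider fixing the data at the source.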
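To make the difference concrete, here is a minimal sketch that runs both patterns over a short made-up line. The sample line and the &quot; replacement are assumptions chosen for illustration; they are not taken from the export.

```php
<?php
// Made-up sample: one quoted field that contains an inch mark.
$line = 'a,"The 4.3" screen",b';

// Pattern from the question: "not preceded by a comma" OR "not followed by a comma".
$alternation = '/(?<!,)"|"(?!,)/';
// Pattern from this answer: "not preceded by a comma" AND "not followed by a comma".
$combined = '/(?<!,)"(?!,)/';

// The replacement string is an assumption for illustration only.
$withAlternation = preg_replace($alternation, '&quot;', $line);
$withCombined = preg_replace($combined, '&quot;', $line);

echo $withAlternation, PHP_EOL; // all three quotes are replaced
echo $withCombined, PHP_EOL;    // only the inch mark is replaced
```

With the alternation, the opening and closing quotes of the field each satisfy one branch, so they are replaced along with the inch mark. With both lookarounds attached to the same quote, only the inch mark qualifies.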
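Answer 1 (score: 3)
If you just want to parse each comma-delimited field, which may or may not be wrapped in double quotes, you can use this regex:
(?:^|,)("?)(.*?)\1(?=,(?!\s)|$)
Group 2 is populated with each comma-delimited value. If the value is opened by a quote, then a closing quote is required to end it, and in either case the value must be followed by a comma that is not followed by whitespace, or by the end of the line.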
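<?php
// Replace this placeholder with one line of the CSV export.
$sourcestring = "your source string";
// Same pattern as above: group 2 of each match holds one field value.
preg_match_all('/(?:^|,)("?)(.*?)\1(?=,(?!\s)|$)/ims', $sourcestring, $matches);
echo "<pre>" . print_r($matches, true);
?>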
$matches Array:
(
[0] => Array
(
[0] => 200000
[1] => ,Samsung Galaxy S2
[2] => ,$399.00
[3] => ,8806085359376
[4] => ,null
[5] => ,Free ground shipping
[6] => ,New
[7] => ,In Stock
[8] => ,Samsung
[9] => ,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh"
[10] => ,Mobile > Manufacturer > Samsung
[11] => ,
)
[1] => Array
(
[0] =>
[1] =>
[2] =>
[3] =>
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] => "
[10] =>
[11] =>
)
[2] => Array
(
[0] => 200000
[1] => Samsung Galaxy S2
[2] => $399.00
[3] => 8806085359376
[4] => null
[5] => Free ground shipping
[6] => New
[7] => In Stock
[8] => Samsung
[9] => Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3" SUPER AMOLED Plus The 4.3" SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3" Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh
[10] => Mobile > Manufacturer > Samsung
[11] =>
)
)
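Since the question asked for a verbal explanation, here is a sketch of the same pattern written in extended mode (the x flag), with each part commented. The helper name splitLooseCsvLine and the sample line are hypothetical; the pattern itself is the one given above, only with ~ as the delimiter and whitespace added for readability.

```php
<?php
// Hypothetical helper, not part of the original answer.
function splitLooseCsvLine(string $line): array
{
    $pattern = '~
        (?:^|,)      # a field starts at the beginning of the line or right after a comma
        ("?)         # group 1: an optional opening double quote
        (.*?)        # group 2: the field value, matched as lazily as possible
        \1           # whatever group 1 captured: a closing quote, or nothing
        (?=          # lookahead: the field must be followed by
            ,(?!\s)  #   a comma that is not followed by whitespace
            |$       #   or the end of the line
        )
    ~xm';

    preg_match_all($pattern, $line, $matches);

    // Group 2 holds the field values without their surrounding quotes.
    return $matches[2];
}

// Made-up sample line with an inch mark and a comma inside the quoted field.
$fields = splitLooseCsvLine('1,"4.3" screen, slim",x,');
print_r($fields);
```

The lazy (.*?) keeps extending the value until the closing quote (if any) and the lookahead both succeed, which is why an inch mark followed by a space does not end the field.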
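Since your source text is comma-delimited and the delimiting commas have no surrounding spaces, you can work around cases like "excellent occasion, 4.3", samsung" by removing the stray quotes with:
Regex: (?<!,)(")(?!,\S)
Replace with: nothing (an empty string)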
<?php
$sourcestring="your source string";
echo preg_replace('/(?<!,)(")(?!,\S)/ims','',$sourcestring);
?>
$sourcestring after replacement:
200000,Samsung Galaxy S2,$399.00,8806085359376,null,Free ground shipping,New,In Stock,Samsung,"Vivid‧Fast‧Slim The new GALAXY SII Plus makes your life even smarter! 4.3 SUPER AMOLED Plus The 4.3 SUPER AMOLED Plus display goes a step beyond the already remarkable SUPER AMOLED to provide enhanced readability, a slimmer design, and better battery consumption for the best viewing value of any smartphone. Full-Touch Display Size: 4.3 Resolution: 480 x 800pixel Platform Operation Platform: Android v4.1 (Jelly Bean) TOUCHWiZ v4.0 User Interface (upto 7 pages widget desktop) Band^ UMTS(850 / 900 / 1900 / 2100MHz)+ Battery Capacity: 1650mAh",Mobile > Manufacturer > Samsung,
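Once the stray quotes are gone, the line can be handed to PHP's regular CSV parser. The following is a minimal sketch that combines the replacement above with str_getcsv; the function name parseLooseCsvLine and the shortened sample line are assumptions for illustration. Note one limitation of the pattern: a closing quote at the very end of a line is not followed by a comma and a non-space character, so it would be stripped as well.

```php
<?php
// Hypothetical helper: strip stray quotes, then parse the line as CSV.
function parseLooseCsvLine(string $line): array
{
    // Remove every double quote that is not preceded by a comma and
    // not followed by a comma plus a non-whitespace character.
    // Fall back to the raw line if preg_replace reports an error (null).
    $clean = preg_replace('/(?<!,)"(?!,\S)/', '', $line) ?? $line;

    // Separator, enclosure and escape character are passed explicitly.
    return str_getcsv($clean, ',', '"', '\\');
}

// Shortened, made-up variant of the export line.
$row = parseLooseCsvLine('200000,Samsung,"The 4.3" display, slimmer",Mobile > Samsung,');
print_r($row);
```

The comma inside the description survives because the opening and closing quotes of that field are left in place, so str_getcsv still treats it as a single enclosed value.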