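I'm receiving some structured data for my PHP application, but the format is somewhat unpredictable and hard to work with. I have no say over the initial format of the data. What I get is a single string (sample below).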
[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78],[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80],[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64],[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70],[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]
The above is data for 5 football players. This is what I need to get:
[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78]
[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80]
[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64]
[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70]
[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]
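In the example above I did by hand what I need PHP to do reliably. As you can see, each player has one group of data. To split the big string into individual players I can't just split on "],[", because that substring also occurs an unpredictable number of times inside each player's data.
Each player has a number of statistics (accurate_pass, touches, and so on), but they don't all have the same ones. For example, player #1 has "saves" and the others don't; player #4 has "won_contest" and the others don't. There is no way to know who will have which statistics, which means I can't just count commas until the next player or anything like that.
Each player has a number before his name, but that number has an unpredictable number of digits and can't be told apart from other numbers that may appear in the string.
The one thing I think occurs consistently for every player is the final part: there are always 3 integers separated by commas right before the last closing bracket. This kind of substring (INT,INT,INT]) doesn't seem to occur anywhere else. Maybe this could be useful?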
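A rough sketch of that idea, assuming the trailing ,INT,INT,INT] really never occurs inside a player's data (this has only been checked against the sample above). The input is a shortened version of the sample, and the names $pattern and $players are just illustrative:

```php
<?php
// Shortened sample: two players taken from the data above, with fewer stats.
$str = "[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['saves',[4]]]],1,'GK',1,0,0,'GK',31,183,78],"
     . "[13471,'Adriano',6.72,[[['accurate_pass',[16]],['aerial_won',[2]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]";

// Lazily match from an opening "[" up to the first ",INT,INT,INT]" that is
// followed either by ",[" (the next player) or by the end of the string.
$pattern = '/\[.*?,\d+,\d+,\d+\](?=,\[|$)/s';

preg_match_all($pattern, $str, $matches);
$players = $matches[0];

// One player per line.
foreach ($players as $player) {
    echo $player, "\n";
}
```

This depends entirely on that one observation about the data, so it would silently produce wrong splits if a future feed contained the same pattern somewhere in the middle of a player.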
Answer 0 (score: 1)
The "hard" way is bracket counting (not very common in PHP, more common in text-parsing languages)...
<?php
$str = "[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78],[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80],[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64],[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70],[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]";
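// Every chunk after the first starts with the ',' that separates two players,
// so the first chunk is seeded with a ',' as well.
$line = ',';
$paren_count = 0;
$lines = array();
for ($i = 0; $i < strlen($str); $i++)
{
    $line .= $str[$i];
    if ($str[$i] == '[') $paren_count++;
    elseif ($str[$i] == ']')
    {
        $paren_count--;
        // Back at depth 0: one complete player has been read.
        if ($paren_count == 0)
        {
            $lines[] = substr($line, 1); // drop the leading ','
            $line = '';
        }
    }
}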
print_r($lines);
?>
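Answer 1 (score: 1)
@Boundless's answer looks right: you can use json_decode, but first you need to do something to the string you receive so that it becomes a valid JSON string.
This worked for me: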
<?php
$str = "[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78],[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80],[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64],[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70],[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]";
$str = '[' . $str . ']';
$str = str_replace('\'','"', $str);
//convert string to array
$arr = json_decode($str);
//now it's a php array so you can access any value
//echo '<pre>';
//print_r( $arr );
//echo '</pre>';
echo $arr[0][1]; // prints "Víctor Valdés"
?>
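Once the string decodes, the per-player statistics can be turned into a lookup by name, which sidesteps the problem that not every player has the same stats. The following is only a sketch: the input is a shortened version of the question's sample, and the keys 'number', 'name' and 'stats' are names chosen here for illustration, not something the data defines.

```php
<?php
// Shortened sample: two players from the question, with fewer stats each.
$str = "[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['saves',[4]]]],1,'GK',1,0,0,'GK',31,183,78],"
     . "[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['won_contest',[1]]]],2,'DC',15,0,0,'D(C)',22,181,70]";

// Same conversion as above; passing true makes json_decode return nested arrays.
$players = json_decode('[' . str_replace("'", '"', $str) . ']', true);

$result = array();
foreach ($players as $player) {
    $stats = array();
    // $player[3][0] is the list of [stat_name, [value]] pairs.
    foreach ($player[3][0] as $pair) {
        $stats[$pair[0]] = $pair[1][0];
    }
    $result[] = array(
        'number' => $player[0], // the number that precedes the name
        'name'   => $player[1],
        'stats'  => $stats,
    );
}

print_r($result);
```

A stat a player does not have is simply absent from that player's 'stats' array, so isset($result[0]['stats']['won_contest']) can be used to check before reading it.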
Answer 2 (score: 0)
Try parsing it as JSON, then pull out what you want. Assuming the data comes in chunks of 4, you could try:
$arr = json_decode($str);
$chunks = array();
// Collect each complete group of 4 consecutive values into its own array.
for ($i = 0; $i + 3 < count($arr); $i += 4)
{
    $chunks[] = array($arr[$i], $arr[$i + 1], $arr[$i + 2], $arr[$i + 3]);
}
Answer 3 (score: 0)
Why not just count the [ characters in a loop? Here is a quick, untested loop to get you started.
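$output = array('');
$brackets = 0;
$index = 0;
foreach (str_split($input) as $ch) {
    if ($ch == '[') {
        $brackets++;
    }
    if ($brackets === 0) {
        continue; // skip the comma between two top-level groups
    }
    $output[$index] .= $ch;
    if ($ch == ']') {
        $brackets--;
        // Depth 0 again: the current group is complete, start the next one.
        if ($brackets === 0) {
            $index++;
            $output[$index] = '';
        }
    }
}
array_pop($output); // drop the empty string left after the last group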
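Not very elegant, though...
Answer 4 (score: 0)
Your string looks like JSON, but it isn't valid JSON, so json_decode() won't work on it as it is.
Your specific case can be turned into valid JSON by wrapping the string in a pair of [] and replacing the single quotes with double quotes: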
$string = str_replace("'", '"', $your_string);
var_dump(json_decode('[' . $string . ']'));
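See this example.
Of course, the best solution is to make sure you are supplied with valid JSON, because this will break easily if your text strings contain, for example, double quotes.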
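Because json_decode() just returns null when the converted string is not valid JSON, it may be worth checking for that explicitly. A minimal sketch, where decode_players() is a hypothetical helper name and both inputs are tiny made-up strings used only to show the two outcomes:

```php
<?php
// Hypothetical helper (not part of the answers above): convert, decode,
// and fail loudly instead of silently returning null.
function decode_players($raw)
{
    $json = '[' . str_replace("'", '"', $raw) . ']';
    $data = json_decode($json, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Not valid JSON after conversion: ' . json_last_error_msg());
    }
    return $data;
}

// Made-up input shaped like the sample: decodes fine.
$ok = decode_players("[1,'GK',[2]],[3,'DC',[4]]");

// Made-up input with a double quote inside a text value: the conversion breaks.
try {
    decode_players("[1,'say \"hi\"',[2]]");
    $failed = false;
} catch (RuntimeException $e) {
    $failed = true;
}
```

An apostrophe inside a name would trigger the same failure, since the replacement turns it into a stray double quote as well.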