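I'm working on a program that needs to process very large JSON files, so I want to use a streaming, event-oriented reader (such as jsonstreamingparser) to avoid loading the whole structure into memory at once. What concerns me is the object structure that seems to be required to make this work. For example, say I'm writing a program like Evite to send invitations to events, with a JSON structure like: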
{
    "title": "U2 Concert",
    "location": "San Jose",
    "attendees": [
        {"email": "foo@bar.com"},
        {"email": "baz@bar.com"}
    ],
    "date": "July 4, 2015"
}
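What I'd like to do is have a programmatic "event" fire whenever the stream encounters a new attendee, and send out an invitation email. However, I can't do that, because at that point the stream has not yet reached the date of the event. Of course, for this example it would be fine to just read everything into memory, but my data set has complex objects where the "attendees" property is, and there could be tens of thousands of them.
Another "solution" is simply to mandate that all the required "parent" properties come first, but that is exactly the restriction I'm trying to find a way around.
Any ideas?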
Answer 0 (score: 0)
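This is another 'tree walking' problem. The JSON streaming parser reads the source file and starts to build a 'tree'. It does this by 'collecting' elements and storing them in memory. So that we can process each entry, it 'emits events' at convenient times. This means it calls your functions as required, passing them useful values.
Examples of 'tree events' are: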
...
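As a rough illustration of the kind of calls involved, the sketch below uses a hypothetical TraceListener with the same method names as the listener shown later in this answer. The calls are made by hand to mimic what a parser might emit for [{"title": "U2 Concert"}]. No real parser is involved, and the exact sequence the library produces is an assumption here.

```php
<?php
// Hypothetical listener that only records the name of each call it receives.
class TraceListener {
    public $trace = array();

    public function start_document() { $this->trace[] = 'start_document'; }
    public function end_document()   { $this->trace[] = 'end_document'; }
    public function start_object()   { $this->trace[] = 'start_object'; }
    public function end_object()     { $this->trace[] = 'end_object'; }
    public function start_array()    { $this->trace[] = 'start_array'; }
    public function end_array()      { $this->trace[] = 'end_array'; }
    public function key($key)        { $this->trace[] = 'key: ' . $key; }
    public function value($value)    { $this->trace[] = 'value: ' . $value; }
}

// Hand-written call sequence standing in for the parser (an assumption,
// not output captured from the library).
$trace = new TraceListener();
$trace->start_document();
$trace->start_array();
$trace->start_object();
$trace->key('title');
$trace->value('U2 Concert');
$trace->end_object();
$trace->end_array();
$trace->end_document();

echo implode("\n", $trace->trace), "\n";
```

Each line printed corresponds to one point at which our own code gets a chance to run.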
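The 'example' code supplied with the parser is a program that uses the parser to build a tree in memory. I just modified that example so that it calls our function whenever it has a complete 'Event' stored.
So, how do we identify a complete 'Event'?
The input file consists of an array in which each entry is a JSON 'object'. Each object contains the 'sub-entries' that make up the data of the object.
Now, as we walk the 'tree' while building it, our code is called at various points as shown above, in particular at the 'start' and 'end' of objects and arrays. We need to 'collect' all the data for the outer object.
How do we identify it? We record where we are in the 'tree' as processing proceeds. We do this by keeping track of the depth of 'nesting' in the tree, hence the 'level'. The 'start' of an object 'nests' down one level; the 'end' of an object 'un-nests' one level.
The objects we are interested in are those at 'level 1'.
The supplied code:
1) Tracks the 'level' and calls our function when it reaches the end of an object at 'level 1'.
2) Accumulates the data in an appropriate structure from the start of a 'level 1' object.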
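The level bookkeeping described above can be sketched in isolation. DepthSketch below is a hypothetical, much simplified stand-in for the real listener: it only handles flat objects (no nested values), and it is driven by hand rather than by the parser. It assumes the input is an array of objects, so an object ends when the level drops from 2 to 1.

```php
<?php
// Hypothetical, simplified listener: flat objects only, no nested values.
class DepthSketch {
    public $level = 0;
    public $onComplete;          // callable invoked with each finished object
    private $current = array();  // data of the object currently being read
    private $key = null;         // most recent key seen

    public function __construct($onComplete) { $this->onComplete = $onComplete; }

    public function start_array() { $this->level++; }
    public function end_array()   { $this->level--; }

    public function start_object() {
        $this->level++;
        if ($this->level === 2) {
            $this->current = array(); // a new top-level entry begins
        }
    }

    public function key($key)     { $this->key = $key; }
    public function value($value) { $this->current[$this->key] = $value; }

    public function end_object() {
        $prev = $this->level;
        $this->level--;
        // Level went from 2 to 1: one complete entry of the outer array.
        if ($prev === 2 && $this->level === 1) {
            call_user_func($this->onComplete, $this->current);
        }
    }
}

$titles = array();
$sketch = new DepthSketch(function ($event) use (&$titles) {
    $titles[] = $event['title'];
});

// Hand-driven stand-in for parsing [{"title": "A"}, {"title": "B"}].
$sketch->start_array();
foreach (array('A', 'B') as $title) {
    $sketch->start_object();
    $sketch->key('title');
    $sketch->value($title);
    $sketch->end_object();
}
$sketch->end_array();

print_r($titles);
```

The callback fires once per entry, as soon as that entry is complete, so only one entry needs to be held in memory at a time.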
Requirements:
1) Call a 'callable' whenever there is a complete 'Event' that can be processed.
Assumptions:
Processing:
Source code:
Code: index.php
<?php // https://stackoverflow.com/questions/31079129/how-to-handle-nested-objects-in-processing-a-json-stream

require_once __DIR__ . '/vendor/jsonstreamingparser/src/JsonStreamingParser/Parser.php';
require_once __DIR__ . '/vendor/jsonstreamingparser/src/JsonStreamingParser/Listener/IdleListener.php';
require_once __DIR__ . '/Q31079129Listener.php';

/**
 * The input file consists of a JSON array of 'Events'.
 *
 * The important point is that when the file is being 'parsed' the 'listener' is
 * 'walking' the tree.
 *
 * Therefore
 *   1) Each 'Event' is at 'level 1' in the tree.
 *
 * Event Level Changes:
 *   Start: level will go from 1 => 2
 *   End:   level will go from 2 => 1 !!!!
 *
 * Actions:
 *   The 'processEvent' function will be called when the
 *   'Event Level' changes from 2 to 1, i.e. at the end of an 'Event'.
 */
define('JSON_FILE', __DIR__ . '/Q31079129.json');

/**
 * This is called when one 'Event' is complete
 *
 * @param Q31079129Listener $listener
 */
function processEvent($listener) {
    echo '<pre>', '+++++++++++++++';
    print_r($listener->get_event());
    echo '</pre>';
}

// ----------------------------------------------------------------------
// the 'Listener'
$listener = new Q31079129Listener();

// setup the 'Event' Listener that will be called with each complete 'Event'
$listener->whenLevelAction = 'processEvent';

// process the input stream
$stream = fopen(JSON_FILE, 'r');
try {
    $parser = new JsonStreamingParser_Parser($stream, $listener);
    $parser->parse();
}
catch (Exception $e) {
    // make sure the file is closed before passing the error on
    fclose($stream);
    throw $e;
}
fclose($stream);
exit;
Code: Q31079129Listener.php
<?php // https://stackoverflow.com/questions/31079129/how-to-handle-nested-objects-in-processing-a-json-stream

/**
 * This is the supplied example modified:
 *
 * 1) Record the current 'depth' of 'nesting' in the current object being parsed.
 */
class Q31079129Listener extends JsonStreamingParser\Listener\IdleListener {

    public $whenLevelAction = null;

    protected $event;
    protected $prevLevel;
    protected $level;

    private $_stack;
    private $_keys;

    public function get_event() {
        return $this->event;
    }

    public function get_prevLevel() {
        return $this->prevLevel;
    }

    public function get_level() {
        return $this->level;
    }

    public function start_document() {
        $this->prevLevel = 0;
        $this->level = 0;
        $this->_stack = array();
        $this->_keys = array();
        // echo '<br />start of document';
    }

    public function end_document() {
        // echo '<br />end of document';
    }

    public function start_object() {
        $this->prevLevel = $this->level;
        $this->level++;
        $this->_start_complex_value('object');
    }

    public function end_object() {
        $this->prevLevel = $this->level;
        $this->level--;
        $this->_end_complex_value();
    }

    public function start_array() {
        $this->prevLevel = $this->level;
        $this->level++;
        $this->_start_complex_value('array');
    }

    public function end_array() {
        $this->prevLevel = $this->level;
        $this->level--;
        $this->_end_complex_value();
    }

    public function key($key) {
        $this->_keys[] = $key;
    }

    public function value($value) {
        $this->_insert_value($value);
    }

    private function _start_complex_value($type) {
        // We keep a stack of complex values (i.e. arrays and objects) as we build them,
        // tagged with the type that they are so we know how to add new values.
        $current_item = array('type' => $type, 'value' => array());
        $this->_stack[] = $current_item;
    }

    private function _end_complex_value() {
        $obj = array_pop($this->_stack);

        // If the level has just gone from 2 to 1, we're done parsing the
        // current complete event, so we can move the result into place so
        // that get_event() can return it. Otherwise, we associate the value
        // with its parent on the stack.
        // var_dump(__FILE__.__LINE__, $this->prevLevel, $this->level, $obj);
        if ($this->prevLevel == 2 && $this->level == 1) {
            if (!is_null($this->whenLevelAction)) {
                $this->event = $obj['value'];
                call_user_func($this->whenLevelAction, $this);
                $this->event = null;
            }
        }
        elseif (!empty($this->_stack)) {
            // The outermost value has no parent, so there is nothing to insert into.
            $this->_insert_value($obj['value']);
        }
    }

    // Inserts the given value into the top value on the stack in the appropriate way,
    // based on whether that value is an array or an object.
    private function _insert_value($value) {
        // Grab the top item from the stack that we're currently parsing.
        $current_item = array_pop($this->_stack);

        // Examine the current item, and then:
        //   - if it's an object, associate the newly-parsed value with the most recent key
        //   - if it's an array, push the newly-parsed value to the array
        if ($current_item['type'] === 'object') {
            $current_item['value'][array_pop($this->_keys)] = $value;
        } else {
            $current_item['value'][] = $value;
        }

        // Replace the current item on the stack.
        $this->_stack[] = $current_item;
    }
}
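To connect this back to the invitation example in the question: once the listener hands over a complete event, the date is available regardless of where it appeared in the JSON. Below is a minimal sketch of such a handler, assuming each event has the shape shown in the question. buildInvitations is a hypothetical helper introduced only for illustration, and nothing is actually mailed.

```php
<?php
// Hypothetical helper: builds one invitation line per attendee of a
// complete event (assumed to have title, location, attendees and date).
function buildInvitations(array $event) {
    $invitations = array();
    foreach ($event['attendees'] as $attendee) {
        $invitations[] = sprintf(
            'To %s: %s, %s, %s',
            $attendee['email'],
            $event['title'],
            $event['location'],
            $event['date']
        );
    }
    return $invitations;
}

// The event from the question, as the listener would hand it over.
$event = array(
    'title'     => 'U2 Concert',
    'location'  => 'San Jose',
    'attendees' => array(
        array('email' => 'foo@bar.com'),
        array('email' => 'baz@bar.com'),
    ),
    'date'      => 'July 4, 2015',
);

foreach (buildInvitations($event) as $line) {
    echo $line, "\n";
}
```

In index.php, processEvent could pass $listener->get_event() to a function like this instead of printing it.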