Most efficient way to loop through a large amount of data from an API and import it into WordPress?

Time: 2019-06-17 21:17:21

Tags: php wordpress loops memory timeout

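I am using a PHP cURL request to fetch a large amount of data (football matches) from an API (10,000 rows and growing). I grab everything at once as an array and then loop through the whole thing. Each row (a match) is checked to see whether a corresponding post already exists in WordPress. If it does, the row is skipped; if not, it is imported as a new post.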

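I have had to raise a number of PHP settings (memory and execution-time limits) to get this to run, but with this much data it is becoming unsustainable. What is the most efficient way to split the work into smaller chunks?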

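The API does let you pass a "page" parameter. Is it possible to use the page parameter to split the call into smaller chunks while also making sure PHP does not time out? Would individual JavaScript AJAX requests be more efficient? I am looking for the most efficient approach from the server's point of view.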

private function syncMatches($event_id) {

    // Fetches every match for the event in a single API call.
    $get_matches = $this->API->GetMatches($event_id);

    foreach ($get_matches as $match) {

        $match_id        = $match->MatchID ?? null;
        $match_date      = $match->EventDate ?? null;
        $match_away_team = $match->AwayTeamNameFull ?? null;
        $match_home_team = $match->HomeTeamNameFull ?? null;

        // Skip rows that are missing any required field.
        if (!$match_id || !$match_date || !$match_away_team || !$match_home_team) {
            continue;
        }

        // Skip matches that already have a corresponding post.
        $post_id = $this->existingRowHandler('match', 'match_id', $match_id);

        if ($post_id !== 0) {
            continue;
        }

        $post_name = $match_date . ' - ' . $match_away_team . ' @ ' . $match_home_team;
        $post_meta = $this->createPostMeta($match);
        $this->insertPost($post_id, $post_name, 'match', $match_id, $post_meta);

    }

    return time();

}

1 Answer:

Answer 0: (score: 0)

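I would try making an initial call to the API to get the total number of items. Then make multiple cURL calls to put the data into one or more files. After that, read the data back either from the single file or from the multiple files (if you split the data up).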

Here is an example that creates multiple files:


function get_total_count() {

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'https://yourapiurl/?page=1');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $response = curl_exec($ch);
    curl_close($ch);

    // curl_exec() returns false on failure, so bail out before decoding
    if ($response === false) {
        return 0;
    }

    $data = json_decode($response);

    // Assumes the API reports the total number of items as totalCount
    if (!isset($data->totalCount)) {
        return 0;
    }

    // Items per page; use whatever number you want to break it up by
    $per_page = 1000;

    // Round up so a partial final page is still counted
    return (int) ceil($data->totalCount / $per_page);

}



function create_json_files() {

    // Get the number of pages reported by the API
    $report_pages = get_total_count();

    // Names of the JSON files that get created
    $json_files = array();

    // Fetch each page and write its raw response to its own file.
    // A for loop is used because range(1, 0) would yield [1, 0].
    for ($page = 1; $page <= $report_pages; $page++) {

        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, 'https://yourapiurl/?page=' . $page);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

        $data = curl_exec($ch);
        curl_close($ch);

        // Skip pages that failed to download rather than writing an empty file
        if ($data === false) {
            continue;
        }

        $file_name = "game_json_" . $page . ".json";

        // Write the page to disk and remember the file name
        file_put_contents($file_name, $data);
        $json_files[] = $file_name;

    }

    // Return the file names for later processing
    return $json_files;

}

function do_stuff_with_your_data() {

    $data_files = create_json_files();

    foreach ($data_files as $file) {

        // Do stuff with your data ($file is the name of one JSON file)

    }

}
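
As a rough sketch of what the "do stuff with your data" step might look like, the two helpers below read one of the game_json_*.json files and keep only the matches that have not been imported yet. The function names load_matches_from_file and filter_new_matches are hypothetical, and the code assumes each file decodes to a flat JSON array of objects carrying the MatchID field from the question; if the API wraps the rows in another object, the decoding step would need adjusting. How the list of already-imported IDs is gathered is left open.

```php
<?php
// Hypothetical helper: decode one page file into an array of match objects.
// Assumes the file holds a JSON array of objects; returns [] otherwise.
function load_matches_from_file(string $path): array
{
    if (!is_readable($path)) {
        return [];
    }
    $raw = file_get_contents($path);
    if ($raw === false) {
        return [];
    }
    $decoded = json_decode($raw);
    return is_array($decoded) ? $decoded : [];
}

// Hypothetical helper: keep only matches whose MatchID is not already imported.
// $existing_ids is a plain list of integer or string IDs.
function filter_new_matches(array $matches, array $existing_ids): array
{
    // Flip the ID list so each lookup is a hash check rather than a scan.
    $seen = array_flip($existing_ids);
    $new  = [];

    foreach ($matches as $match) {
        // Skip rows without an ID, as the loop in the question does.
        if (!isset($match->MatchID)) {
            continue;
        }
        if (isset($seen[$match->MatchID])) {
            continue;
        }
        // Guard against duplicates within this batch.
        $seen[$match->MatchID] = true;
        $new[] = $match;
    }

    return $new;
}
```

Inside the foreach in do_stuff_with_your_data(), each $file could be passed to load_matches_from_file() and the result to filter_new_matches(), so that only one page of matches is held in memory at a time and the existence check is a single array lookup per row.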