A faster way to process an XML file with PHP

Time: 2015-04-14 08:08:47

Tags: php xml iteration

I have an XML file named flight-itinerary.xml. A trimmed-down version is shown below.

<itin line="1" dep="LOS" arr="ABV">
    <flt>
        <fltav>
            <cb>1</cb>
            <id>C</id>
            <av>10</av>
            <cur>NGN</cur>
            <CurInf>2,0.01,0.01</CurInf>
            <pri>15000.00</pri>
            <tax>30800.00</tax>
            <fav>1</fav>
            <miles></miles>
            <fid>11</fid>
            <finf>0,0,1</finf>

            <cb>2</cb>
            <id>J</id>
            <av>10</av>
            <cur>NGN</cur>
            <CurInf>2,0.01,0.01</CurInf>
            <pri>13000.00</pri>
            <tax>26110.00</tax>
            <fav>1</fav>
            <miles></miles>
            <fid>12</fid>
            <finf>0,0,0</finf>
        </fltav>
    </flt>
</itin>

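The full file contains 8 itinerary <itin> elements. The <fltav> element of each <itin> element contains 11 of the groups that run from <cb>1</cb> to <finf>0,0,1</finf>.

Here is the code I use to process the file: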

<?php

function processFlightsData()
{
    $data = array();
    $dom= new DOMDocument();
    $dom->load('flight-itinerary.xml');

    $classbands  = $dom->getElementsByTagName('classbands')->item(0);
    $bands       = $classbands->getElementsByTagName('band');
    $itineraries = $dom->getElementsByTagName('itin');
    $counter     = 0;

    foreach($itineraries AS $itinerary)
    { 
        $flt = $itinerary->getElementsByTagName('flt')->item(0);

        $dep = $flt->getElementsByTagName('dep')->item(0)->nodeValue;
        $arr = $flt->getElementsByTagName('arr')->item(0)->nodeValue;

        $time_data       = $flt->getElementsByTagName('time')->item(0);
        $departure_day   = $time_data->getElementsByTagName('ddaylcl')->item(0)->nodeValue;
        $departure_time  = $time_data->getElementsByTagName('dtimlcl')->item(0)->nodeValue;
        $departure_date  = $departure_day. ' '. $departure_time;
        $arrival_day     = $time_data->getElementsByTagName('adaylcl')->item(0)->nodeValue;
        $arrival_time    = $time_data->getElementsByTagName('atimlcl')->item(0)->nodeValue;
        $arrival_date    = $arrival_day. ' '. $arrival_time;
        $flight_duration = $time_data->getElementsByTagName('duration')->item(0)->nodeValue;

        $flt_det       = $flt->getElementsByTagName('fltdet')->item(0);
        $airline_id    = $flt_det->getElementsByTagName('airid')->item(0)->nodeValue;
        $flt_no        = $flt_det->getElementsByTagName('fltno')->item(0)->nodeValue;
        $flight_number = $airline_id. $flt_no;
        $airline_type  = $flt_det->getElementsByTagName('eqp')->item(0)->nodeValue;
        $stops         = $flt_det->getElementsByTagName('stp')->item(0)->nodeValue;

        $av_data = $flt->getElementsByTagName('fltav')->item(0);

        $cbs     = iterator_to_array($av_data->getElementsByTagName('cb')); //11 entries
        $ids     = iterator_to_array($av_data->getElementsByTagName('id')); //ditto
        $seats   = iterator_to_array($av_data->getElementsByTagName('av')); //ditto
        $curr    = iterator_to_array($av_data->getElementsByTagName('cur')); //ditto
        $price   = iterator_to_array($av_data->getElementsByTagName('pri')); //ditto
        $tax     = iterator_to_array($av_data->getElementsByTagName('tax')); //ditto
        $miles   = iterator_to_array($av_data->getElementsByTagName('miles')); //ditto
        $fid     = iterator_to_array($av_data->getElementsByTagName('fid')); //ditto    

        $inner_counter = 0;

        for($i = 0; $i < count($ids); $i++)
        {
            $data[$counter][$inner_counter] = array
            (
                'flight_number'                   => $flight_number,
                'flight_duration'                 => $flight_duration, 
                'departure_date'                  => $departure_date,
                'departure_time'                  => substr($departure_time, 0, 5),
                'arrival_date'                    => $arrival_date,
                'arrival_time'                    => substr($arrival_time, 0, 5),
                'departure_airport_code'          => $dep,
                'departure_airport_location_name' => get_airport_data($dep, $data_key='location'),
                'arrival_airport_code'            => $arr,
                'arrival_airport_location_name'   => get_airport_data($arr, $data_key='location'),
                'stops'                           => $stops,
                'cabin_class'                     => $ids[$i]->nodeValue,
                'ticket_class'                    => $ids[$i]->nodeValue,
                'ticket_class_nicename'           => formate_ticket_class_name($ids[$i]->nodeValue),
                'available_seats'                 => $seats[$i]->nodeValue,
                'currency'                        => $curr[$i]->nodeValue,
                'price'                           => $price[$i]->nodeValue,
                'tax'                             => $tax[$i]->nodeValue,
                'miles'                           => $miles[$i]->nodeValue,
            );

            ++$inner_counter;
        }

        ++$counter; // move on to the next itinerary so earlier results are not overwritten
    }

    return $data;
}

?>

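Now, the outer loop iterates 8 times, once for each <itin> element, and during each iteration of the outer loop the inner loop iterates 11 times. That is a total of 88 iterations per pass, and it causes serious performance problems. What I am looking for is a faster way to process the file. Any help would be greatly appreciated.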

1 Answer:

Answer 0: (score: 0)

I don't think the loop is the bottleneck. You should check the operations that are called inside the loop: get_airport_data and formate_ticket_class_name.
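One way to check this is to time the helper calls separately from the loop itself. The sketch below is only an illustration: the real get_airport_data is not shown in the question, so a hypothetical slow stand-in is used here, and time_calls is a helper name made up for this example.

```php
<?php
// Hypothetical stand-in: the real get_airport_data() is not shown in the question.
function get_airport_data($code, $data_key = 'location')
{
    usleep(2000); // simulate a slow lookup (for example a database or HTTP call)
    return $code . ' ' . $data_key;
}

// Illustrative helper: runs $fn $times times and returns the elapsed seconds.
function time_calls($fn, $times)
{
    $start = microtime(true);
    for ($i = 0; $i < $times; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// 88 calls mirror the 8 x 11 iterations described in the question.
$helper_seconds = time_calls(function () { get_airport_data('LOS'); }, 88);
$noop_seconds   = time_calls(function () { }, 88);

printf("helper: %.4fs, empty loop body: %.4fs\n", $helper_seconds, $noop_seconds);
```

If the first number dominates with the real helpers in place, the time is being spent in the helpers rather than in the iteration.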

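Running the code (without the helper operations) on many itin elements takes less than a second; see this fiddle: http://phpfiddle.org/main/code/7fpi-b3ka (note that the XML may not resemble yours, because I had to guess many of the missing elements).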

If one of the called operations adds significantly to the processing time, try calling it with batched data or caching its responses.
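As a rough sketch of the caching idea, the wrapper below remembers each result for the rest of the request. It assumes get_airport_data returns the same value for the same arguments; the stand-in body, the lookup counter, and the name get_airport_data_cached are hypothetical and exist only for this example.

```php
<?php
$GLOBALS['lookup_count'] = 0;

// Hypothetical stand-in for the real helper, which the question does not show.
function get_airport_data($code, $data_key = 'location')
{
    $GLOBALS['lookup_count']++; // count how often the expensive lookup really runs
    return 'location of ' . $code;
}

// Illustrative wrapper: caches each (code, key) result in a static array.
function get_airport_data_cached($code, $data_key = 'location')
{
    static $cache = array();
    $cache_key = $code . '|' . $data_key;
    if (!array_key_exists($cache_key, $cache)) {
        $cache[$cache_key] = get_airport_data($code, $data_key);
    }
    return $cache[$cache_key];
}

// Same shape as the question's loops: 88 iterations, two lookups each.
for ($i = 0; $i < 88; $i++) {
    get_airport_data_cached('LOS');
    get_airport_data_cached('ABV');
}

echo $GLOBALS['lookup_count'], "\n"; // prints 2
```

In the question's code the two lookups depend only on $dep and $arr, which do not change inside the inner loop, so another option is to call get_airport_data once per itinerary before the inner loop starts.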