How to convert this type of XML to CSV

Date: 2012-09-23 11:29:09

Tags: php xml csv xml-parsing

I have this type of XML file.

Sample portion of the file:

<!-- language: lang-xml -->

<ponudba podjetje="SO d.o.o." velja_od="23.09.2012 @ 12:30:48">
    <artikel koda="LS593EAR" naziv="HP ENVY 17-2199e" kategorija="Prenosniki" podkategorija="Hewlett Packard (HP)" v_akciji="ne" kosovnost="več">
    <opis>
    HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)
    </opis>
    <opis_detail>
    HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)<br/><table> <col width="25%" /> <col /> <tbody> <tr> <th>Procesor</th> <td>Intel® Core™ i7-2630QM / 2.00 GHz / Quad-Core</td> </tr> <tr> <th>Delovni pomnilnik</th> <td>8 GB DDR3</td> </tr> <tr> <th>Trdi disk</th> <td>1 TB (1000 GB) / 5400 / SATA</td> </tr> <tr> <th>LCD zaslon</th> <td>43,9 cm (17,3'') Full HD HP Ultra BrightView Infinity Display (1920x1080)</td> </tr> <tr> <th>Grafična kartica</th> <td>AMD Radeon™ HD 6850 Graphics</td> </tr> <tr> <th>Optična enota</th> <td>SuperMulti DVD-RW Double Layer</td> </tr> <tr> <th>USB 2.0</th> <td>2x</td> </tr> <tr> <th>USB 3.0</th> <td>1x</td> </tr>    <tr> <th>eSATA</th> <td>da</td> </tr> <tr> <th>HDMI</th> <td>da</td> </tr> <tr> <th>WiFi</th> <td>da</td> </tr> <tr> <th>Bluetooth</th> <td>da</td> </tr> <tr> <th>WWAN</th> <td>ne</td> </tr> <tr> <th>Spletna kamera</th> <td>da</td> </tr> <tr> <th>Card Reader</th> <td>da</td> </tr> <tr> <th>Express Card</th> <td>ne</td> </tr> <tr> <th>TV kartica</th> <td>ne</td> </tr> <tr> <th>Finger Print</th> <td>ne</td> </tr> <tr> <th>Vhodne naprave</th> <td>brez</td> </tr>     <tr> <th>Operacijski sistem</th> <td>Microsoft Windows 7 Home Premium (64 bit)</td> </tr> <tr> <th>Država uvoza</th> <td>Italijanska tipkovnica (priložene SLO nalepke)</td> </tr>  <tr> <th>Stanje modela</th> <td>HP Renew</td> </tr>     </tbody> </table>
    </opis_detail>
    <garancija_v_mesecih>12</garancija_v_mesecih>
    <cena_v_EUR>1.049,00</cena_v_EUR>
    <proizvajalec>HP</proizvajalec>
    <stanje>na zalogi</stanje>
    <url_foto_artikla>
    http://www.so-doo.si/media/catalog/product/cache/1/image/265x/9df78eab33525d08d6e5fb8d27136e95/c/0/c02034964.jpg.hri_4.jpg
    </url_foto_artikla>
    <vec_fotk_artikla>
    <slika href="http://www.so-doo.si/media/catalog/product/c/0/c02034982.jpg.hri_4.jpg"/>
    <slika href="http://www.so-doo.si/media/catalog/product/c/0/c02034991.jpg.hri_4.jpg"/>
    </vec_fotk_artikla>
    <teza_artikla_v_kg>2.9000</teza_artikla_v_kg>
    </artikel>

Now I am trying to convert this XML file to CSV using the following code:

<!-- language: lang-php -->

<?php

$filexml = 'so_feed.xml';
if (file_exists($filexml)) {
    $xml = simplexml_load_file($filexml);
    $f = fopen('sofeed.csv', 'w');
    foreach ($xml->naziv as $naziv) {
        fputcsv($f, get_object_vars($naziv), ',', '"');
    }
    fclose($f);
}
?>

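Here so_feed.xml is the input XML file, sofeed.csv is the output CSV file, and "naziv" is the node I need the data from.

But I only get an empty CSV file. Some help, please? :) When I use the "artikel" node instead, I do not get the complete information, only part of it :(

With the "artikel" array I get all the information except "koda" and "naziv", which in English are "SKU" and "product name". As you can see, this is the most important part and I cannot get it; I only get "Array" in place of that data.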

This is what my output CSV looks like:

<!-- language: lang-csv -->

Array,"HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)","HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz), 17.3'' FHD AG LED 3D, 8 GB DDR3 (2x 4 GB), 1 TB, BluRay, ATI Radeon HD6850 1024 MB, WiFi, Bluetooth, Webcam, 3D glasses, Microsoft Windows 7 Home Premium (64 bit)<br/><table>
        <col width=""25%"" />
        <col />
        <tbody>
            <tr>
                <th>Procesor</th>
                <td>Intel® Core™ i7-2630QM / 2.00 GHz / Quad-Core</td>
            </tr>
            <tr>
                <th>Delovni pomnilnik</th>
                <td>8 GB DDR3</td>
            </tr>
            <tr>
                <th>Trdi disk</th>
                <td>1 TB (1000 GB) / 5400 / SATA</td>
            </tr>
            <tr>
                <th>LCD zaslon</th>
                <td>43,9 cm (17,3'') Full HD HP Ultra BrightView Infinity Display (1920x1080)</td>
            </tr>
            <tr>
                <th>Grafična kartica</th>
                <td>AMD Radeon™ HD 6850 Graphics</td>
            </tr>
            <tr>
                <th>Optična enota</th>
                <td>SuperMulti DVD-RW Double Layer</td>
            </tr>
            <tr>
                <th>USB 2.0</th>
                <td>2x</td>
            </tr>
            <tr>
                <th>USB 3.0</th>
                <td>1x</td>
            </tr>           
            <tr>
                <th>eSATA</th>
                <td>da</td>
            </tr>
            <tr>
                <th>HDMI</th>
                <td>da</td>
            </tr>
            <tr>
                <th>WiFi</th>
                <td>da</td>
            </tr>
            <tr>
                <th>Bluetooth</th>
                <td>da</td>
            </tr>
            <tr>
                <th>WWAN</th>
                <td>ne</td>
            </tr>
            <tr>
                <th>Spletna kamera</th>
                <td>da</td>
            </tr>
            <tr>
                <th>Card Reader</th>
                <td>da</td>
            </tr>
            <tr>
                <th>Express Card</th>
                <td>ne</td>
            </tr>
            <tr>
                <th>TV kartica</th>
                <td>ne</td>
            </tr>
            <tr>
                <th>Finger Print</th>
                <td>ne</td>
            </tr>
            <tr>
                <th>Vhodne naprave</th>
                <td>brez</td>
            </tr>               
            <tr>
                <th>Operacijski sistem</th>
                <td>Microsoft Windows 7 Home Premium (64 bit)</td>
            </tr>
            <tr>
                <th>Država uvoza</th>
                <td>Italijanska tipkovnica (priložene SLO nalepke)</td>
            </tr>               
            <tr>
                <th>Stanje modela</th>
                <td>HP Renew</td>
            </tr>       
    </tbody>
    </table>",12,"1.049,00",HP,"na zalogi",http://www.so-doo.si/media/catalog/product/cache/1/image/265x/9df78eab33525d08d6e5fb8d27136e95/c/0/c02034964.jpg.hri_4.jpg,,2.9000

After applying the code:

<!-- language: lang-php -->

<?php
// The order here determines the order in the output CSV file
$columns = array(
    'koda',
    'naziv',
    'kategorija',
    'podkategorija',
    'v_akciji',
    'kosovnost'

);

// This will be used later on to correctly sort in the attribute values
// Note: the third parameter of "array_fill" determines what value to use
// in case a node lacks an attribute
$csv_blueprint = array_combine(
    $columns,
    array_fill(0, count($columns), '')
);

$data = array($columns);
$filexml = 'so_feed.xml';

if ( !file_exists($filexml) ) {
    // Do some error routine
} else {
    $xml = simplexml_load_file($filexml);
    $artikel = $xml->artikel;

    if ( !count($artikel) ) {
        // Stop processing 'cause there's nothing to do
    } else {
        foreach ( $artikel as $item )
        {
            // Clone the row blueprint to leave the original unspoiled
            $row = $csv_blueprint;

            $attr = $item->attributes();
            foreach ( $attr as $key => $value ) {
                $row[$key] = (string) $value;
            }
            // Append the current row to the overall output data but
            // be sure to strip off the indexes and pass a numerical array
            $data[] = array_values($row);
        }

        // The rest is up to you ... do whatever you need to :D
        var_dump($data);
    }
}
?>

I only get the few fields that are defined in the code:

<!-- language: lang-php -->

$columns = array(
    'koda',
    'naziv',
    'kategorija',
    'podkategorija',
    'v_akciji',
    'kosovnost'
);

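How do I output all the other attributes to the CSV?

1 Answer:

Answer 0 (score: 0)

Based on your comments, this is what I came up with. I have tried to be very verbose to help you understand what is going on.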

// The order here determines the order in the output CSV file
$columns = array(
    'koda',
    'naziv',
    'kategorija',
    'podkategorija',
    'v_akciji',
    'kosovnost'
);

// This will be used later on to correctly sort in the attribute values
// Note: the third parameter of "array_fill" determines what value to use
// in case a node lacks an attribute
$csv_blueprint = array_combine(
    $columns,
    array_fill(0, count($columns), '')
);

$data = array($columns);
$filexml = 'so_feed.xml';

if ( !file_exists($filexml) ) {
    // Do some error routine
} else {
    $xml = simplexml_load_file($filexml);
    $artikel = $xml->artikel;

    if ( !count($artikel) ) {
        // Stop processing 'cause there's nothing to do
    } else {
        foreach ( $artikel as $item ) {
            // Clone the row blueprint to leave the original unspoiled
            $row = $csv_blueprint;

            $attr = $item->attributes();
            foreach ( $attr as $key => $value ) {
                $row[$key] = (string) $value;
            }
            // Append the current row to the overall output data but
            // be sure to strip off the indexes and pass a numerical array
            $data[] = array_values($row);
        }

        // The rest is up to you ... do whatever you need to :D
        var_dump($data);
    }
}
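
The `var_dump($data)` call above leaves the actual file output open. A minimal sketch of that last step, assuming `$data` has the shape built above (a header row followed by one numerically indexed row per `<artikel/>`): the sample `$data` values are copied from the question's XML, and `write_csv` is a hypothetical helper name, not part of the original code.

```php
<?php
// Stand-in for the $data array built by the code above
$data = array(
    array('koda', 'naziv', 'kategorija', 'podkategorija', 'v_akciji', 'kosovnost'),
    array('LS593EAR', 'HP ENVY 17-2199e', 'Prenosniki', 'Hewlett Packard (HP)', 'ne', 'več'),
);

/**
 * Write each row of $rows as one CSV line to $path.
 * Returns the number of rows written (0 if the file cannot be opened).
 */
function write_csv($path, array $rows)
{
    $f = fopen($path, 'w');
    if ($f === false) {
        return 0;
    }

    $written = 0;
    foreach ($rows as $row) {
        // fputcsv adds the enclosure character wherever a field needs it
        if (fputcsv($f, $row, ',', '"', '\\') !== false) {
            $written++;
        }
    }

    fclose($f);
    return $written;
}

$count = write_csv('sofeed.csv', $data);
```

Passing the escape character explicitly keeps the call free of deprecation notices on newer PHP versions; the delimiter and enclosure match the ones used in the question.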

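Edit
Extended the previous code to reflect the need for additional columns.

  • Extend the $columns array in whatever way you need to incorporate the necessary data (e.g. opis)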

    $columns = array(
        'koda',
        'naziv',
        'kategorija',
        'opis',  // Arbitrarily added an extra column here
        'podkategorija',
        'v_akciji',
        'kosovnost'
    );
    
  • Iterate over <ponudba/> instead of artikel

    $xml = simplexml_load_file($filexml);
    //$artikel = $xml->artikel;
    $ponudbas = $xml->ponudba;
    ...
        foreach ( $ponudbas as $ponudba ) {
            // Clone the row blueprint to leave the original unspoiled
            $row = $csv_blueprint;
    
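  • Add the attributes of every artikel node, just as before

  • Additionally, iterate over the added columns (the ones that correspond to nodes) and save their values in the $row[NODE_NAME / COLUMN NAME] bucket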
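
The steps above could be put together roughly as follows. This is only a sketch under a few assumptions: the inline XML is a trimmed copy of the sample from the question, the split into `$attribute_columns` and `$node_columns` is an illustrative naming choice, and the loop runs over `$xml->artikel` because in the sample `<ponudba/>` is the root element that SimpleXML returns directly. If the real feed wraps several `<ponudba/>` elements in another root, an outer loop over `$xml->ponudba` would be needed first.

```php
<?php
// Trimmed sample; in practice use simplexml_load_file('so_feed.xml')
$xmlString = <<<XML
<ponudba podjetje="SO d.o.o.">
    <artikel koda="LS593EAR" naziv="HP ENVY 17-2199e" kategorija="Prenosniki">
        <opis>
        HP ENVY 17-2199el, Intel Core i7-2630QM (2.0 GHz)
        </opis>
        <cena_v_EUR>1.049,00</cena_v_EUR>
    </artikel>
</ponudba>
XML;

// Columns filled from attributes of <artikel/>
$attribute_columns = array('koda', 'naziv', 'kategorija');
// Columns filled from child elements of <artikel/>
$node_columns = array('opis', 'cena_v_EUR');

// The order here determines the order in the output CSV file
$columns = array_merge($attribute_columns, $node_columns);
$csv_blueprint = array_fill_keys($columns, '');
$data = array($columns);

$xml = simplexml_load_string($xmlString);

foreach ($xml->artikel as $item) {
    // Start every row from the empty blueprint
    $row = $csv_blueprint;

    foreach ($item->attributes() as $key => $value) {
        // Only keep attributes that have a column
        if (array_key_exists($key, $row)) {
            $row[$key] = (string) $value;
        }
    }

    foreach ($node_columns as $name) {
        // Child elements are read by name; trim the surrounding whitespace
        if (isset($item->$name)) {
            $row[$name] = trim((string) $item->$name);
        }
    }

    $data[] = array_values($row);
}
```

Casting each node to a string matters here: without the cast the row holds SimpleXMLElement objects rather than plain text. Nodes that contain further elements or repeat, such as `<vec_fotk_artikla/>` with its `<slika/>` children, would need their own handling, for example joining the `href` values into one field.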