Saving an XML file with UTF-8 encoding

Time: 2014-12-27 16:18:25

Tags: php xml encoding utf-8

I want to store data in a UTF-8 encoded XML file, but it does not seem to work correctly. This is what I have so far:

public function createXML($file = 'store.xml', $products){
    if(strpos($file, "xml") === FALSE){
        $file .= ".xml";
    }
    $doc = new DOMDocument('1.0', 'utf-8'); 
    $doc->formatOutput = true; 
    $r = $doc->createElement( "Products" ); 
    $doc->appendChild( $r ); 

    foreach( $products as $product ) 
    {
        $b = $doc->createElement( "Product" ); 
        foreach($product as $key => $value){ 
            if($value !== "Picture"){
                $node = $doc->createElement($key); 
                $node->appendChild($doc->createTextNode((utf8_encode(trim($value))))); 
                $b->appendChild( $node );
            }else{
                $pictures = $doc->createElement("Picuters");
                foreach($value as $pic){
                    $node = $doc->createElement("Picture"); 
                    $node->appendChild($doc->createTextNode((utf8_encode(trim($pic)))));
                    $pictures->appendChild($node);
                }
                $b->appendChild($pictures);
            }
        }
        $r->appendChild( $b );

    } 
    $doc->save($file);
}

But it does not save the data the way I want.

The data in the file looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Products>
  <Product>
    <Brand>Milla by trendyol</Brand>
    <ProductCode>Bluz</ProductCode>
    <ProductName>Güpür Detaylı Bordo</ProductName>
    <ProductURL>http://www.trendyol.com/Gupur-Detayli-Bordo-Bluz/UrunDetay/29920/8562520</ProductURL>
    <ProductStatus>Yes</ProductStatus>
    <Category>Bluz</Category>
    <Gender>Kadın</Gender>
    <OldPrice>69.99</OldPrice>
    <Unit>TL</Unit>
    <NewPrice>49.99</NewPrice>
    <Picture>http://www.trendyol.com/http://s.trendyol.com/Assets/ProductImages/29043/T00400SV6A001_1_org.jpg</Picture>
    <Tags>Güpür Detaylı Bordo, Güpür, Detaylı, Bordo, Butik,Kadin,Luks &amp; Tasarim,Ayakkabi &amp; canta,Milla by trendyol,Women</Tags>
    <EndDate>29.12.2014 22:00:00</EndDate>
  </Product>
</Products>

For example, the Gender field comes out as

<Gender>Kadın</Gender>

but it should look like

<Gender>Kadïn</Gender>

The same happens with the other fields.

Please help.

Thanks.

1 Answer:

Answer 0: (score: 4)

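Make sure your input data is not already encoded as UTF-8, because if it is, you are double-encoding it by calling utf8_encode(). If you expect to encounter strings encoded as UTF-8 as well as strings in other character sets (ISO-8859-9, I guess), then I think it would be better to replace utf8_encode() with a function like this: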

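function encode_to_utf8_if_needed($string)
{
    // Strict mode (third argument) only reports an encoding in which
    // the whole string is valid.
    $encoding = mb_detect_encoding($string, 'UTF-8, ISO-8859-9, ISO-8859-1', true);
    // Convert only when an encoding was detected and it is not already UTF-8.
    if ($encoding !== false && $encoding !== 'UTF-8') {
        $string = mb_convert_encoding($string, 'UTF-8', $encoding);
    }
    return $string;
}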
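
To show how a conversion step like this could slot into the question's DOMDocument code, here is a minimal sketch. The helper name to_utf8_assuming_latin5 is hypothetical, and it uses a different, simpler check than the function above: mb_check_encoding() validates the string as UTF-8, and anything that fails is assumed to be ISO-8859-9. That assumption only holds if the non-UTF-8 input really is Turkish text in that charset.

```php
<?php
// Hypothetical helper (not part of the original answer): leave valid UTF-8
// untouched, otherwise ASSUME the bytes are ISO-8859-9 and convert them.
function to_utf8_assuming_latin5(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s;
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-9');
}

$doc = new DOMDocument('1.0', 'utf-8');
$root = $doc->createElement('Products');
$doc->appendChild($root);

// "Kad\xFDn" is "Kadın" in ISO-8859-9 (0xFD is the dotless i).
$gender = $doc->createElement('Gender');
$gender->appendChild($doc->createTextNode(to_utf8_assuming_latin5("Kad\xFDn")));
$root->appendChild($gender);

// saveXML() returns the document as a string; save($file) would write it to disk.
$xml = $doc->saveXML();
echo $xml;
```

Because DOMDocument expects UTF-8 strings for text nodes, the conversion is done once, before createTextNode(), and the text is never passed through utf8_encode() afterwards.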

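As the documentation states, the function utf8_encode() encodes an ISO-8859-1 string to UTF-8. It does not produce the desired result for strings that are already encoded as UTF-8 or that use a different character set.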
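
The double-encoding effect can be reproduced in a few lines. This is a sketch that assumes the input word was already UTF-8, as the output in the question suggests. It uses mb_convert_encoding() from ISO-8859-1 in place of utf8_encode(), which performs the same conversion and avoids the deprecation notice that newer PHP versions emit for utf8_encode().

```php
<?php
// "Kadın" as UTF-8 bytes: the dotless i (U+0131) is the two bytes C4 B1.
$utf8 = "Kad\xC4\xB1n";

// Same conversion utf8_encode() performs: every byte is treated as one
// ISO-8859-1 character and re-encoded as UTF-8.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // 4b6164c4b16e
echo bin2hex($double), "\n"; // 4b6164c384c2b16e
echo $double, "\n";          // Kadın

// Converting back undoes exactly one round of double encoding.
$restored = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
var_dump($restored === $utf8); // bool(true)
```

The bytes C4 and B1 are read as the ISO-8859-1 characters Ä and ±, which is exactly the garbled text shown in the saved XML file.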