Question

In a browser, the URL http://kulturarvsdata.se/raa/fmi/xml/10028201230001 is displayed as a regular XML file. But when I use

file_get_contents('http://kulturarvsdata.se/raa/fmi/xml/10028201230001');

it strips all the XML tags and returns only the text they contain. Why does this happen, and how can I avoid it?

Some of the response headers:

array(5) { 
    [0]=> string(15) "HTTP/1.1 200 OK" 
    [1]=> string(35) "Date: Thu, 01 Jan 2015 20:07:04 GMT" 
    [2]=> string(25) "Server: Apache-Coyote/1.1" 
    [3]=> string(43) "Content-Type: application/xml;charset=UTF-8" 
    [4]=> string(17) "Connection: close" 
}
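
The headers above already say the server is sending XML (Content-Type: application/xml). As a minimal sketch, assuming the list came from something like $http_response_header or get_headers(), a small helper can look up a header by name. The function name findHeader is hypothetical, and the array simply repeats the values shown above:

```php
<?php
// Hypothetical helper: return the value of the first header whose name
// matches $name (case-insensitive), or null if there is none.
function findHeader(array $headers, string $name): ?string
{
    foreach ($headers as $line) {
        // Split "Name: value" at the first colon only; the status line has no colon.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// Header lines copied from the dump in the question.
$headers = [
    'HTTP/1.1 200 OK',
    'Date: Thu, 01 Jan 2015 20:07:04 GMT',
    'Server: Apache-Coyote/1.1',
    'Content-Type: application/xml;charset=UTF-8',
    'Connection: close',
];

echo findHeader($headers, 'content-type'), "\n";
```

If the Content-Type is an XML type, the body returned by file_get_contents() should still contain the tags, whatever the browser later displays.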

Answer 1

Cannot reproduce:

<?php
/**
 * http://stackoverflow.com/questions/27733997/file-get-contents-removes-xml-tags
 */

header('Content-Type: text/plain; charset=utf-8');
echo substr(file_get_contents('http://kulturarvsdata.se/raa/fmi/xml/10028201230001'), 0, 256);

<?xml version="1.0" encoding="UTF-8"?><pres:item xmlns:pres="http://kulturarvsdata.se/presentation#"><pres:id>10028201230001</pres:id><pres:entityUri>http://kulturarvsdata.se/raa/fmi/10028201230001</pres:entityUri><pres:type>Kulturlämning</pres:type><pres

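The answer is: it works fine.

Perhaps you are looking at the response in your browser, which hides the tags? That, at least, is a common mistake among users asking on Stack Overflow.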
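
When the fetched XML is echoed into a page served as text/html, the browser treats the tags as unknown markup and shows only the text inside them. Here is a minimal sketch of making the markup visible by escaping it. It uses a shortened sample string rather than a live request, and the sample and variable names are illustrative assumptions:

```php
<?php
// Shortened sample of the response body; a real script would get this
// from file_get_contents() on the URL in the question.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<pres:item xmlns:pres="http://kulturarvsdata.se/presentation#">'
     . '<pres:id>10028201230001</pres:id>'
     . '</pres:item>';

// Escape <, >, & and quotes so a browser rendering HTML shows the tags as text.
$visible = htmlspecialchars($xml, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo '<pre>', $visible, '</pre>', "\n";
```

The other option, used in the script above, is to send a Content-Type: text/plain header so the browser does not interpret the output as HTML at all.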

Answer 2

Please try this:

simplexml_load_file - Interprets an XML file into an object

<?php
// The file test.xml contains an XML document with a root element
// and at least an element /[root]/title.

if (file_exists('test.xml')) {
    $xml = simplexml_load_file('test.xml');

    print_r($xml);
} else {
    exit('Failed to open test.xml.');
}
?>
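
The same idea can be applied to a string that has already been fetched. Because the document in the question uses the pres: namespace prefix, child elements have to be accessed through that namespace. This is a minimal sketch using a shortened sample of the response rather than a live request. The sample string and variable names are assumptions, and the file is assumed to be saved as UTF-8:

```php
<?php
// Shortened sample of the XML from the question.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<pres:item xmlns:pres="http://kulturarvsdata.se/presentation#">'
     . '<pres:id>10028201230001</pres:id>'
     . '<pres:type>Kulturlämning</pres:type>'
     . '</pres:item>';

$item = simplexml_load_string($xml);
if ($item === false) {
    exit('Failed to parse XML.');
}

// Select the children that belong to the presentation namespace.
$pres = $item->children('http://kulturarvsdata.se/presentation#');

echo (string) $pres->id, "\n";   // text content of <pres:id>
echo (string) $pres->type, "\n"; // text content of <pres:type>
```

Calling $item->asXML() returns the markup again as a string, which is a quick way to confirm that the tags were never lost.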

Answer 3

I had the same problem and solved it by changing how I fetch the data: instead of file_get_contents, I used curl:

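// Placeholder values: the original answer did not show how these were set.
$url = 'http://kulturarvsdata.se/raa/fmi/xml/10028201230001';
$user_agent = 'Mozilla/5.0 (compatible; ExampleClient/1.0)';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_HEADER, 0);            // do not include response headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);    // return the body as a string instead of printing it
curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
$result = curl_exec($ch);
curl_close($ch);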

file_get_contents() removes XML tags
