Question

Dear reader,

I am trying to retrieve data from XML that I fetched from PubMed. The XML looks like this:

<summa>
    <DocS>
        <Id>1</Id>
        <Item Name="PubDate" Type="Date">1999</Item>
        <Item Name="EPubDate" Type="Date"/>    //<- notice the open tag
        <Item Name="Source" Type="String">source a</Item>
        <Item Name="AuthorList" Type="List">
            <Item Name="Author" Type="String">a</Item>
            <Item Name="Author" Type="String">b</Item>
        </Item>
    </DocS>
    <DocS>
        <Id>2</Id>
        <Item Name="PubDate" Type="Date">1781</Item>
        <Item Name="EPubDate" Type="Date"/></Item> //<- notice the closed tag
        <Item Name="Source" Type="String">source a</Item>
        <Item Name="AuthorList" Type="List">
            <Item Name="Author" Type="String">a</Item>
            <Item Name="Author" Type="String">b</Item>
            <Item Name="Author" Type="String">c</Item>
            <Item Name="Author" Type="String">d</Item>
        </Item>
    </DocS>
</summa>

The length of the document varies, but it always starts with this structure:

<summa>
    <DocS>
        <Id>1</Id>
        <Item Name="PubDate" Type="Date">1999</Item>

The data I specifically need is

<Item Name="PubDate" Type="Date">data needed </Item>

The code below is what I am trying, and it does not work. Can anyone help me?

$pmid_all=file_get_contents($url_id);

$p=simplexml_load_string($pmid_all);

$result = $p->xpath('/item');

while(list( , $node) = each($result)) {
    echo 'item: ',$node,"\n";
}

Answer 1

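You are querying for item elements at the root level (/item). XPath element names are case-sensitive, so try replacing your XPath query with /summa/DocS/Item.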

Edit: your XML is also not well-formed: <Item Name="EPubDate" Type="Date"/></Item>

Remove either the / or the </Item>.

After fixing that, this works for me:

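// Load the cleaned-up XML file and parse it.
$pmid_all = file_get_contents("foo.xml");
$p = simplexml_load_string($pmid_all);

// Element names in the path must match the document's case exactly.
$result = $p->xpath('/summa/DocS/Item');

foreach ($result as $node) {
    echo 'item: ', $node, "\n";
}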

In response to the comment below: to grab the first Item element in each DocS element:

$pmid_all = file_get_contents("foo.xml");
$p = simplexml_load_string($pmid_all);

// Iterate over each DocS element, then query its Item children relative to it.
$result = $p->xpath('/summa/DocS');

foreach ($result as $node) {
    $items = $node->xpath("Item");
    echo 'item: ', $items[0], "\n"; // $items[0] is the first Item found, $items[1] the second, etc.
}
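
Since the question asks specifically for the PubDate value, relying on it being the first Item may be fragile. One possible alternative, sketched below, is to select by the Name attribute with an XPath predicate. The inline XML is a trimmed, well-formed copy of the question's sample (with the EPubDate tag corrected), and the variable name $pubDates is only illustrative.

```php
<?php
// Trimmed, well-formed copy of the sample from the question (assumption: EPubDate is self-closing).
$xml = <<<XML
<summa>
    <DocS>
        <Id>1</Id>
        <Item Name="PubDate" Type="Date">1999</Item>
        <Item Name="EPubDate" Type="Date"/>
        <Item Name="Source" Type="String">source a</Item>
    </DocS>
    <DocS>
        <Id>2</Id>
        <Item Name="PubDate" Type="Date">1781</Item>
        <Item Name="EPubDate" Type="Date"/>
        <Item Name="Source" Type="String">source a</Item>
    </DocS>
</summa>
XML;

$p = simplexml_load_string($xml);

// Select only the Item elements whose Name attribute is "PubDate".
$pubDates = [];
foreach ($p->xpath('/summa/DocS/Item[@Name="PubDate"]') as $node) {
    $pubDates[] = (string) $node; // cast the SimpleXMLElement to its text content
}

foreach ($pubDates as $date) {
    echo 'PubDate: ', $date, "\n";
}
```

With this approach the query should keep working even if the order of the Item elements inside a DocS changes.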

Answer 2

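Your XML needs to be cleaned up first. Some tags are closed twice, some are never closed... You will not be able to parse XML that is malformed like this.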
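
To see where the parser gives up, it may help to collect libxml's errors instead of letting them surface as PHP warnings. The following is a minimal sketch, assuming a one-line fragment modeled on the doubly closed EPubDate item from the question; the variable names $broken and $messages are only illustrative.

```php
<?php
// Fragment modeled on the question's sample: the EPubDate item is closed twice.
$broken = '<summa><DocS><Item Name="EPubDate" Type="Date"/></Item></DocS></summa>';

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// simplexml_load_string() returns false when the input cannot be parsed.
$p = simplexml_load_string($broken);

$messages = [];
if ($p === false) {
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError object with line and message properties.
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
}

foreach ($messages as $message) {
    echo $message, "\n";
}
```

The reported line numbers point at the offending tags, which can then be fixed before running any XPath query.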

Getting data from a SimpleXML array
