I have the following HTML:
<table id="metadata_content_table"><tr class="metadata_row"><td
class="metadata_label">Title</td><td class="metadata_value"><span dir=ltr>PHP
Reference: Beginner to Intermediate PHP5</span></td></tr><tr
class="metadata_row"><td class="metadata_label"><span
dir=ltr>Author</span></td><td class="metadata_value"><a class="primary"
href="https://www.google.com/search?tbo=p&tbm=bks&q=inauthor:%22Mario+Lurig%22&source=gbs_metadata_r&cad=6"><span
dir=ltr>Mario Lurig</span></a></td></tr><tr class="metadata_row"><td
class="metadata_label"><span dir=ltr>Publisher</span></td><td
class="metadata_value"><span dir=ltr>Mario Lurig, 2008</span></td></tr><tr
class="metadata_row"><td class="metadata_label"><span
dir=ltr>ISBN</span></td><td class="metadata_value"><span dir=ltr>143571590X,
9781435715905</span></td></tr><tr class="metadata_row"><td
class="metadata_label"><span dir=ltr>Length</span></td><td
class="metadata_value"><span dir=ltr>164 pages</span></td></tr><tr
class="metadata_row"><td class="metadata_label"><span
dir=ltr>Subjects</span></td><td class="metadata_value"><div
style="display:inline" itemscope
itemtype="http://data-vocabulary.org/Breadcrumb"><a class="primary"
href="https://www.google.com/search?tbo=p&tbm=bks&q=subject:%22Computers%22"
itemprop="url" dir=ltr><span
itemprop="title">Computers</span></a></div> › <div
style="display:inline" itemscope
itemtype="http://data-vocabulary.org/Breadcrumb"><a class="primary"
href="https://www.google.com/search?tbo=p&tbm=bks&q=subject:%22Computers+Programming%22"
itemprop="url" dir=ltr><span
itemprop="title">Programming</span></a></div> › <div
style="display:inline" itemscope
itemtype="http://data-vocabulary.org/Breadcrumb"><a class="primary"
href="https://www.google.com/search?tbo=p&tbm=bks&q=subject:%22Computers+Programming+Object+Oriented%22"
itemprop="url" dir=ltr><span itemprop="title">Object
Oriented</span></a></div><br><br><a class="primary"
href="https://www.google.com/search?tbo=p&tbm=bks&q=subject:%22Computers+/+General%22&source=gbs_metadata_r&cad=6"><span
dir=ltr>Computers / General</span></a><br/><a class="primary"
href="https://www.google.com/search?tbo=p&tbm=bks&q=subject:%22Computers+/+Programming+/+Object+Oriented%22&source=gbs_metadata_r&cad=6"><span
dir=ltr>Computers / Programming / Object Oriented</span></a></td></tr><tr
class="metadata_row"><td> </td><td> </td></tr><tr
class="metadata_row"><td class="metadata_label"><span dir=ltr>Export
Citation</span></td><td class="metadata_value"><a class="gb-button "
href="https://books.google.com/books/download/PHP_Reference_Beginner_to_Intermediate_P.bibtex?id=noi76uKOJ5wC&output=bibtex"><span
dir=ltr>BiBTeX</span></a> <a class="gb-button "
href="https://books.google.com/books/download/PHP_Reference_Beginner_to_Intermediate_P.enw?id=noi76uKOJ5wC&output=enw"><span
dir=ltr>EndNote</span></a> <a class="gb-button "
href="https://books.google.com/books/download/PHP_Reference_Beginner_to_Intermediate_P.ris?id=noi76uKOJ5wC&output=ris"><span
dir=ltr>RefMan</span></a></td></tr></table>
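As you can see, the HTML is not well-formed XML. For example, the attribute values of the span elements are not quoted.
When I try to parse the HTML above as XML: $ob = simplexml_load_string($xml); the function fails (it returns FALSE).
My goal is to convert the HTML into an associative array. With well-formed XML, the following code works: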
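// Parse the XML string into a SimpleXMLElement.
$ob = simplexml_load_string($xml);
// Encode the parsed object (not the raw string), then decode it into an associative array.
$json = json_encode($ob);
$configData = json_decode($json, true);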
How can I achieve the same result with HTML that is not well-formed?
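
One possible direction, offered as a minimal sketch rather than a definitive answer: PHP's DOMDocument::loadHTML() uses libxml's lenient HTML parser, which generally accepts unquoted attribute values, and simplexml_import_dom() can turn the resulting tree into a SimpleXMLElement so the same JSON round trip applies. The function name htmlToArray is hypothetical, and the encoding hint assumes the input is UTF-8. Note that the JSON round trip can be lossy for elements that carry both attributes and text.

```php
<?php
// Hypothetical helper (not part of the original post): parse lenient HTML,
// then reuse the SimpleXML -> JSON -> array round trip from the question.
function htmlToArray(string $html): array
{
    // Collect libxml warnings internally; malformed HTML tends to produce many.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // loadHTML() uses the HTML parser, which tolerates markup that is not valid XML.
    // Assumption: the input is UTF-8; the prefix hints that encoding to libxml.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);

    // Discard the collected warnings and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($dom->documentElement === null) {
        return [];
    }

    // Convert the repaired DOM tree into a SimpleXMLElement.
    $ob = simplexml_import_dom($dom->documentElement);
    if (!$ob instanceof SimpleXMLElement) {
        return [];
    }

    // Same round trip as in the question: object -> JSON -> associative array.
    $data = json_decode(json_encode($ob), true);

    return is_array($data) ? $data : [];
}
```

Usage would then mirror the original snippet: $configData = htmlToArray($xml);. Because the parser wraps a fragment in html and body elements, the table ends up nested under those keys in the resulting array.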