OK, I'm trying to build an XML feed from this HTML table using PHP Simple HTML DOM Parser.
<table>
<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>
<tr><td>Team 1</td> <td>vs</td> <td>Team 7</td> <td>3:00 pm</td></tr>
<tr><td>Team 2</td> <td>vs</td> <td>Team 12</td> <td>3:00 pm</td></tr>
<tr><td>Team 3</td> <td>vs</td> <td>Team 8</td> <td>3:00 pm</td></tr>
<tr><td>Team 4</td> <td>vs</td> <td>Team 10</td> <td>3:00 pm</td></tr>
<tr><td>Team 5</td> <td>vs</td> <td>Team 11</td> <td>3:00 pm</td></tr>
<tr><td colspan="5"><strong>Monday October 17 2011</strong></td></tr>
<tr><td>Team 6</td> <td>vs</td> <td>Team 9</td> <td>7:45 pm</td></tr>
<tr><td colspan="5"><strong>Saturday October 22 2011</strong></td></tr>
<tr><td>Team 7</td> <td>vs</td> <td>Team 12</td> <td>3:00 pm</td></tr>
<tr><td>Team 1</td> <td>vs</td> <td>Team 2</td> <td>3:00 pm</td></tr>
<tr><td>Team 8</td> <td>vs</td> <td>Team 4</td> <td>3:00 pm</td></tr>
<tr><td>Team 3</td> <td>vs</td> <td>Team 6</td> <td>3:00 pm</td></tr>
<tr><td>Team 9</td> <td>vs</td> <td>Team 5</td> <td>3:00 pm</td></tr>
<tr><td>Team 10</td> <td>vs</td> <td>Team 11</td> <td>3:00 pm</td></tr>
</table>
My goal is to extract each date and then the rows that follow it, up to the next date, so that I can build one XML node per date:
<matchday date="Saturday October 15 2011">
<fixture>
<hometeam>Team 1</hometeam>
<awayteam>Team 7</awayteam>
<kickoff>3:00 pm</kickoff>
</fixture>
<fixture>
<hometeam>Team 2</hometeam>
<awayteam>Team 12</awayteam>
<kickoff>3:00 pm</kickoff>
</fixture>
</matchday>
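So far I can find every date in the HTML and build an XML node for each one:
$dateNodes = $html->find('table tr td[colspan="5"] strong');
foreach ($dateNodes as $date) {
    // "date" matches the attribute name used in the target XML above
    echo '<matchday date="' . trim($date->innertext) . '">';
    // FIXTURES
    // END FIXTURES
    echo '</matchday>';
}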
How would I get the team names etc. for each fixture, up to the next matchday?
Answer 0 (score: 2)
Instead of SimpleHtmlDom (which I believe is a craptaculous library), you can use an XSLT transformation and PHP's native XSLT processor:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" method="xml"/>
    <xsl:template match="/">
        <matchdays>
            <xsl:for-each select="table/tr[td[@colspan=5]]">
                <matchday>
                    <xsl:attribute name="date">
                        <xsl:value-of select="td/strong"/>
                    </xsl:attribute>
                    <xsl:for-each select="following-sibling::tr[
                        not(td[@colspan]) and
                        preceding-sibling::tr[td[@colspan]][1] = current()
                    ]">
                        <fixture>
                            <hometeam><xsl:value-of select="td[1]"/></hometeam>
                            <awayteam><xsl:value-of select="td[3]"/></awayteam>
                            <kickoff><xsl:value-of select="td[4]"/></kickoff>
                        </fixture>
                    </xsl:for-each>
                </matchday>
            </xsl:for-each>
        </matchdays>
    </xsl:template>
</xsl:stylesheet>
Then just use the code given in the example at http://php.net/manual/en/xsltprocessor.transformtoxml.php to transform the HTML to XML:
$xml = new DOMDocument;
// load() uses the XML parser, so the source table must be well-formed XML
$xml->load('YourSourceFile.xml');
$xsl = new DOMDocument;
$xsl->load('YourStyleSheet.xsl');
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
echo $proc->transformToXML($xml);
As an alternative to XSLT, you can also use PHP's native DOM extension:
$xml = new DOMDocument;
$xml->loadHTMLFile('YourHtmlFile.xml');
$xp = new DOMXPath($xml);
// Output document; the version string must be '1.0'
$new = new DOMDocument('1.0', 'utf-8');
$new->appendChild($new->createElement('matchdays'));
// One <matchday> per date row
foreach ($xp->query('//table/tr/td[@colspan=5]/strong') as $gameDate) {
    $matchDay = $new->createElement('matchday');
    $matchDay->setAttribute('date', $gameDate->nodeValue);
    // Fixture rows whose nearest preceding date row carries this date
    foreach ($xp->query(
        sprintf(
            '//tr[
                not(td[@colspan]) and
                preceding-sibling::tr[td[@colspan]][1]/td/strong/text() = "%s"
            ]',
            $gameDate->nodeValue
        )
    ) as $gameData) {
        $tds = $gameData->getElementsByTagName('td');
        $fixture = $matchDay->appendChild($new->createElement('fixture'));
        $fixture->appendChild($new->createElement(
            'hometeam', $tds->item(0)->nodeValue)
        );
        $fixture->appendChild($new->createElement(
            'awayteam', $tds->item(2)->nodeValue)
        );
        $fixture->appendChild($new->createElement(
            'kickoff', $tds->item(3)->nodeValue)
        );
    }
    $new->documentElement->appendChild($matchDay);
}
$new->formatOutput = true;
echo $new->saveXML();
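The DOM version above runs one XPath query per date and matches fixture rows by the date's text, so two matchdays with identical date strings would share fixtures. A possible alternative is to walk the rows once and start a new matchday whenever a date row appears. The following is only a sketch under some assumptions: the function name buildMatchdays() is hypothetical, the inline sample is a shortened copy of the table from the question, and date rows are recognised simply by a colspan attribute on their first cell.

```php
<?php
// Assumed sample input: a shortened copy of the table from the question.
$html = <<<HTML
<table>
<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>
<tr><td>Team 1</td> <td>vs</td> <td>Team 7</td> <td>3:00 pm</td></tr>
<tr><td>Team 2</td> <td>vs</td> <td>Team 12</td> <td>3:00 pm</td></tr>
<tr><td colspan="5"><strong>Monday October 17 2011</strong></td></tr>
<tr><td>Team 6</td> <td>vs</td> <td>Team 9</td> <td>7:45 pm</td></tr>
</table>
HTML;

// buildMatchdays() is a hypothetical helper name, not part of any library.
function buildMatchdays(string $html): DOMDocument
{
    $src = new DOMDocument;
    $src->loadHTML($html);

    $out = new DOMDocument('1.0', 'utf-8');
    $out->formatOutput = true;
    $root = $out->appendChild($out->createElement('matchdays'));

    $matchDay = null; // the <matchday> element currently being filled
    foreach ((new DOMXPath($src))->query('//table//tr') as $row) {
        $tds = $row->getElementsByTagName('td');
        if ($tds->length === 0) {
            continue; // row without cells
        }
        if ($tds->item(0)->hasAttribute('colspan')) {
            // A date row starts a new matchday.
            $matchDay = $root->appendChild($out->createElement('matchday'));
            $matchDay->setAttribute('date', trim($tds->item(0)->textContent));
            continue;
        }
        if ($matchDay === null || $tds->length < 4) {
            continue; // fixture row before any date, or an unexpected layout
        }
        $fixture = $matchDay->appendChild($out->createElement('fixture'));
        // Column positions follow the table in the question: home, "vs", away, time.
        foreach (['hometeam' => 0, 'awayteam' => 2, 'kickoff' => 3] as $name => $i) {
            $el = $fixture->appendChild($out->createElement($name));
            // createTextNode() escapes characters such as "&" in team names.
            $el->appendChild($out->createTextNode(trim($tds->item($i)->textContent)));
        }
    }
    return $out;
}

$doc = buildMatchdays($html);
echo $doc->saveXML();
```

Because each fixture is attached to whichever matchday was opened last, the grouping depends only on row order, not on the date text being unique.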