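PHP - Convert a DOMDocument nodeValue to a string or an array?

Date: 2014-01-10 08:06:00

Tags: php

I want to take the text from a table in an HTML file, convert it to a string, and put the values into an array, but...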

$doc=new DOMDocument();
$doc->loadHTMLFile('table.html');
$table=$doc->getElementsByTagName('table');
$s=$table->item(0)->nodeValue;
echo $s; // this works: I get a string
$arr=explode(' ', $s); // I split the string into an array, but...
echo "<br>";
echo count($arr); // why does the exploded array have 1917 elements?
echo "<pre>";
print_r($arr);  // and why are so many of the elements just whitespace?
echo "</pre>";

How can I remove the whitespace between the elements of the array? Is there another way to do this? I want the numbers from the string in an array, like this: $arr[0] = 1.85, $arr[1] = 1.84, $arr[2] = 1.75, ...

Here is the table.html file: https://app.box.com/s/1rwuk6daujgkxrwg4z4b

2 Answers:

Answer 0: (score: 1)

If you only need the values of the anchor tags in the first table, try this:

$doc=new DOMDocument();
$doc->loadHTMLFile('table.html');
$table=$doc->getElementsByTagName('table');
$tableDom = $table->item(0);
$result = array(); // initialise so print_r() works even if nothing matches
foreach($tableDom->getElementsByTagName('a') as $t)
{
    // keep only the anchors whose text is a number
    if(is_numeric($t->nodeValue))
    {
        $result[]= $t->nodeValue;
    }
}
print_r($result);

Output:

Array ( [0] => 1.85 [1] => 1.84 [2] => 1.75 [3] => 1.74 [4] => 2.05 [5] => 2.09 [6] => 2.21 [7] => 2.25 ) 
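
Since the linked table.html is not reproduced in this thread, here is a minimal self-contained sketch of the same idea. The inline `$html` string is a made-up stand-in for the real file, and the `trim()` call is an extra precaution that is not in the code above; treat both as assumptions.

```php
<?php
// Hypothetical stand-in for table.html; the real file is not shown here.
$html = '<table>
  <tr><td><a href="#">1.85</a></td><td><a href="#">1.84</a></td></tr>
  <tr><td><a href="#">Home</a></td><td><a href="#">1.75</a></td></tr>
</table>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$result = [];
$table = $doc->getElementsByTagName('table')->item(0);
foreach ($table->getElementsByTagName('a') as $a) {
    // Strip surrounding whitespace before testing whether the text is a number.
    $text = trim($a->nodeValue);
    if (is_numeric($text)) {
        $result[] = $text;
    }
}

print_r($result);
```

The non-numeric link text ("Home" in this sample) is skipped, so only the numeric strings end up in `$result`.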

Option 2: If you need all the numeric values in the string, try the following:

$doc=new DOMDocument();
$doc->loadHTMLFile('table.html');
$table=$doc->getElementsByTagName('table');
$s=$table->item(0)->nodeValue;

// Match optionally signed integers or decimals, e.g. -0.25 or 1.85
preg_match_all('/[+-]?\d+(?:\.\d+)?/', $s, $matches);
echo "<pre>";
print_r($matches[0]);
echo "</pre>";

Output:

 Array ( [0] => -0.25 [1] => 1.85 [2] => 1.84 [3] => 1.75 [4] => 1.74 [5] => 2.05 [6] => 2.09 [7] => 2.21 [8] => 2.25 ) 
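
As for why the original explode(' ', $s) produced so many elements: nodeValue returns the text of every descendant node joined together, including the indentation and line breaks between the tags, and explode() on a single space emits an empty string for each consecutive space. A small sketch of one way around that, splitting on runs of whitespace instead. The sample string `$s` below is invented for illustration and only approximates what the real table might yield.

```php
<?php
// Hypothetical sample of what nodeValue might return for a table.
$s = "\n   1.85 \n\n    1.84\t 1.75  \n";

// Splitting on a single space leaves an empty element per repeated space.
$naive = explode(' ', $s);

// Splitting on runs of any whitespace and dropping empty pieces gives clean tokens.
$tokens = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);

// Keep only the tokens that look like numbers.
$numbers = array_values(array_filter($tokens, 'is_numeric'));

echo count($naive), " elements from explode()\n";
print_r($numbers);
```

This only cleans up the whitespace; if the table also contains non-numeric text, the is_numeric filter is what removes it.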

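Answer 1: (score: 0)

Actually, you are going about it the wrong way: you need to loop over the anchor tags and discard the non-numeric values...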

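<?php
$doc=new DOMDocument();
$doc->loadHTMLFile('table.html');
$arr=array(); // initialise so print_r() works even if nothing matches
foreach($doc->getElementsByTagName('a') as $tag)
{
    // keep only the anchors whose text is a number
    if(is_numeric($tag->nodeValue))
    {
        $arr[]= $tag->nodeValue;
    }
}

echo "<pre>";
print_r($arr);
echo "</pre>";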

Output:

Array
(
    [0] => 1.85
    [1] => 1.84
    [2] => 1.75
    [3] => 1.74
    [4] => 2.05
    [5] => 2.09
    [6] => 2.21
    [7] => 2.25
)
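
Note that the loop above visits every anchor in the document, not just those in the first table, and it stores the values as strings. If that matters, one possible variation (a sketch, not part of the answer) is to select the anchors with DOMXPath and cast each value to float. The inline `$html` is a hypothetical stand-in for table.html.

```php
<?php
// Hypothetical markup with two tables; only the first one should be read.
$html = '<table><tr>
  <td><a href="#">1.85</a></td><td><a href="#">n/a</a></td><td><a href="#">1.84</a></td>
</tr></table>
<table><tr><td><a href="#">9.99</a></td></tr></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$values = [];
// (//table)[1] selects the first table in document order; //a selects its anchors.
foreach ($xpath->query('(//table)[1]//a') as $a) {
    $text = trim($a->textContent);
    if (is_numeric($text)) {
        $values[] = (float) $text; // store real numbers rather than strings
    }
}

var_dump($values);
```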