I want to convert the text in a table from an HTML file to a string and add the values to an array, but...
$doc=new DOMDocument();
$doc->loadHTMLFile('table.html');
$table=$doc->getElementsByTagName('table');
$s=$table->item(0)->nodeValue;
echo $s; // this works; I get the string
$arr=explode(' ', $s); // split the string into an array, but...
echo "<br>";
echo count($arr); // why does the exploded string have 1917 elements?
echo "<pre>";
print_r($arr); // and why are so many of the elements just whitespace?
echo "</pre>";
How can I remove the whitespace elements from the array? Is there another way to do this? I want an array of the numbers in the string, like this: $arr[0] = 1.85, $arr[1] = 1.84, $arr[2] = 1.75, ...
Here is the table.html file: https://app.box.com/s/1rwuk6daujgkxrwg4z4b
Answer 0 (score: 1)
If you only need the anchor tag values from the first table, try the following:
$doc = new DOMDocument();
$doc->loadHTMLFile('table.html');
$table = $doc->getElementsByTagName('table');
$tableDom = $table->item(0); // first table in the document
$result = [];
foreach ($tableDom->getElementsByTagName('a') as $t) {
    // keep only the anchors whose text is a number
    if (is_numeric($t->nodeValue)) {
        $result[] = $t->nodeValue;
    }
}
print_r($result);
Output:
Array ( [0] => 1.85 [1] => 1.84 [2] => 1.75 [3] => 1.74 [4] => 2.05 [5] => 2.09 [6] => 2.21 [7] => 2.25 )
Option 2: If you need all the numeric values in the string, try the following:
$doc = new DOMDocument();
$doc->loadHTMLFile('table.html');
$table = $doc->getElementsByTagName('table');
$s = $table->item(0)->nodeValue;
// match optionally signed integers or decimals, e.g. -0.25 or 1.85
preg_match_all('/[+-]?\d+(?:\.\d+)?/', $s, $matches);
echo "<pre>";
print_r($matches[0]);
echo "</pre>";
Output:
Array ( [0] => -0.25 [1] => 1.85 [2] => 1.84 [3] => 1.75 [4] => 1.74 [5] => 2.05 [6] => 2.09 [7] => 2.21 [8] => 2.25 )
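As for why explode(' ', $s) gave 1917 elements: explode splits on every single space, so each run of consecutive spaces produces empty strings, and newlines or tabs are not treated as separators at all. Below is a minimal sketch of splitting on any run of whitespace instead. The sample string $s is made up for illustration and only stands in for the real table text.

```php
<?php
// Hypothetical sample standing in for the table's nodeValue.
$s = "  1.85   1.84\n\t1.75  ";

// explode() splits on each single space, so runs of spaces yield empty strings
// and the newline/tab stay glued to the neighbouring values.
$naive = explode(' ', $s);

// preg_split() on \s+ treats any run of whitespace as one separator;
// PREG_SPLIT_NO_EMPTY drops the empty pieces at the start and end.
$tokens = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);

print_r($tokens);
```

If the table also contains non-numeric text, the tokens can then be filtered with array_filter($tokens, 'is_numeric').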
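Answer 1 (score: 0)
Actually, you are going about this the wrong way. You need to loop over the anchor tags and discard the non-numeric values: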
<?php
$doc = new DOMDocument();
$doc->loadHTMLFile('table.html');
$arr = [];
foreach ($doc->getElementsByTagName('a') as $tag) {
    // keep only the anchors whose text is a number
    if (is_numeric($tag->nodeValue)) {
        $arr[] = $tag->nodeValue;
    }
}
echo "<pre>";
print_r($arr);
echo "</pre>";
Output:
Array
(
[0] => 1.85
[1] => 1.84
[2] => 1.75
[3] => 1.74
[4] => 2.05
[5] => 2.09
[6] => 2.21
[7] => 2.25
)
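The same anchor loop can also be written with DOMXPath, which lets you restrict the search to the first table in a single query. This is only a sketch under assumptions: the markup in $html is hypothetical (the real table.html is not reproduced here), and the names $html and $odds are made up for the example.

```php
<?php
// Hypothetical markup; the real table.html is not reproduced here.
$html = '<table><tr><td><a href="#">Team A</a></td>'
      . '<td><a href="#">1.85</a></td>'
      . '<td><a href="#"> 1.84 </a></td></tr></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$odds = [];
// (//table)[1]//a selects every anchor inside the first table in the document.
foreach ($xpath->query('(//table)[1]//a') as $a) {
    // trim first so values padded with whitespace still pass is_numeric()
    $text = trim($a->nodeValue);
    if (is_numeric($text)) {
        $odds[] = (float) $text;
    }
}

print_r($odds);
```

Casting to float is optional; drop the cast if you want to keep the values as strings, as in the loop above.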