I am trying to extract data with Simple HTML DOM. It works fine when I pass it an HTML file or a URL:
$html = file_get_html('sv.html');
$pinfo = $html->find('p[plaintext^=EDUCATION AND TRAINING]');
$items3 = array();
foreach ($pinfo as $keypi) {
    // Walk the following siblings until the next section heading.
    while ($keypi->nextSibling()) {
        $keypi = $keypi->nextSibling();
        $varcpi = $keypi->plaintext;
        if (trim($varcpi) == "JOB APPLIED FOR") {
            break;
        }
        $items3[] = $varcpi;
    }
}
// Trim each entry and drop the empty ones.
$trimmedArray = array_map('trim', $items3);
$b = array_values(array_filter($trimmedArray));
var_dump($b);
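The problem appears when I pass the HTML as an array. I am converting a PDF to HTML with PDFTOHTML (a PHP library). In the code below, $page contains the HTML data. If I echo $page, the HTML is displayed correctly, but when I pass it to Simple HTML DOM as an array it does not work: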
include 'vendor/autoload.php';
$pdf = new \TonchikTm\PdfToHtml\Pdf('cv.pdf', [
    'pdftohtml_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdftohtml.exe',
    'pdfinfo_path' => 'C:/wamp64/www/new/poppler-0.51/bin/pdfinfo.exe'
]);
foreach ($pdf->getHtml()->getAllPages() as $page) {
    $my_array[] = $page . '<br/>';
}
//$html = file_get_html('sv.html');
$pinfo = $my_array->find('p[plaintext^=PERSONAL INFORMATION]');
Here $my_array contains the HTML data and prints correctly, but when I pass it to Simple HTML DOM like this:
$pinfo = $my_array->find('p[plaintext^=PERSONAL INFORMATION]');
instead of:
$html = file_get_html('sv.html');
I get this error:
( ! ) Fatal error: Call to a member function find() on array in C:\wamp64\www\new\upload.php on line 56
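My reading of this error, which may be incomplete, is that $my_array is a plain PHP array of page strings and arrays have no methods, so find() cannot be called on it. Below is a minimal sketch of joining the pages into one string first, which is the kind of input str_get_html takes. The two sample page strings are hypothetical stand-ins for what getAllPages() returns, not real converter output:

```php
<?php
// Hypothetical stand-ins for the per-page HTML strings from getAllPages().
$pages = array(
    '<p>PERSONAL INFORMATION</p><p>Name: Jane Doe</p>',
    '<p>JOB APPLIED FOR</p><p>Developer</p>',
);

// Same collection step as in my code above.
$my_array = array();
foreach ($pages as $page) {
    $my_array[] = $page . '<br/>';
}

// $my_array is an array, so it has no find() method to call.
echo gettype($my_array), "\n";

// implode() joins the pages into a single string of HTML.
$combined = implode("\n", $my_array);
echo gettype($combined), "\n";
```

The combined string could then be passed to str_get_html($combined). Whether the selector matches anything would still depend on the markup that pdftohtml actually generates.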
I also tried str_get_html. It does not raise any error, but it does not return any results either:
foreach ($pdf->getHtml()->getAllPages() as $page) {
    $html = str_get_html($page);
    $pinfo = $html->find('p[plaintext^=PERSONAL INFORMATION]');
    $items3 = array();
    foreach ($pinfo as $keypi) {
        // Walk the following siblings until the next section heading.
        while ($keypi->nextSibling()) {
            $keypi = $keypi->nextSibling();
            $varcpi = $keypi->plaintext;
            if (trim($varcpi) == "JOB APPLIED FOR") {
                break;
            }
            $items3[] = $varcpi;
        }
    }
    // Trim each entry and drop the empty ones.
    $trimmedArray = array_map('trim', $items3);
    $b = array_values(array_filter($trimmedArray));
    var_dump($b);
}
Output:
C:\wamp64\www\new\upload.php:81:
array (size=0)
empty
C:\wamp64\www\new\upload.php:81:
array (size=0)
empty
Any help would be greatly appreciated!