Question

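I'm trying to strip numbers and punctuation from a string, leaving only alphabetic characters, while using Simple HTML DOM. I've tried several approaches and can't get it to work.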

Example string: Amazing Retard (2012) #1
Output string: Amazing Retard

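I understand that this is an undefined method, and I've looked through several pages about it, but I can't work out how to include the method. Any help would be appreciated. The error I get is: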

Fatal error: Call to undefined method simple_html_dom_node::preg_replace() in /home/**/public_html/wp-content/themes/*/***.php on line 123

The code is as follows:

<?php

function scraping_comic()
{
    // create HTML DOM
    $html = file_get_html('http://page-to-scrape.com');

    // get block
    foreach($html->find('li.browse_result') as $article)
    {
        // get title
        $item['title'] = trim($article->find('h4', 0)->find('span',0)->outertext);
        // get title url
        $item['title_url'] = trim($article->find('h4', 0)->find('a.grid-hidden',0)->href);
        // get image
        $item['image_url'] = trim($article->find('img.main_thumb',0)->src);
        // get details
        $item['details'] = trim($article->find('p.browse_result_description_release', 0)->plaintext);
        // get sale info
        $item['on_sale'] = trim($article->find('.browse_comics_release_dates', 0)->plaintext);
        // strip numbers and punctuations
        $item['title2'] = trim($article->find('h4',0)->find('span',0)->preg_replace("/[^A-Za-z]/","",$item['title2'], 0)->plaintext);

        $ret[] = $item;

    }

    // clean up memory
    $html->clear();
    unset($html);

    return $ret;
}
// -----------------------------------------------------------------------------


$ret = scraping_comic();

if ( ! empty($ret))
{
    $scrape = 'http://the-domain.com';


    foreach($ret as $v)
    {

        echo '<p>'.$v['title2'].'</p>';
        echo '<p><a href="'.$scrape.$v['title_url'].'">'.$v['title'].'</a></p>';
        echo '<p><img src="'.$v['image_url'].'"></p>';
        echo '<p>'.$v['details'].'</p>';
        echo '<p> '.$v['on_sale'].'</p>';
    }
}
else { echo 'Could not scrape site!'; }
?>

Answer 1

I think it's because of this line:

// strip numbers and punctuations
    $item['title2'] = trim($article->find('h4',0)->find('span',0)->preg_replace("/[^A-Za-z]/","",$item['title2'], 0)->plaintext);

Written this way, it means preg_replace is a method of your simple_html_dom_node class, which it isn't; it's a standard PHP function.

Your class might have something like execute_php_function("a_php_function", anArrayOfArguments).

So you would write something like this:

// strip numbers and punctuations
    $item['title2'] = trim($article->find('h4',0)->find('span',0)->execute_php_function("preg_replace",anArrayOfArguments)->plaintext);
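To illustrate why the original line fails, here is a minimal sketch that uses a hypothetical stand-in class (FakeNode, which is not part of Simple HTML DOM) in place of simple_html_dom_node. It checks that preg_replace exists as a global function but not as a method on the object, so the node's text has to be passed to it as an argument rather than chained with ->. The sample title is made up. As far as I know, Simple HTML DOM does not ship an execute_php_function-style helper, so treat that name as purely illustrative.

```php
<?php
// Hypothetical stand-in for simple_html_dom_node; only the property we need.
class FakeNode
{
    public $plaintext;

    public function __construct($text)
    {
        $this->plaintext = $text;
    }
}

$node = new FakeNode('Amazing Comic (2012) #1'); // made-up sample title

// preg_replace is a global PHP function, not a method of the node object.
$isFunction = function_exists('preg_replace');
$isMethod   = method_exists($node, 'preg_replace');

// So read the text from the node first, then pass it to preg_replace.
$lettersOnly = preg_replace('/[^A-Za-z]/', '', $node->plaintext);

var_dump($isFunction, $isMethod);
echo $lettersOnly, PHP_EOL;
```

Calling $node->preg_replace(...) on such an object is what produces the "Call to undefined method" fatal error.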

Answer 2

preg_replace is a PHP function, not a member of the simple_html_dom_node class. Call it like this:

$result = preg_replace($pattern, $replacement, $subject);

http://php.net/manual/en/function.preg-replace.php

Your $pattern and $replacement look fine; you just need to pass in the input you want to change as $subject.

For example, this may be what you are trying to achieve:

$item['title2'] =
  trim(preg_replace("/[^A-Za-z]/", "", $article->find('h4',0)->find('span',0)->plaintext));

This returns the value with the numbers and punctuation removed.
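One thing to watch: the pattern /[^A-Za-z]/ also removes spaces, so a multi-word title comes back as a single run of letters. If the spaces between words should be kept, a sketch like the following may be closer to what you want. The function name strip_to_letters and the sample title are made up for illustration.

```php
<?php
// Hypothetical helper: keep only ASCII letters and single spaces between words.
function strip_to_letters($text)
{
    // Drop everything that is not a letter or whitespace.
    $lettersAndSpaces = preg_replace('/[^A-Za-z\s]/', '', $text);
    // Collapse runs of whitespace left behind by the removed characters.
    $collapsed = preg_replace('/\s+/', ' ', $lettersAndSpaces);
    return trim($collapsed);
}

$title = 'Amazing Comic (2012) #1'; // sample input, not from the scraped site

$noSpaces   = preg_replace('/[^A-Za-z]/', '', $title); // pattern from the question
$withSpaces = strip_to_letters($title);

echo $noSpaces, PHP_EOL;
echo $withSpaces, PHP_EOL;
```

In the scraper, the helper would be given the node's text, for example strip_to_letters($article->find('h4',0)->find('span',0)->plaintext).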
