Question

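First of all, please excuse my poor English.

I'm trying to build a PHP script that searches multiple web pages, listed in a .txt file, for a specific word.

More specifically:

I have a .txt file in which I store many URLs (one per line, so 10 URLs means a 10-line file), and I want the script to check each URL's page content for a specific word. If the word is found on the page, the script should return ONLINE; otherwise it should return DOWN.

I built the script, but the problem is that it always returns ONLINE, even when the page behind a URL in the file doesn't contain the word.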

<?php  
$allads = file("phpelist.txt");  
print("Checking urls: <br><br><br><strong>");  
for($index = 0; $index <count($allads); $index++)  
{  
$allads[$index] = ereg_replace("\n", "", $allads[$index]);  
$data = file_get_contents('$allads[$index]');  
$regex = '/save/';  
if (preg_match($regex, $data)) {  
echo "$allads[$index]</strong>...ONLINE<br><strong>";  
} else {  
echo "$allads[$index]</strong>...DOWN<br><strong>";  
}  
}  
print("</strong><br><br><br>I verified all urls from file!");  
?>

Answer 1

To search a web page for a given string, I'd use stripos() (case-insensitive) or strpos() (case-sensitive) instead of a regular expression:

if (stripos($haystack, $needle) !== false) {
    // the web page contains the word
}

An example:

$str = 'sky is blue';
$wordToSearchFor = 'sky';

if (strpos($str, $wordToSearchFor) !== false) {
    echo 'true';
}
else {
    echo 'Uh oh.';
}
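The strict `!== false` comparison matters because strpos() and stripos() return the offset of the match, and that offset is 0 when the word sits at the very start of the string. Here is a minimal sketch of the difference; the helper name `containsWord` is made up for illustration and is not a PHP built-in:

```php
<?php
// Hypothetical helper, not part of PHP: true when $needle occurs anywhere in $haystack.
function containsWord(string $haystack, string $needle): bool
{
    // stripos() returns the match offset (possibly 0) or false when nothing matches.
    return stripos($haystack, $needle) !== false;
}

$str = 'sky is blue';

var_dump(strpos($str, 'sky'));          // int(0): found at offset 0
var_dump((bool) strpos($str, 'sky'));   // bool(false): a loose check misses the match
var_dump(containsWord($str, 'SKY'));    // bool(true): case-insensitive, strict check
var_dump(containsWord($str, 'green'));  // bool(false)
```

A plain `if (strpos(...))` would therefore report "not found" for a page that begins with the search word.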


That said, crawling web pages programmatically isn't good practice; don't do it unless it's absolutely necessary.

Update:

In your file_get_contents call:

$data = file_get_contents('$allads[$index]');

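You're using single quotes, so the variable's value isn't substituted. You have to use double quotes for file_get_contents to receive the actual URL. Replace it with: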

$data = file_get_contents("$allads[$index]");
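To see the quoting difference in isolation, here is a small sketch. The URL is a placeholder standing in for a line read from the file; it is an assumption, not a value from the question. It also shows that the quotes can be dropped entirely and the variable passed as-is:

```php
<?php
// Sample data standing in for the lines read from the URL file (assumed values).
$allads = ['http://example.com/'];
$index  = 0;

$single = '$allads[$index]';   // single quotes: no interpolation, literal text
$double = "$allads[$index]";   // double quotes: the array element is substituted
$plain  = $allads[$index];     // no quotes at all: the value itself

var_dump($single);             // string(15) "$allads[$index]"
var_dump($double === $plain);  // bool(true)
```

With single quotes, file_get_contents() is asked to open something literally named `$allads[$index]` rather than the URL.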

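Another thing I noticed is that your code uses the deprecated ereg_replace() function (it was deprecated in PHP 5.3.0 and removed in PHP 7.0.0). See the red box on its manual page? Relying on deprecated functions is strongly discouraged.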
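For stripping the line ending from each URL, a few non-deprecated options exist. The sketch below compares them on a placeholder line; the URL and the Windows-style "\r\n" ending are assumptions made for illustration:

```php
<?php
// A line as file() returns it by default: the trailing newline is kept.
// The URL is a placeholder; "\r\n" mimics a file saved with Windows line endings.
$line = "http://example.com/\r\n";

$viaStrReplace  = str_replace("\n", "", $line);        // leaves the "\r" behind
$viaPregReplace = preg_replace('/\R/', '', $line);     // PCRE: \R matches any newline sequence
$viaRtrim       = rtrim($line, "\r\n");                // strips trailing CR/LF only

var_dump($viaStrReplace === "http://example.com/\r");  // bool(true)
var_dump($viaPregReplace === "http://example.com/");   // bool(true)
var_dump($viaRtrim === "http://example.com/");         // bool(true)
```

No regular expression is really needed here, so rtrim() (or trim()) is probably the simplest choice. file() also accepts a FILE_IGNORE_NEW_LINES flag that omits the newline at the end of each element.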

After making all of the corrections above, your code should look like this:

<?php
$allads = file("phpelist.txt");  // one URL per line
print("Checking urls: <br><br><br><strong>");

for ($index = 0; $index < count($allads); $index++)
{
    // Strip the line ending that file() keeps ("\r" too, for Windows-style files)
    $allads[$index] = str_replace(["\r", "\n"], "", $allads[$index]);
    $data = file_get_contents("$allads[$index]");

    $searchTerm = 'the';

    // file_get_contents() returns false on failure; count that as DOWN
    if ($data !== false && stripos($data, $searchTerm) !== false) {
        echo "$allads[$index]</strong>...ONLINE<br><strong>";
    }
    else
    {
        echo "$allads[$index]</strong>...DOWN<br><strong>";
    }
}

print("</strong><br><br><br>I verified all urls from file!");
?>
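If you want to try the checking logic without touching the network, one option is to pass the fetching step in as a callable. The following is only a sketch under that assumption: `checkUrls`, the stub fetcher, and the example.com-style URLs are all hypothetical names invented here, not part of the original script.

```php
<?php
/**
 * Hypothetical helper (name and signature are illustrative only).
 * Returns an array mapping each URL to 'ONLINE' or 'DOWN'.
 * $fetch is any callable that takes a URL and returns the page body or false.
 */
function checkUrls(array $urls, string $searchTerm, callable $fetch): array
{
    $results = [];
    foreach ($urls as $url) {
        $url = trim($url);          // drop whitespace and line endings
        if ($url === '') {
            continue;               // skip blank lines
        }
        $data = $fetch($url);
        // A failed fetch yields false; treat it as DOWN instead of searching it.
        $found = is_string($data) && stripos($data, $searchTerm) !== false;
        $results[$url] = $found ? 'ONLINE' : 'DOWN';
    }
    return $results;
}

// Stub fetcher with canned pages so the sketch runs without network access.
$pages = [
    'http://a.example/' => '<button>Save</button>',
    'http://b.example/' => '<p>nothing here</p>',
];
$stub = function (string $url) use ($pages) {
    return $pages[$url] ?? false;   // unknown URL: simulate a failed request
};

$lines  = ["http://a.example/\n", "http://b.example/\n", "\n", "http://c.example/\n"];
$report = checkUrls($lines, 'save', $stub);

foreach ($report as $url => $status) {
    echo $url, '...', $status, PHP_EOL;
}
```

For real pages you could pass `file('phpelist.txt')` as the list and a closure that wraps file_get_contents() as the fetcher; fetching URLs that way requires the allow_url_fopen setting to be enabled.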

PHP script to search multiple web pages, listed in a file, for a specific word
