PHP: how to keep searching a file after the code runs

Time: 2013-12-10 03:32:09

Tags: php search

Basically I have something like this:

Hand #1

First row always has the same info, 
if the text matches what im looking for ill find
the keyword in the first line. Bunch of text, 
bunch more text bla bla bla

Hand #2

This is my code; it prints all the text between Hand #1 and Hand #2:

$searchfor = 'myKeyword';
$file = file_get_contents($filename);

// find the location of the keyword, this keyword indicates that i want to grab this group
// of text, since each group of text starts off with Hand #x and ends immediately before the next Hand #x i search for the keyword to identify this is a valid group of text
$pos_keyword = strpos($file, $searchfor); 

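// there might be a more elegant way, but the Hand # value I need will always be within 60-70 characters before the keyword,
// so roll back 100 characters to be safe (clamped at 0 so the offset never goes negative)
$rollback = max(0, $pos_keyword - 100);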

// this is the start position of the text i want to grab
$start = strpos($file, "Hand #", $rollback);
// search from the keyword onward for the next Hand # and assign its position to $end
$end = strpos($file, "Hand #", $pos_keyword);


// print out the string between the start and end Hand# keywords
echo "string: " . substr($file,$start,($end-$start)) . "<br />";
echo "<br /><br /><br />";

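Now there are hundreds of these values in the document, and I want to repeat the search until the end of the document. I tried googling, but people mention that using !eof($file) can cause an infinite loop, and I couldn't get it to work. Any ideas on which function or loop I would use to run this code repeatedly until the end of the document?

My guess is that I loop and, at the end, use $end as the new starting point for finding $pos_keyword, but I'm not sure which kind of loop is best to use. Any ideas?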
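
The loop guessed at here could be sketched roughly as follows. This is only a sketch under assumptions, not a tested solution for the real file: `findBlocks` and its `$marker` parameter are hypothetical names, the sample string is made up, and `strrpos` on the text before the keyword stands in for the fixed 100-character rollback. The third argument of `strpos` is the offset to resume from, so no `feof` check is needed.

```php
<?php
// Hypothetical helper: collect every "Hand #" block that contains $searchfor.
function findBlocks(string $file, string $searchfor, string $marker = 'Hand #'): array
{
    $blocks = [];
    if ($searchfor === '') {
        return $blocks; // nothing sensible to search for
    }

    $offset = 0;
    // strpos() returns false once no further keyword exists, which ends the loop.
    while (($posKeyword = strpos($file, $searchfor, $offset)) !== false) {
        // Last marker before the keyword: search backwards in the preceding text.
        $start = strrpos(substr($file, 0, $posKeyword), $marker);
        if ($start === false) {
            $start = 0; // keyword appears before any marker
        }

        // Next marker after the keyword, or end of input for the last block.
        $end = strpos($file, $marker, $posKeyword + strlen($searchfor));
        if ($end === false) {
            $end = strlen($file);
        }

        $blocks[] = substr($file, $start, $end - $start);
        $offset = $end; // resume the search after this block
    }

    return $blocks;
}

// Made-up sample input; the real text would come from file_get_contents($filename).
$file = "Hand #1\nfoo myKeyword\nHand #2\nbar\nHand #3\nmyKeyword again\n";
$blocks = findBlocks($file, 'myKeyword');
print_r($blocks);
```

Because the offset always moves to the end of the block just captured, each block is reported at most once even if the keyword occurs in it several times.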

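3 answers:

Answer 0 (score: 2)

Searching for the keyword and then backtracking is probably not what you're after, so this would be my suggestion: split the text into sections first, then filter them by whether they contain your keyword: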

$text = <<<EOS
Hand #1

First row always has the same info,
if the text matches what im looking for ill find
the keyword in the first line. Bunch of text,
bunch more text bla bla bla

Hand #2

Lala alala
EOS;

$keyword = 'keyword';
$block_re = '/(^Hand #)(\d+)(.*?)(?=\1|\Z)/ms';

if (preg_match_all($block_re, $text, $matches, PREG_SET_ORDER)) {
    print_r(array_filter($matches, function($match) use ($keyword) {
        // strpos() returns 0 for a match at the very start, so compare against false explicitly
        return strpos($match[3], $keyword) !== false;
    }));
}

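This returns only the first section; the second one does not contain "keyword".

Answer 1 (score: 0)

I don't say this often, but a regular expression may be a viable option here. Consider the following regex: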

/Hand #1(.*?)Hand #2/s

The /s modifier allows . to match newlines.

So you would do this:

$file = file_get_contents($filename);
$matches = array();

preg_match('/Hand #1(.*?)Hand #2/s', $file, $matches);

print_r($matches);

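$matches now contains two keys (if it found what you wanted): index 0 holds the whole matched string, and index 1 holds the captured text.

To tidy up and return just the matched text, do the following: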

unset($matches[0]);
$return_text = trim($matches[1]);
  

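Looping

Now, I'm assuming that Hand #1 -> Hand #2 is different for every block in your file. If that's the case, and you know what the delimiters are before looping, you could do this: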

$delimiters = array('Hand', 'Dog', 'Cat', 'Person', 'Etc');
$returns = array();

foreach($delimiters as $d) {
    $matches = array();
    $q = preg_quote($d, '/'); // escape any regex metacharacters in the delimiter
    preg_match('/' . $q . ' #1(.*?)' . $q . ' #2/s', $file, $matches);
    if(!empty($matches[1]))
        $returns[] = trim($matches[1]); // add to output array
}

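In the end, your $returns array will contain every matched block between those delimiters.

If your delimiters are all Hand #1 and Hand #2, you'll need to use preg_match_all instead, which returns an array containing all the matching blocks, so you don't need a loop (you would still unset the zero index).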
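
A minimal sketch of the preg_match_all variant, assuming every block starts with a line of the form `Hand #<number>`. The pattern and the sample string are illustrative assumptions, not taken from the original file. A lookahead is used so that the next header is not consumed by the current match and can start the following one.

```php
<?php
// Made-up sample input; the real text would come from file_get_contents($filename).
$file = "Hand #1\nfirst block\nHand #2\nsecond block\nHand #3\nthird block\n";

// Capture the text after each "Hand #<n>" header, up to the next header or end of input.
// /m lets ^ match at the start of each line, /s lets . match newlines.
$pattern = '/^Hand #\d+\R?(.*?)(?=^Hand #\d+|\z)/ms';

$returns = array();
if (preg_match_all($pattern, $file, $matches)) {
    // With the default PREG_PATTERN_ORDER, $matches[1] holds the captured text of every block.
    $returns = array_map('trim', $matches[1]);
}

print_r($returns);
```

To keep only the blocks that contain a keyword, $returns could then be passed through array_filter with a strpos(...) !== false check.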

  


Answer 2 (score: 0)

First, let me try to restate your problem as I understand it:

Your file is formatted like this:

Hand #1
Some text with keywords like apple
Some more text
...
Last line of Block
Hand #2
Oranges are good too
This one only has 2 lines
Hand #3

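And so on.

You want code that loops over every line of the input text and outputs the complete block of text whenever that block matches a keyword.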

$keywords = array('apple', 'orange');

$handle = @fopen($filename, "r");

if ($handle) {
    $block = ""; //redundant, really

    //read through the file. When we hit 'Hand #', start filling up $block
    while (($line = fgets($handle, 4096)) !== false) {
        if(strpos($line, 'Hand #') === 0){ // a new header line means $block now holds a complete block
            foreach($keywords as $keyword){
                if(stripos($block, $keyword) !== false){
                    print "string: {$block}<br />";
                    break; //only need to match one keyword to print the block
                }
            }

            print "<br /><br /><br />";
            $block = ""; //this is the beginning of a block;
        }

        $block .= $line;
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }

    //check the final block
    foreach($keywords as $keyword){
        if(stripos($block, $keyword) !== false){
            print "string: {$block}<br />";
            break; //only need to match one keyword to print the block
        }
    }

    fclose($handle);
}

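In short:

  1. Loop through the file one line at a time.
  2. When a line starts with 'Hand #', we should have a complete block of text.
  3. Check that block of text against the list of keywords.
  4. If it matches at least one keyword, print it out.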
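
The steps above could also be sketched with the keyword check factored into one place, so the final block needs no duplicated loop. `readBlocks` and `matchesAny` are hypothetical helper names, and the in-memory stream only stands in for the real file; treat this as a sketch under those assumptions.

```php
<?php
// Hypothetical helper: yield each "Hand #" block from an open stream, one at a time.
function readBlocks($handle, string $marker = 'Hand #'): Generator
{
    $block = '';
    while (($line = fgets($handle)) !== false) {
        // A marker at the start of a line closes the previous block.
        if (strpos($line, $marker) === 0 && $block !== '') {
            yield $block;
            $block = '';
        }
        $block .= $line;
    }
    if ($block !== '') {
        yield $block; // the final block has no following marker
    }
}

// Hypothetical helper: true if the block contains any keyword (case-insensitive).
function matchesAny(string $block, array $keywords): bool
{
    foreach ($keywords as $keyword) {
        if (stripos($block, $keyword) !== false) {
            return true;
        }
    }
    return false;
}

// Demo on an in-memory stream instead of a real file.
$handle = fopen('php://memory', 'r+');
fwrite($handle, "Hand #1\nSome text with apple\nHand #2\nNothing here\nHand #3\nOranges are good too\n");
rewind($handle);

$found = array();
foreach (readBlocks($handle) as $block) {
    if (matchesAny($block, array('apple', 'orange'))) {
        $found[] = $block;
    }
}
fclose($handle);

print_r($found);
```

Since the generator reads one line at a time, only the current block is held in memory. Note that any text before the first 'Hand #' line would be yielded as a block of its own.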