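Split a line into several array elements

Date: 2011-02-15 07:46:33

Tags: php

I just got help from some people on this site, and I can now let users upload a text file, grab the file during the upload, and programmatically search it for a keyword I specify. The script then counts how many times the word was found and puts each full matching line into an array.

So, with this code, a sample call that returns this information looks like: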

$sceneINT = $sf->countKeyWord('INT', $file);

My class looks like this:

public static function countKeyWord($word, $file){
    if(!$word)
        return NULL;

    $contents = file_get_contents($file);

    #standardise those line endings
    $contents = str_replace(array("\r","\r\n","\n\r","\n"),"\n",$contents);
    $lines = explode("\n", $contents);

    #find your result
    $result = $line_num = array();
    foreach($lines as $line_num => $l)
        if(strpos($l, $word)) {
            $result[] = $l;
            $line_nums[] = $line_num;
        }

    echo "<pre>"; // I am echoing out the results for debugging purposes
    print_r($result);
    echo "</pre>";
    return count($result); //final result shown to the user will only be the count
}

The result looks like this:

Array
(
    [0] =>   3     INT. MARTEY'S OFFICE - DAY                                      3
    [1] =>   4     INT. RONNEY'S OFFICE - DAY                                      4
    [2] =>   6     INT. - BREEZE'S APARTMENT - DAY                                 6
    [3] =>   9     INT. - WAREHOUSE/SOUNDSTAGE - DAY                               9
    [4] =>  11     INT. EXAM ROOM - DAY                                           11
    [5] =>  12     INT. RAJA'S OFFICE - LATER                                     12
    [6] =>  14     INT. RAJA'S OFFICE - LATER                                     14
    [7] =>  15     INT. LARGE OPERATING ROOM - DAY                                15
    [8] =>  16     INT. RAJA'S OFFICE - LATER                                     16
    [9] =>  17     INT. OLIVER'S CAR - DAY                                        17
    [10] =>  20     INT. - ROY THUNDER'S OFFICE - NIGHT                            20
    [11] =>  22     A 2ND CLIP FROM "GOLDEN GATE GUNS"- INT. BASEMENT - DAY        22
    [12] =>  27     INT. HOUSE WIFE #3'S HOUSE - LATER                             27
    [13] =>  29     INT.  LIBRARY - DAY                                            29
    [14] =>  31     INT.  COFFEE SHOP - NIGHT                                      31
    [15] =>  32     INT. WAITING AREA - DAY                                        32
    [16] =>  33     INT. CASTING OFFICE - DAY                                      33
    [17] =>  34     INT. CASTING OFFICE - DAY                                      34
    [18] =>  35     INT. WAITING AREA - DAY                                        35
    [19] =>  36     INT. WAITING AREA - LATER                                      36
    [20] =>  37     INT. MOTEL ROOM - DAY                                          37
    [21] =>         INT. WAITING AREA - LATER
    [22] =>  39     INT. WAITING AREA - DAY                                        39
    [23] =>  42     INT. WAITING AREA - DAY                                        42
    [24] =>  43     INT. CASTING OFFICE - DAY                                      43
    [25] =>  44     INT. AUDITION ROOM - DAY                                       44
    [26] =>         INT. AUDITION ROOM - DAY
    [27] =>  45     INT. WAITING AREA - DAY                                        45
    [28] =>  46     INT. CASTING OFFICE - DAY                                      46
    [29] =>  47     INT. AUDITION ROOM - DAY                                       47
    [30] =>  48     INT. CASTING OFFICE - DAY                                      48
    [31] =>  49     INT. WAITING AREA - DAY                                        49
    [32] =>  50     INT. AUDITION ROOM - DAY                                       50
    [33] =>  51     INT. CASTING OFFICE - DAY                                      51
    [34] =>  52     INT. WAITING AREA - DAY                                        52
    [35] =>  53     INT. CASTING OFFICE - DAY                                      53
    [36] =>  54     INT. BURGER JOINT  - NIGHT                                     54
)

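I need to store the results in a database.

Taking array [0] as an example, I need to prepare that line for the database so that it looks like this: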

scene: 3
int_ext: INT
scene_name:    MARTEY'S OFFICE
day_night:    DAY

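All of this will go into a single row in the database, and I don't know how to approach it. How can I take the results, split them into the pieces I need, and then send them to the database so that every item found is stored when the user presses the SAVE button?

2 answers:

Answer 0: (score: 1)

I just put together a very quick regular expression. I'm sure you can tidy it up a bit, but it will give you something to start from. I would look up http://php.net/manual/en/function.preg-split.php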

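foreach($result as $line)
{
    // PREG_SPLIT_DELIM_CAPTURE makes preg_split() return the captured groups;
    // without it only the text around the match is returned.
    $splitLine = preg_split("/(\d{1,3})[ ]+([A-Z.]{4}) ([A-Z' ]{1,}) - ([A-Z]{3,})/", $line, -1, PREG_SPLIT_DELIM_CAPTURE);
}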

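That should split it into an array. I tested it against the first line of your array. One thing to note: it isn't clear whether there is any padding whitespace before the first number on each line of the array, so you may have to play with the regular expression.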
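As an alternative to splitting, the same idea can be sketched with preg_match() and named groups, which hands back the four fields the question asks for. This is only a sketch: the function name parseSceneLine and the pattern are assumptions, not part of the original answer, and the pattern was written against the sample lines shown in the question (leading padding, an optional scene number, and an optional " - " right after "INT.").

```php
<?php
// Hypothetical helper (not from the original answer): parse one scene heading line.
function parseSceneLine(string $line): ?array
{
    // scene:      optional 1-3 digit number after any leading padding
    // int_ext:    INT or EXT followed by a dot (EXT is an assumption)
    // scene_name: everything up to the last " - " before the time of day
    // day_night:  the upper-case word after the dash (DAY, NIGHT, LATER, ...)
    $pattern = '/^\s*(?<scene>\d{1,3})?.*?\b(?<int_ext>INT|EXT)\.[\s-]*'
             . '(?<scene_name>.+?)\s+-\s+(?<day_night>[A-Z]+)/';

    if (!preg_match($pattern, $line, $m)) {
        return null; // not a scene heading this pattern understands
    }

    return [
        // An unmatched optional group comes back as an empty string.
        'scene'      => $m['scene'] === '' ? null : (int) $m['scene'],
        'int_ext'    => $m['int_ext'],
        'scene_name' => trim($m['scene_name']),
        'day_night'  => $m['day_night'],
    ];
}

// Usage: the first line from the question's output.
print_r(parseSceneLine("  3     INT. MARTEY'S OFFICE - DAY                                      3"));
```

Lines without a scene number (such as entry [21] in the output above) come back with 'scene' set to null, so the caller can decide how to store them.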

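Answer 1: (score: 1)

Your code has a few mistakes.

  • You create the array $line_num but add entries to $line_nums
  • You never use $line_nums, so why create it?

I would use a regular expression to find all the relevant information: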
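A minimal sketch of the search loop with those two points addressed might look like the following. The function name findKeyWordLines is hypothetical, and two further changes go beyond what this answer lists: the strpos() result is compared against false (a match at position 0 is otherwise treated as "not found"), and "\r\n" is replaced before the lone "\r" so Windows line endings do not turn into two line breaks.

```php
<?php
// Hypothetical rewrite of the question's loop; names are illustrative only.
// Returns the matching lines keyed by their zero-based line number.
function findKeyWordLines(string $word, string $contents): array
{
    if ($word === '') {
        return [];
    }

    // Normalise line endings: handle "\r\n" first, then any remaining "\r".
    $contents = str_replace(["\r\n", "\r"], "\n", $contents);

    $result = [];
    foreach (explode("\n", $contents) as $lineNum => $line) {
        // strpos() returns 0 for a match at the start of the line,
        // so it has to be compared strictly against false.
        if (strpos($line, $word) !== false) {
            $result[$lineNum] = $line;
        }
    }

    return $result;
}

// Usage: count($found) is the number shown to the user,
// array_keys($found) are the line numbers.
$found = findKeyWordLines('INT', "INT. A - DAY\r\nnothing\r\n  2  INT. B - NIGHT");
```

Keeping the line number as the array key means a single array carries both pieces of information, so there is no second array to keep in sync.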

$contents = file_get_contents($file);
$pattern = "!INT\. (.*?) - (MORNING|NIGHT|DAY|LATER)!si";
preg_match_all($pattern, $contents, $matches);
echo '<pre>';
print_r($matches);
echo '</pre>';
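By default preg_match_all() groups its output by capture group, which is awkward when each scene should become one record. A small sketch, assuming the same pattern, shows the PREG_SET_ORDER flag returning one entry per match instead. The sample $contents string and the field names are made up for illustration.

```php
<?php
// Two sample lines in the format shown in the question (illustrative data).
$contents = "  3     INT. MARTEY'S OFFICE - DAY      3\n"
          . "  4     INT. RONNEY'S OFFICE - DAY      4\n";

$pattern = "!INT\. (.*?) - (MORNING|NIGHT|DAY|LATER)!si";

// PREG_SET_ORDER: $matches[0] is the first match with all its groups,
// $matches[1] the second match, and so on.
$count = preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);

$scenes = [];
foreach ($matches as $m) {
    $scenes[] = [
        'scene_name' => trim($m[1]),
        'day_night'  => strtoupper($m[2]), // the "i" flag allows lower case input
    ];
}

print_r($scenes);
```

Note that for headings written as "INT. - NAME - DAY" this pattern keeps the leading dash in the name, so those lines may need extra trimming.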

If you really need the line numbers, you have to take a different approach:

Use preg_replace to replace all the line breaks:

//OLD: $contents = str_replace(array("\r","\r\n","\n\r","\n"),"\n",$contents);
//collapses any run of line breaks, so empty rows are removed too
$contents = preg_replace("!(?:\r\n|\r|\n)+!", "\n", $contents);

You can use m4rc's regular expression to split the data ;-)
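Neither answer covers the final step of storing the rows. One possible sketch, assuming PDO is available, uses a prepared statement inside a transaction. The table name scenes, its column names, and the function saveScenes are assumptions taken from the field list in the question, not an existing schema. Each row is expected to be an array with the keys scene, int_ext, scene_name and day_night.

```php
<?php
// Hypothetical helper: insert already-parsed scene rows with a prepared statement.
// Assumes $pdo uses PDO::ERRMODE_EXCEPTION (the default since PHP 8).
function saveScenes(PDO $pdo, array $rows): int
{
    // Table and column names are assumptions; adjust them to the real schema.
    $stmt = $pdo->prepare(
        'INSERT INTO scenes (scene, int_ext, scene_name, day_night)
         VALUES (:scene, :int_ext, :scene_name, :day_night)'
    );

    $pdo->beginTransaction();
    try {
        foreach ($rows as $row) {
            // Bound parameters take care of quoting, e.g. the apostrophe in MARTEY'S.
            $stmt->execute([
                ':scene'      => $row['scene'],
                ':int_ext'    => $row['int_ext'],
                ':scene_name' => $row['scene_name'],
                ':day_night'  => $row['day_night'],
            ]);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        // Either all scenes from the upload are stored or none are.
        $pdo->rollBack();
        throw $e;
    }

    return count($rows);
}
```

Calling this once from the SAVE handler with all parsed rows keeps the upload atomic: a failure on any row leaves the table unchanged.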