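I just got some help from people on this site, and I can now let users upload a text file. During the upload, the script reads the file and programmatically searches it for a keyword I specify. It then counts how many times the word was found and collects each full matching line into an array.
So, with this code, a sample call looks like this: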
$sceneINT = $sf->countKeyWord('INT', $file);
My class looks like this:
public static function countKeyWord($word, $file) {
    if (!$word) {
        return NULL;
    }
    $contents = file_get_contents($file);
    # standardise the line endings; "\r\n" has to come before "\r",
    # otherwise every Windows line break turns into two "\n"
    $contents = str_replace(array("\r\n", "\r"), "\n", $contents);
    $lines = explode("\n", $contents);
    # find your result
    $result = $line_nums = array();
    foreach ($lines as $line_num => $l) {
        # strpos() returns 0 for a match at the start of the line, so compare against false
        if (strpos($l, $word) !== false) {
            $result[] = $l;
            $line_nums[] = $line_num;
        }
    }
    echo "<pre>"; // I am echoing out the results for debugging purposes
    print_r($result);
    echo "</pre>";
    return count($result); // final result shown to the user will only be the count
}
The result looks like this:
Array
(
[0] => 3 INT. MARTEY'S OFFICE - DAY 3
[1] => 4 INT. RONNEY'S OFFICE - DAY 4
[2] => 6 INT. - BREEZE'S APARTMENT - DAY 6
[3] => 9 INT. - WAREHOUSE/SOUNDSTAGE - DAY 9
[4] => 11 INT. EXAM ROOM - DAY 11
[5] => 12 INT. RAJA'S OFFICE - LATER 12
[6] => 14 INT. RAJA'S OFFICE - LATER 14
[7] => 15 INT. LARGE OPERATING ROOM - DAY 15
[8] => 16 INT. RAJA'S OFFICE - LATER 16
[9] => 17 INT. OLIVER'S CAR - DAY 17
[10] => 20 INT. - ROY THUNDER'S OFFICE - NIGHT 20
[11] => 22 A 2ND CLIP FROM "GOLDEN GATE GUNS"- INT. BASEMENT - DAY 22
[12] => 27 INT. HOUSE WIFE #3'S HOUSE - LATER 27
[13] => 29 INT. LIBRARY - DAY 29
[14] => 31 INT. COFFEE SHOP - NIGHT 31
[15] => 32 INT. WAITING AREA - DAY 32
[16] => 33 INT. CASTING OFFICE - DAY 33
[17] => 34 INT. CASTING OFFICE - DAY 34
[18] => 35 INT. WAITING AREA - DAY 35
[19] => 36 INT. WAITING AREA - LATER 36
[20] => 37 INT. MOTEL ROOM - DAY 37
[21] => INT. WAITING AREA - LATER
[22] => 39 INT. WAITING AREA - DAY 39
[23] => 42 INT. WAITING AREA - DAY 42
[24] => 43 INT. CASTING OFFICE - DAY 43
[25] => 44 INT. AUDITION ROOM - DAY 44
[26] => INT. AUDITION ROOM - DAY
[27] => 45 INT. WAITING AREA - DAY 45
[28] => 46 INT. CASTING OFFICE - DAY 46
[29] => 47 INT. AUDITION ROOM - DAY 47
[30] => 48 INT. CASTING OFFICE - DAY 48
[31] => 49 INT. WAITING AREA - DAY 49
[32] => 50 INT. AUDITION ROOM - DAY 50
[33] => 51 INT. CASTING OFFICE - DAY 51
[34] => 52 INT. WAITING AREA - DAY 52
[35] => 53 INT. CASTING OFFICE - DAY 53
[36] => 54 INT. BURGER JOINT - NIGHT 54
)
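I need to store these results in a database.
Taking array element [0] as an example, I need to prepare that line for the database so that it looks like this: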
scene: 3
int_ext: INT
scene_name: MARTEY'S OFFICE
day_night: DAY
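All of this goes into a single row in the database, and I don't know how to approach it. How do I take the results, split them into the parts I need, and then send them to the database so that every item found is stored when the user presses the SAVE button?
Answer 0: (score: 1)
I just knocked up a very quick regular expression. I'm sure you can tidy it up a bit, but it should give you something to work with. I'd look at http://php.net/manual/en/function.preg-split.php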
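foreach($result as $line)
{
    // PREG_SPLIT_DELIM_CAPTURE keeps the captured groups; without it preg_split() throws them away
    $splitLine = preg_split("/(\d{1,3})[ ]+([A-Z.]{4}) ([A-Z' ]{1,}) - ([A-Z]{3,})/", $line, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}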
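That should split each line into an array. I tested it with the first line of your array. One thing to watch out for: it isn't clear whether there are padding spaces before the first number on each line, so you may have to play with the regex.
Answer 1: (score: 1)
Your code has a few bugs.
I would use a regular expression to find all the relevant information: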
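As a rough sketch of the same idea with preg_match() and named groups, something like the following might be easier to map onto the four database fields. The helper name parseSceneLine() is made up for illustration, and the pattern assumes things the question does not confirm: that EXT lines exist alongside INT, that the scene number is optional (two lines in the sample output have none), and that the time of day is the first word after the last " - ".

```php
<?php
// Hypothetical helper, not part of the original class.
// Returns the four fields for one scene heading, or null if the line does not look like one.
function parseSceneLine(string $line): ?array
{
    // scene:      optional leading number
    // int_ext:    INT or EXT (EXT is an assumption, the sample only shows INT)
    // scene_name: everything up to the last " - " before the time of day
    // day_night:  the upper-case word after that " - "
    $pattern = '/^\s*(?<scene>\d+)?.*?\b(?<int_ext>INT|EXT)\.\s+(?:-\s+)?(?<scene_name>.+?)\s+-\s+(?<day_night>[A-Z]+)/';

    if (!preg_match($pattern, $line, $m)) {
        return null;
    }

    return [
        // An optional group that did not take part in the match comes back as ''
        'scene'      => $m['scene'] !== '' ? $m['scene'] : null,
        'int_ext'    => $m['int_ext'],
        'scene_name' => $m['scene_name'],
        'day_night'  => $m['day_night'],
    ];
}
```

Calling parseSceneLine("3 INT. MARTEY'S OFFICE - DAY 3") gives scene 3, int_ext INT, scene_name MARTEY'S OFFICE and day_night DAY. Headings written as "INT. - BREEZE'S APARTMENT - DAY" are handled by the optional "- " after the dot.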
$contents = file_get_contents($file);
$pattern = "!INT\. (.*?) - (MORNING|NIGHT|DAY|LATER)!si";
preg_match_all($pattern, $contents, $matches);
echo '<pre>';
print_r($matches);
echo '</pre>';
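If one array per scene is more convenient than one array per capture group, preg_match_all() accepts the PREG_SET_ORDER flag. The sketch below wraps that in a made-up function, extractScenes(). It also drops the s modifier so that "." cannot run across a line break, which is my own tweak rather than part of the pattern above.

```php
<?php
// Hypothetical helper: returns one ['scene_name' => ..., 'day_night' => ...] array per match.
function extractScenes(string $contents): array
{
    // Same idea as the pattern above, minus the "s" modifier (assumption: a heading never spans lines)
    $pattern = '!INT\. (.*?) - (MORNING|NIGHT|DAY|LATER)!i';

    // PREG_SET_ORDER groups the results per match instead of per capture group
    preg_match_all($pattern, $contents, $matches, PREG_SET_ORDER);

    $scenes = [];
    foreach ($matches as $m) {
        $scenes[] = [
            'scene_name' => $m[1],
            'day_night'  => strtoupper($m[2]), // the "i" modifier may have matched lower case
        ];
    }
    return $scenes;
}
```

Each element of the returned array then corresponds to one row you would write to the database.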
If you really need the line numbers, you have to go another way:
Use preg_replace to normalise all the line breaks:
//OLD: $contents = str_replace(array("\r","\r\n","\n\r","\n"),"\n",$contents);
//replaces even empty rows
// a character class with + treats any run of \r and \n as one break, so "\r\n" is not doubled
$contents = preg_replace("![\r\n]+!", "\n", $contents);
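If the line numbers have to match the original file, collapsing empty rows will shift them. A possible alternative, sketched here with a made-up function name findKeywordLines(), is to split without collapsing and let preg_grep() do the filtering, since it preserves the array keys. The word boundaries around the keyword are my assumption, added so that INT does not also match POINT.

```php
<?php
// Hypothetical helper: returns the matching lines keyed by their zero-based line index.
function findKeywordLines(string $contents, string $word): array
{
    // Split on any line ending but keep empty lines, so the indexes match the file
    $lines = preg_split('/\r\n|\r|\n/', $contents);

    // preg_quote() escapes the keyword; \b limits it to whole words (assumption)
    $pattern = '/\b' . preg_quote($word, '/') . '\b/';

    // preg_grep() keeps the original keys of the lines that match
    return preg_grep($pattern, $lines);
}
```

The keys of the returned array are zero-based, so add 1 if you want the line numbers a text editor would show.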
You can use m4rc's regular expression to split the data ;-)
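For the saving step, a minimal PDO sketch could look like the one below. Everything about the schema is assumed: the table name scenes is hypothetical, the column names are simply the four field names from the question, and the function name saveScenes() is made up. It also assumes the connection uses PDO::ERRMODE_EXCEPTION so that a failed insert triggers the rollback.

```php
<?php
// Hypothetical helper: inserts one row per parsed scene and returns how many rows were written.
// Each element of $scenes is expected to have the keys scene, int_ext, scene_name and day_night.
function saveScenes(PDO $pdo, array $scenes): int
{
    // Prepared once, executed per row; the placeholders take care of quoting (e.g. MARTEY'S)
    $stmt = $pdo->prepare(
        'INSERT INTO scenes (scene, int_ext, scene_name, day_night)
         VALUES (:scene, :int_ext, :scene_name, :day_night)'
    );

    // One transaction, so a failure halfway through does not leave a partial script behind
    $pdo->beginTransaction();
    try {
        foreach ($scenes as $s) {
            $stmt->execute([
                ':scene'      => $s['scene'],
                ':int_ext'    => $s['int_ext'],
                ':scene_name' => $s['scene_name'],
                ':day_night'  => $s['day_night'],
            ]);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }

    return count($scenes);
}
```

You would call this from whatever handles the SAVE button, passing in the rows you built from the matched lines.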