Getting specific sentences from a text file

Time: 2014-09-22 08:34:18

Tags: php mysql

I have the following text file:

====================================================================================
INDEXNUMARTICLE: '1997'
FILE: '###\www.kkk.com\kompas-pront\0004\25\economic\index.htm' NUMSENT: '22' DOMAIN: 'economic'
====================================================================================

2. Social change is a general term which refers to:  
4. change in social structure: the nature, the social institutions.
6. When behaviour pattern changes in large numbers, and is visible and sustained, it results in a social change.

I only want to get the sentences without their numbers and save them in a database:

===========================================================================
= id = topic    = content                                                 =
===========================================================================
=  1 = economic = Social change is a general term which refers to:        =
=    =          = change in social structure: the nature,                 =
=    =          = the social institutions. When behaviour pattern         =
=    =          = changes in large numbers, and is visible and sustained, =
=    =          = it results in a social change.                          =
===========================================================================

CODE

function isNumber($string) {
    return preg_match('/^\\s*[0-9]/', $string) > 0;
}

$txt = "C:/Users/User/Downloads/economic.txt";
$lines  = file($txt);

foreach($lines as $line_num => $line) {
    $checkFirstChar = isNumber($line);
    if ($checkFirstChar !== false) {
        $line_parts   = explode(' ', $line);
        $line_number  = array_shift($line_parts);

        foreach ($line_parts as $part) {
            if (empty($part)) continue;
            $parts = array();
            $string = implode(' ', $parts);
            $query = mysql_query("INSERT INTO tb_file VALUES ('','economic','$string')");
        }
    }
}

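I have a problem with the array: the data inserted into the content column ends up as individual words in separate rows. Please help me. Thank you :))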

2 answers:

Answer 0: (score: 0)

I think your approach is more complicated than it needs to be. Try this shorter version:

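$txt = "C:/Users/User/Downloads/economic.txt";
$lines  = file($txt);
foreach($lines as $line_num => $line) {
    $checkFirstChar = isNumber($line);
    if ($checkFirstChar !== false) {
        // drop surrounding whitespace so the first space is the one after the number
        $line = trim($line);
        // entire text line without the number: everything after the first space
        $string = substr($line, strpos($line, " ") + 1);
        // escape quotes so a sentence containing ' does not break the query
        $string = mysql_real_escape_string($string);
        $query = mysql_query("INSERT INTO tb_file VALUES ('','economic','$string')");
    }
}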
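
The loop above inserts one row per numbered line, while the table in the question holds all sentences in a single content value. A minimal sketch of how the lines could be joined first, assuming the numbering always has the form "N." at the start of a line; `joinNumberedSentences` is a hypothetical helper name, not something from the original post:

```php
<?php
// Hypothetical helper: collects the numbered lines, strips the leading
// "N." marker from each, and joins the sentences with single spaces.
function joinNumberedSentences(array $lines): string
{
    $sentences = [];
    foreach ($lines as $line) {
        // Optional whitespace, digits, a dot, then the sentence text.
        // Header and separator lines do not match and are skipped.
        if (preg_match('/^\s*[0-9]+\.\s*(.*)$/', $line, $m)) {
            $sentence = trim($m[1]);
            if ($sentence !== '') {
                $sentences[] = $sentence;
            }
        }
    }
    return implode(' ', $sentences);
}

// Sample input in the shape returned by file(): one string per line.
$lines = [
    "INDEXNUMARTICLE: '1997'\n",
    "2. Social change is a general term which refers to:  \n",
    "4. change in social structure: the nature, the social institutions.\n",
];
echo joinNumberedSentences($lines), "\n";
```

The returned string can then be inserted once, instead of running one INSERT per line.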

Answer 1: (score: 0)

Try using a regular expression.

//One or more digits followed by '.' and a space
$regex = "/[0-9]+\. /";

$txt = "C:/Users/User/Downloads/economic.txt";
$str  = file_get_contents($txt);
$index = 0; //keep the whole text if no numbered sentence is found

//Find the first occurrence of a number followed by '.' and a space
if(preg_match($regex, $str, $matches, PREG_OFFSET_CAPTURE)) {
    $index = $matches[0][1];
}

//Remove all the text before that first occurrence (the header)
$str = substr($str, $index);

//Replace every number followed by '. ' with ' ', then trim the ends
$text = trim(preg_replace($regex, " ", $str));
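
Since this approach already holds the whole file in one string, the topic could also be read from the DOMAIN field in the header rather than hard-coded. The sketch below is only an illustration: `extractDomain` and `saveArticle` are hypothetical helper names, the column names `topic` and `content` are taken from the table drawn in the question, and `id` is assumed to be auto-generated. It uses a PDO prepared statement because the `mysql_*` functions were removed in PHP 7 and because sentences containing quotes would otherwise break the query.

```php
<?php
// Hypothetical helper: reads the DOMAIN value from the header line,
// e.g. "... NUMSENT: '22' DOMAIN: 'economic'".
function extractDomain(string $text): ?string
{
    if (preg_match("/DOMAIN:\s*'([^']*)'/", $text, $m)) {
        return $m[1];
    }
    return null; // no DOMAIN field found
}

// Hypothetical helper: stores one row via a prepared statement, so the
// values are passed as parameters instead of being pasted into the SQL.
function saveArticle(PDO $pdo, string $topic, string $content): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO tb_file (topic, content) VALUES (:topic, :content)'
    );
    $stmt->execute([':topic' => $topic, ':content' => $content]);
}

// Demonstration of the header parsing only; no database is needed here.
$header = "FILE: 'index.htm' NUMSENT: '22' DOMAIN: 'economic'";
echo extractDomain($header), "\n";
```

With an open connection, for example `new PDO($dsn, $user, $password)` where the DSN and credentials are placeholders, a call such as `saveArticle($pdo, extractDomain($str) ?? 'unknown', $text)` would store the cleaned text as a single row.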