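PHP: reading a txt file by pattern and keeping the information

Asked: 2016-02-23 15:31:21

Tags: php mysql

Sorry for the confusing title, but I could not think of a better one.

I have a text file in this format (only a few lines, shown out of context):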

# Google_Product_Taxonomy_Version: 2015-02-19
1 - Animals & Pet Supplies
3237 - Animals & Pet Supplies > Live Animals
2 - Animals & Pet Supplies > Pet Supplies
3 - Animals & Pet Supplies > Pet Supplies > Bird Supplies
7385 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories
499954 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Bird Baths
7386 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Food & Water Dishes
4989 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cages & Stands
4990 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Food

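So far, so good. I want to write a parser that captures all the information for each category. Once that is done, the result has to be written to a MySQL database.

To be precise, each line contains: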

1 unique ID
1 Main-category 
n sub-categories

The tricky part (for me) is how to retain this information and store it in an array, with performance in mind.

My database must end up looking like this:

ID    | parent | title
1     |        | Animals & Pet Supplies
3237  |   1    | Live Animals
2     |   1    | Pet Supplies
3     |   2    | Bird Supplies

In other words, I must be able to reproduce the "crumb" purely from my DB entries.
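To make that requirement concrete, here is a minimal sketch of rebuilding a crumb from rows shaped like the table above. The helper name buildCrumb() and the in-memory array layout are assumptions made for illustration; in practice the rows would come from a query against the table.

```php
<?php
// Rows shaped like the target table: id => [parent id or null, title].
// The values are taken from the sample lines in the question.
$rows = [
    1    => ['parent' => null, 'title' => 'Animals & Pet Supplies'],
    3237 => ['parent' => 1,    'title' => 'Live Animals'],
    2    => ['parent' => 1,    'title' => 'Pet Supplies'],
    3    => ['parent' => 2,    'title' => 'Bird Supplies'],
];

// buildCrumb() is a hypothetical helper, not part of any library.
function buildCrumb(array $rows, ?int $id): string
{
    $titles = [];
    // Walk up the parent chain until a row without a parent is reached.
    while ($id !== null && isset($rows[$id])) {
        array_unshift($titles, $rows[$id]['title']);
        $id = $rows[$id]['parent'];
    }
    return implode(' > ', $titles);
}

echo buildCrumb($rows, 3), "\n";
// prints "Animals & Pet Supplies > Pet Supplies > Bird Supplies"
```

The sketch assumes the parent links contain no cycles; with inconsistent data the loop would need a guard.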

I started on my parser:

public function enrichTaxonomy()
{
    $aOutput = array();

    // ignore first line
    fgets($handle);

    // iterate through the remaining lines
    while (($line = fgets($handle)) !== false)
    {
        $splitted = explode("-", $line);

        // build first level
        if (strpos($splitted[1], '>') === false)
        {
            $aOutput['id'][] = trim($splitted[0]);
            $aOutput['title'][] = trim($splitted[1]);
        } else
        {
            // recursive?
            if (substr_count($splitted[1], " > ") == 1)
            {
                $splitted2ndLevel = explode(" > ", $splitted[1]);
                $aOutput['id'][] = trim($splitted[0]);
                $aOutput['title'][] = trim($splitted2ndLevel[1]);
            }
        }
    }

    echo "<pre>";
    var_dump($aOutput);
    echo "</pre>";
}

But I realized that this is not a very good approach, because my next steps would have to be:

if (substr_count($splitted[1], " > ") == 2)
{
    $splitted3rdLevel = explode(" > ", $splitted[1]);
    $aOutput['id'][] = trim($splitted[0]);
    $aOutput['title'][] = trim($splitted3rdLevel[2]);
}

if (substr_count($splitted[1], " > ") == 3)
{
    $splitted4thLevel = explode(" > ", $splitted[1]);
    $aOutput['id'][] = trim($splitted[0]);
    $aOutput['title'][] = trim($splitted4thLevel[3]);
}

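Besides, this seems to get very complicated once I try to end up with one final array that I can iterate over to insert the data into my database.

One important note: for every "sub-category" I must know its "parent", so that I can also insert the "parent" id.

My question now: what is a good, short (relatively speaking), efficient way to achieve this?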

2 answers:

Answer 0 (score: 1)

Here is the code you want. It assumes that a parent category always appears before its children.

<?php

$s = "# Google_Product_Taxonomy_Version: 2015-02-19
1 - Animals & Pet Supplies
3237 - Animals & Pet Supplies > Live Animals
2 - Animals & Pet Supplies > Pet Supplies
3 - Animals & Pet Supplies > Pet Supplies > Bird Supplies
7385 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories
499954 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Bird Baths
7386 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Food & Water Dishes
4989 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cages & Stands
4990 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Food";
$lines = explode("\n", $s);
$ids = [];
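foreach ($lines as $line) {
    // Skip blank lines and the "#" header; $line{0} was removed in PHP 8, so use $line[0].
    if ($line === '' || $line[0] == '#') continue;
    // Limit 2 keeps any further " - " inside the category text.
    list($id, $category) = explode(' - ', $line, 2);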
    $ids[$category] = $id;
    $pos = strrpos($category, ' > ');
    if ($pos === false) {
        echo "$id has no parent\n";
    } else {
        $parentcat = substr($category, 0, $pos);
        echo "$id has parent " . $ids[$parentcat] . "\n";
    }
}

Output

1 has no parent
3237 has parent 1
2 has parent 1
3 has parent 2
7385 has parent 3
499954 has parent 7385
7386 has parent 7385
4989 has parent 3
4990 has parent 3

https://3v4l.org/Fce8Y
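The same lookup can produce rows in the id / parent / title shape the question asks for, instead of echoing text. The following is a minimal sketch under the same assumption (parents appear before children); parseTaxonomy() is a hypothetical name, and every non-comment line is assumed to contain " - ".

```php
<?php
// parseTaxonomy() is a hypothetical helper name used only for this sketch.
function parseTaxonomy(array $lines): array
{
    $idByPath = []; // full crumb => id, used to look up parents
    $rows = [];

    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // skip blank lines and the version header
        }
        // Limit 2 so a " - " inside a category name stays in the crumb.
        [$id, $path] = explode(' - ', $line, 2);
        $idByPath[$path] = (int) $id;

        $pos = strrpos($path, ' > ');
        if ($pos === false) {
            $parent = null;
            $title = $path;
        } else {
            // Assumes the parent line appeared earlier; otherwise parent stays null.
            $parent = $idByPath[substr($path, 0, $pos)] ?? null;
            $title = substr($path, $pos + 3);
        }
        $rows[] = ['id' => (int) $id, 'parent' => $parent, 'title' => $title];
    }
    return $rows;
}

$rows = parseTaxonomy([
    '# Google_Product_Taxonomy_Version: 2015-02-19',
    '1 - Animals & Pet Supplies',
    '3237 - Animals & Pet Supplies > Live Animals',
    '2 - Animals & Pet Supplies > Pet Supplies',
    '3 - Animals & Pet Supplies > Pet Supplies > Bird Supplies',
]);
print_r($rows);
```

Keying the lookup on the full crumb rather than on the last title means two categories that share a title under different parents do not overwrite each other.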

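Answer 1 (score: 1)

Since you would have to flatten it again to insert it into the database, there is no need to build a tree structure. Instead, create the same structure as the db: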

$input = <<<'EOD'
1 - Animals & Pet Supplies
3237 - Animals & Pet Supplies > Live Animals
2 - Animals & Pet Supplies > Pet Supplies
3 - Animals & Pet Supplies > Pet Supplies > Bird Supplies
7385 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories
499954 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Bird Baths
7386 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cage Accessories > Bird Cage Food & Water Dishes
4989 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Cages & Stands
4990 - Animals & Pet Supplies > Pet Supplies > Bird Supplies > Bird Food
EOD;

$dbInput=[];

$lines = explode("\n", $input);
//or for a file, $lines = file('file.path', FILE_IGNORE_NEW_LINES);

foreach($lines as $line){
    if(substr($line, 0, 1) == '#') continue;

    // Limit 2 so a hyphen inside a category name does not truncate the crumb.
    list($id, $crumb) = explode('-', $line, 2);
    $id = trim($id);
    $crumb_parts = array_map('trim',explode('>', $crumb));
    $title = array_pop($crumb_parts);
    $parent = array_pop($crumb_parts);
    $parent_id = isset($dbInput[$parent])? $dbInput[$parent][':id'] : null;

    $dbInput[$title] = [
        ':id'       =>  $id,
        ':parent'   =>  $parent_id,
        ':title'    =>  $title,
    ];
}
$pdo = new PDO('mysql:host=localhost;dbname=dbname','usr','pass');

$sth = $pdo->prepare("INSERT INTO tree (id, parent, title) VALUES (:id, :parent, :title)");
foreach($dbInput as $row){
    $sth->execute($row);
}
echo 'done';
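Because the question mentions performance, one possible refinement is to run all inserts inside a single transaction, which typically avoids a commit per row. This is only a sketch: insertRows() is a hypothetical helper, the table and column names follow the INSERT above, and the usage part assumes the pdo_sqlite extension is available so it can run against an in-memory database instead of MySQL.

```php
<?php
// insertRows() is a hypothetical helper; table and columns follow the answer's INSERT.
function insertRows(PDO $pdo, array $rows): int
{
    $sth = $pdo->prepare(
        'INSERT INTO tree (id, parent, title) VALUES (:id, :parent, :title)'
    );
    $pdo->beginTransaction();
    try {
        foreach ($rows as $row) {
            $sth->execute($row);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        // Undo the partial batch, then let the caller see the error.
        $pdo->rollBack();
        throw $e;
    }
    return count($rows);
}

// Usage against an in-memory SQLite database (assumption: pdo_sqlite is installed).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE tree (id INTEGER PRIMARY KEY, parent INTEGER NULL, title TEXT)');

$inserted = insertRows($pdo, [
    [':id' => 1,    ':parent' => null, ':title' => 'Animals & Pet Supplies'],
    [':id' => 3237, ':parent' => 1,    ':title' => 'Live Animals'],
]);
echo $inserted, " rows inserted\n";
```

With MySQL, the same function would be called with the PDO connection created above; the rollback only has an effect if the table uses a transactional storage engine.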
