Simple HTML DOM and MySQL insert problem

Date: 2012-12-24 03:30:14

Tags: php simple-html-dom

I am a programmer trying to scrape data from this source.

Here is the specific part I want to capture:

<ul class="ingredient-wrap">

            <li id="liIngredient" data-ingredientid="3914" data-grams="907.2">
                <label>
                    <span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl01$cbxIngredient" /></span>
                    <p class="fl-ing" itemprop="ingredients">
                        <span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
                        <span id="lblIngName" class="ingredient-name">ground beef chuck</span>

                    </p>
                </label>
            </li>

            <li id="liIngredient" data-ingredientid="5838" data-grams="454">
                <label>
                    <span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl02$cbxIngredient" /></span>
                    <p class="fl-ing" itemprop="ingredients">
                        <span id="lblIngAmount" class="ingredient-amount">1 pound</span>
                        <span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>

                    </p>
                </label>
            </li>

            <li id="liIngredient" data-ingredientid="10429" data-grams="1278">
                <label>
                    <span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl03$cbxIngredient" /></span>
                    <p class="fl-ing" itemprop="ingredients">
                        <span id="lblIngAmount" class="ingredient-amount">3 (15 ounce) cans</span>
                        <span id="lblIngName" class="ingredient-name">chili beans, drained</span>

                    </p>
                </label>
            </li>

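Each li contains two groups of words, for example "3 (15 ounce) cans" and "chili beans, drained". I am trying to use a foreach loop to grab both groups from each li, combine them, and save the result to the database.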

Here is my code:

foreach($html->find(".ingredient-wrap", 0)->children as $e){
              $ingredients = $e->plaintext;
              echo trim($ingredients);
              $hostname = 'localhost';
              $username = '********';
              $password = '*******';
              $conn = new PDO("mysql:host=$hostname;dbname=*********", $username, $password);
              $sql = ("INSERT INTO ingredients (recipe_id, ingredientname) VALUES (?, ?)");
              $q = $conn->prepare($sql);
              $q->execute(array($recipe_id,$ingredients));
          }

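The problem is that after the insert, the value of every ingredient name in the database is ... . Even if you print it with echo $ingredients."<br/>", you see a list of run-together words followed by trailing whitespace.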

Thanks for your help! If you have any questions or need more clarification, I am here to reply.

2 answers:

Answer 0: (score: 1)

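Getting a "list" is normal. You are using ->plaintext to fetch the ingredients. That is basically a strip_tags() on the HTML containing each ingredient, so only the bare text is left, including the whitespace that sat between the tags. You should loop over each ingredient tag separately instead.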
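Looping over each ingredient tag separately could look like the following sketch. To stay self-contained it uses PHP's built-in DOMDocument and DOMXPath rather than Simple HTML DOM; with Simple HTML DOM the equivalent lookup inside the loop would be along the lines of $li->find('.ingredient-amount', 0)->plaintext. The helper name extractIngredients and the shortened sample markup are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns one "amount name" string per <li>.
function extractIngredients(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Visit every <li> of the ingredient list on its own.
    foreach ($xpath->query('//ul[contains(@class, "ingredient-wrap")]/li') as $li) {
        $amount = $xpath->query('.//span[contains(@class, "ingredient-amount")]', $li)->item(0);
        $name   = $xpath->query('.//span[contains(@class, "ingredient-name")]', $li)->item(0);

        // Skip items that do not have both parts.
        if ($amount === null || $name === null) {
            continue;
        }

        // Trim each part before joining so no stray whitespace is stored.
        $rows[] = trim($amount->textContent) . ' ' . trim($name->textContent);
    }

    return $rows;
}

// Shortened version of the markup from the question (assumed sample).
$html = <<<'HTML'
<ul class="ingredient-wrap">
  <li><p class="fl-ing">
    <span class="ingredient-amount">2 pounds</span>
    <span class="ingredient-name">ground beef chuck</span>
  </p></li>
  <li><p class="fl-ing">
    <span class="ingredient-amount">1 pound</span>
    <span class="ingredient-name">bulk Italian sausage</span>
  </p></li>
  <li><p class="fl-ing">
    <span class="ingredient-amount">3 (15 ounce) cans</span>
    <span class="ingredient-name">chili beans, drained</span>
  </p></li>
</ul>
HTML;

$ingredients = extractIngredients($html);
print_r($ingredients);
```

Each array element can then be passed to the prepared INSERT from the question, one execute() call per ingredient.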

Answer 1: (score: -1)

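You could try a regular expression. Note that the pattern matches on the span tags, so $ingredients has to hold the raw HTML of the list here, not the ->plaintext value from the question: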

preg_match_all('/<span id="lblIngAmount" class="ingredient-amount">(.*)<\/span>\s+<span id="lblIngName" class="ingredient-name">(.*)<\/span>/', $ingredients, $matches, PREG_PATTERN_ORDER);

This returns:

Array
(
[0] => Array
    (
        [0] => <span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
                    <span id="lblIngName" class="ingredient-name">ground beef chuck</span>
        [1] => <span id="lblIngAmount" class="ingredient-amount">1 pound</span>
                    <span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>
        [2] => <span id="lblIngAmount" class="ingredient-amount">3 (15 ounce) cans</span>
                    <span id="lblIngName" class="ingredient-name">chili beans, drained</span>
    )

[1] => Array
    (
        [0] => 2 pounds
        [1] => 1 pound
        [2] => 3 (15 ounce) cans
    )

[2] => Array
    (
        [0] => ground beef chuck
        [1] => bulk Italian sausage
        [2] => chili beans, drained
    )

)

So:

echo $matches[2][0].": ".$matches[1][0];

gives:

ground beef chuck: 2 pounds
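To get from the matches to database rows, one option is to request PREG_SET_ORDER, so that each match carries its own amount and name, and to prepare the INSERT once outside the loop. This is only a sketch: buildIngredientRows and saveIngredients are hypothetical helpers, the table and column names are taken from the question, and the pattern is a slightly loosened, non-greedy variant of the one above.

```php
<?php
// Hypothetical helper: turns raw ingredient HTML into "amount name" strings.
function buildIngredientRows(string $html): array
{
    $pattern = '/<span[^>]*class="ingredient-amount"[^>]*>(.*?)<\/span>\s*'
             . '<span[^>]*class="ingredient-name"[^>]*>(.*?)<\/span>/';

    // PREG_SET_ORDER groups the captures per match: $m[1] = amount, $m[2] = name.
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $rows = [];
    foreach ($matches as $m) {
        $rows[] = trim($m[1]) . ' ' . trim($m[2]);
    }
    return $rows;
}

// Hypothetical helper: prepares the statement once, then executes it per row.
function saveIngredients(PDO $conn, int $recipeId, array $rows): void
{
    $stmt = $conn->prepare(
        'INSERT INTO ingredients (recipe_id, ingredientname) VALUES (?, ?)'
    );
    foreach ($rows as $row) {
        $stmt->execute([$recipeId, $row]);
    }
}

// Sample input (assumed): two of the span pairs from the question.
$html = '<span id="lblIngAmount" class="ingredient-amount">2 pounds</span>' . "\n"
      . '    <span id="lblIngName" class="ingredient-name">ground beef chuck</span>' . "\n"
      . '<span id="lblIngAmount" class="ingredient-amount">1 pound</span>' . "\n"
      . '    <span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>';

$rows = buildIngredientRows($html);
print_r($rows);

// With an existing PDO connection $conn and a known $recipe_id:
// saveIngredients($conn, $recipe_id, $rows);
```

Creating the PDO connection and preparing the statement once, rather than on every loop iteration as in the question, avoids reconnecting to MySQL for each ingredient.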