I'm a programmer trying to capture data from this source.
Here is the specific part I want to capture:
<ul class="ingredient-wrap">
<li id="liIngredient" data-ingredientid="3914" data-grams="907.2">
<label>
<span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl01$cbxIngredient" /></span>
<p class="fl-ing" itemprop="ingredients">
<span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
<span id="lblIngName" class="ingredient-name">ground beef chuck</span>
</p>
</label>
</li>
<li id="liIngredient" data-ingredientid="5838" data-grams="454">
<label>
<span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl02$cbxIngredient" /></span>
<p class="fl-ing" itemprop="ingredients">
<span id="lblIngAmount" class="ingredient-amount">1 pound</span>
<span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>
</p>
</label>
</li>
<li id="liIngredient" data-ingredientid="10429" data-grams="1278">
<label>
<span class="checkbox-formatted"><input id="cbxIngredient" type="checkbox" name="ctl00$CenterColumnPlaceHolder$recipeTest$recipe$ingredients$rptIngredientsCol1$ctl03$cbxIngredient" /></span>
<p class="fl-ing" itemprop="ingredients">
<span id="lblIngAmount" class="ingredient-amount">3 (15 ounce) cans</span>
<span id="lblIngName" class="ingredient-name">chili beans, drained</span>
</p>
</label>
</li>
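Each li contains two groups of words, for example "3 (15 ounce) cans" and "chili beans, drained". I'm trying to use a foreach loop to grab both groups of words from each li, then combine them and save them to a database.
Here is my code: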
foreach($html->find(".ingredient-wrap", 0)->children as $e){
$ingredients = $e->plaintext;
echo trim($ingredients);
$hostname = 'localhost';
$username = '********';
$password = '*******';
$conn = new PDO("mysql:host=$hostname;dbname=*********", $username, $password);
$sql = ("INSERT INTO ingredients (recipe_id, ingredientname) VALUES (?, ?)");
$q = $conn->prepare($sql);
$q->execute(array($recipe_id,$ingredients));
}
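The problem is that after inserting into the database, the value of every ingredientname is ... , even though if you run echo $ingredients."<br/>"
you do see a list of the combined words, followed by trailing whitespace.
Thanks for your help! If you have any questions or need more clarification, I'm here to reply!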
Answer 0: (score: 1)
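Getting a "list" is normal. You are using ->plaintext
to fetch the ingredients. That is basically running strip_tags() on the HTML that contains the ingredients,
which leaves only the bare text. You should loop over each ingredient tag separately.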
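A minimal sketch of looping over each ingredient tag separately. To keep it runnable without any third-party parser, it swaps in PHP's built-in DOMDocument and DOMXPath rather than the library used in the question. The function name extractIngredients and the trimmed sample markup are assumptions made for illustration only.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension, not the parser from the question.

/**
 * Return one "amount name" string per <li> inside ul.ingredient-wrap.
 * extractIngredients is a hypothetical helper name.
 */
function extractIngredients(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them;
    // real-world pages often contain markup libxml complains about.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $result = [];
    foreach ($xpath->query('//ul[@class="ingredient-wrap"]/li') as $li) {
        // Look up each span relative to the current <li>.
        $amount = $xpath->query('.//span[@class="ingredient-amount"]', $li)->item(0);
        $name   = $xpath->query('.//span[@class="ingredient-name"]', $li)->item(0);
        if ($amount === null || $name === null) {
            continue; // skip items that lack either span
        }
        $result[] = trim($amount->textContent) . ' ' . trim($name->textContent);
    }
    return $result;
}

// Trimmed copy of the markup from the question (ids and checkboxes omitted).
$html = <<<HTML
<ul class="ingredient-wrap">
<li><p class="fl-ing">
<span class="ingredient-amount">2 pounds</span>
<span class="ingredient-name">ground beef chuck</span>
</p></li>
<li><p class="fl-ing">
<span class="ingredient-amount">1 pound</span>
<span class="ingredient-name">bulk Italian sausage</span>
</p></li>
<li><p class="fl-ing">
<span class="ingredient-amount">3 (15 ounce) cans</span>
<span class="ingredient-name">chili beans, drained</span>
</p></li>
</ul>
HTML;

$ingredients = extractIngredients($html);
print_r($ingredients);
```

Each element of $ingredients is already trimmed and combined, so it could be passed as the second parameter of the prepared INSERT from the question.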
Answer 1: (score: -1)
You could try a regular expression...
preg_match_all('/<span id="lblIngAmount" class="ingredient-amount">(.*)<\/span>\s+<span id="lblIngName" class="ingredient-name">(.*)<\/span>/', $ingredients, $matches, PREG_PATTERN_ORDER);
Returns:
Array
(
[0] => Array
(
[0] => <span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
<span id="lblIngName" class="ingredient-name">ground beef chuck</span>
[1] => <span id="lblIngAmount" class="ingredient-amount">1 pound</span>
<span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>
[2] => <span id="lblIngAmount" class="ingredient-amount">3 (15 ounce) cans</span>
<span id="lblIngName" class="ingredient-name">chili beans, drained</span>
)
[1] => Array
(
[0] => 2 pounds
[1] => 1 pound
[2] => 3 (15 ounce) cans
)
[2] => Array
(
[0] => ground beef chuck
[1] => bulk Italian sausage
[2] => chili beans, drained
)
)
So:
echo $matches[2][0].": ".$matches[1][0];
will give:
ground beef chuck: 2 pounds
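As a rough sketch of how the matches could be combined per ingredient: the variant below is not the code from this answer. It assumes the input is the raw markup with tags intact (plain text would never match), switches the groups to non-greedy (.*?), and uses PREG_SET_ORDER so that each element of $matches holds one ingredient. The variable names $html and $combined are illustrative.

```php
<?php
// $html stands in for the raw markup of the list, with the tags still present.
$html = <<<HTML
<span id="lblIngAmount" class="ingredient-amount">2 pounds</span>
<span id="lblIngName" class="ingredient-name">ground beef chuck</span>
<span id="lblIngAmount" class="ingredient-amount">1 pound</span>
<span id="lblIngName" class="ingredient-name">bulk Italian sausage</span>
HTML;

// Same pattern as above, but with non-greedy capture groups.
$pattern = '/<span id="lblIngAmount" class="ingredient-amount">(.*?)<\/span>\s+'
         . '<span id="lblIngName" class="ingredient-name">(.*?)<\/span>/';

// PREG_SET_ORDER groups the results by match instead of by capture group.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$combined = [];
foreach ($matches as $m) {
    // $m[1] is the amount and $m[2] is the name of one ingredient.
    $combined[] = trim($m[1]) . ' ' . trim($m[2]);
}
print_r($combined);
```

A pattern like this depends on the exact attribute order and spacing of the page, so it would stop matching if the markup changed.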