BeautifulSoup makes HTML parsing in Python very convenient, but I can't find a clean way to get a cell's value directly using .string or .text.

    from bs4 import BeautifulSoup

    tr = """
    <table>
    <tr><td>text1</td></tr>
    <tr><td>text2<div>abc</div></td></tr>
    </table>
    """
    table = BeautifulSoup(tr, "html.parser")
    for row in table.find_all("tr"):
        td = row.find_all("td")
        print(td[0].text)    # all text in the cell, including nested tags
        print(td[0].string)  # None when the cell has more than one child

I want to skip the extra inner tags. Result:

    text1
    text1
    text2abc
    None

How can I get this result instead?

    text1
    text2
Answer 0 (score: 2)
You can simply use the .find() function with its text and recursive arguments set.

    for row in table.find_all("tr"):
        # First text node that is a direct child of the cell
        td1 = row.td.find(text=True, recursive=False)
        print(str(td1))

Your output will be:

    text1
    text2

This works regardless of where the div tag is positioned. See the example below.

    >>> tr = """
    ... <table>
    ... <tr><td>text1</td></tr>
    ... <tr><td>text2<div>abc</div></td></tr>
    ... <tr><td><div>abc</div>text3</td></tr>
    ... </table>
    ... """
    >>> table = BeautifulSoup(tr, "html.parser")
    >>> for row in table.find_all("tr"):
    ...     td1 = row.td.find(text=True, recursive=False)
    ...     print(str(td1))
    ...
    text1
    text2
    text3

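
One case the approach above does not cover is a cell whose own text is split around a nested tag, since .find() returns only the first matching text node. A minimal sketch of one way to handle that, assuming Beautiful Soup 4.4.0 or later (where the text argument is also available under the name string); the helper name direct_text and the fourth table row are made up for illustration:

```python
from bs4 import BeautifulSoup

html = """
<table>
<tr><td>text1</td></tr>
<tr><td>text2<div>abc</div></td></tr>
<tr><td><div>abc</div>text3</td></tr>
<tr><td>text4<div>abc</div>more</td></tr>
</table>
"""


def direct_text(tag):
    """Join the text nodes that are direct children of `tag`.

    Hypothetical helper: text inside nested tags is ignored because
    recursive=False restricts the search to direct children.
    """
    parts = tag.find_all(string=True, recursive=False)
    return " ".join(part.strip() for part in parts if part.strip())


soup = BeautifulSoup(html, "html.parser")
cells = [direct_text(row.td) for row in soup.find_all("tr")]
print(cells)
```

Joining with a space is an arbitrary choice here; use an empty separator if the pieces should be concatenated directly.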
Answer 1 (score: 1)
You can try this:

    for row in table.find_all("tr"):
        td = row.find_all("td")
        t = td[0]
        print(t.contents[0])  # first child node of the cell

But only if