Following up on "How can I strip comment tags from HTML using BeautifulSoup?", I am trying to remove the comments from the following markup:
>>> h
<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>
My code:
comments = h.findAll(text=lambda text:isinstance(text, Comment))
[comment.extract() for comment in comments]
print h
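But the search for comments returns nothing. I want to extract two values from the tag above: "52 Week High/Low:" and "₹ 394.00 / ₹ 252.10".
I also tried removing the comment tags from the entire HTML using
soup = BeautifulSoup(html)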
comments = soup.findAll(text=lambda text:isinstance(text, Comment))
[comment.extract() for comment in comments]
print soup
But the comments are still there. Any suggestions?
Answer 0: (score: 1)
Are you using Python2.7 and BeautifulSoup4? If not the latter, I would install BeautifulSoup4.
pip install beautifulsoup4
The following script works for me. I just copied the markup from the question above, pasted it in, and ran it.
from bs4 import BeautifulSoup, Comment
html = """<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>"""
soup = BeautifulSoup(html)
comments = soup.findAll(text=lambda text:isinstance(text, Comment))
# nit: It isn't good practice to use a list comprehension only for its
# side-effects. (Wastes space constructing an unused list)
for comment in comments:
    comment.extract()
print soup
Note: It is good that you posted the print statement; otherwise I would not have known this was Python 2. Posting the Python version also helps.
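For readers on Python 3, here is a minimal sketch of the same approach that also pulls out the two values the question asks for. It assumes beautifulsoup4 is installed. The explicit "html.parser" argument and the variable names label and value are my own choices, not from the question.

```python
from bs4 import BeautifulSoup, Comment

html = """<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>"""

# Naming the parser explicitly is an assumption made here to get
# consistent behaviour across machines; the original script omits it.
soup = BeautifulSoup(html, "html.parser")

# Remove every comment node from the tree with a plain loop.
for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
    comment.extract()

h4 = soup.find("h4")

# The first remaining string inside <h4> is the label.
label = str(h4.find(string=True)).strip()

# The remaining text nodes inside <b> join into the price range.
value = h4.b.get_text().strip()

print(label)  # 52 Week High/Low:
print(value)  # ₹ 394.00 / ₹ 252.10
```

If the comments still survive in your own run, one thing worth checking is that Comment is imported from bs4, the same package that built the soup, since isinstance only matches that package's Comment class.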