Question

I have some code that is supposed to remove the text inside the head tag, given the HTML of a website:

    for link in soup.findAll('head'):
        link.replaceWith("")

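I am trying to replace the entire content with "", but this does not work. How can I completely remove all the text between the head tags from the soup?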

Answer 1

Try this:

    for head in soup.findAll('head'):
        head.extract()
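For anyone who wants to try this end to end, here is a minimal sketch. It assumes the `bs4` package with the built-in `html.parser`, and the sample HTML is made up for illustration. It also shows two calls the answer above does not mention: `decompose()`, which removes a tag and discards it, and `clear()`, which empties a tag but keeps the tag itself. Which one fits depends on whether you want an empty head left in the output.

```python
from bs4 import BeautifulSoup

# Made-up sample page, for illustration only.
html = """<html><head><title>Demo</title><script>var x = 1;</script></head>
<body><p>Hello</p></body></html>"""

# Option 1: remove the <head> tag and everything inside it.
soup = BeautifulSoup(html, "html.parser")
for head in soup.find_all("head"):
    head.decompose()  # like extract(), but the removed tag is discarded
without_head = str(soup)

# Option 2: keep an empty <head></head> and drop only its contents.
soup2 = BeautifulSoup(html, "html.parser")
for head in soup2.find_all("head"):
    head.clear()
empty_head = str(soup2)

print(without_head)
print(empty_head)
```

Use `extract()` instead of `decompose()` if you still need the removed tag afterwards, since `extract()` returns it.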

Answer 2

You need to use """ (three quotes), but you appear to be using only two.

Example:

"""
This block
is commented out
"""

Happy coding!

Edit: That is not what the user asked, apologies.

I have no experience with Beautiful Soup, but I found a snippet on SO that may be useful to you (source):

    from bs4 import BeautifulSoup

    # `source` is assumed to hold the page's HTML as a string
    soup = BeautifulSoup(source.lower(), 'html.parser')
    # Tag name to remove: 'a' for links, or 'head'
    to_extract = soup.findAll('a')
    for item in to_extract:
        item.extract()

From the looks of it, this would remove all the links on your page.
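The same idea can be wrapped in a small function so the tag name is a parameter. This is only a sketch: `remove_tags` is a hypothetical helper name, the sample markup is invented, and it assumes `bs4` with the built-in `html.parser`.

```python
from bs4 import BeautifulSoup

def remove_tags(html, tag_names):
    """Return html with every tag listed in tag_names removed (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all accepts a list of tag names and returns all matches as a list
    for tag in soup.find_all(tag_names):
        tag.extract()  # detach the tag and its contents from the tree
    return str(soup)

# Invented sample markup with two links.
page = '<p>See <a href="https://example.com">this link</a> and <a href="/x">that</a>.</p>'
cleaned = remove_tags(page, ["a"])
print(cleaned)
```

Calling `remove_tags(page, ["head"])` on a full page would drop the head section in the same way.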

Sorry if this does not help you!

Replace head contents with an empty string in Beautiful Soup
