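How do I format messy HTML source code? (Python)

Time: 2018-03-29 07:13:44

Tags: python beautifulsoup

I am using BeautifulSoup in Python to work with HTML source code. The HTML I get is very messy. How can I make the HTML source look nice?

This is the website

This is part of the HTML source I get: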

    property="article:tag" content="ally" /><meta
property="article:tag" content="harvey weinstein" /><meta
property="article:tag" content="pratiksha parulekar" /><meta
property="article:tag" content="rape culture" /><meta
property="article:section" content="No Photo" /><meta
property="article:published_time" content="2017-10-25T22:28:46-05:00" /><meta
property="article:modified_time" content="2017-10-25T22:44:29-05:00" /><meta
property="og:updated_time" content="2017-10-25T22:44:29-05:00" /><meta
name="twitter:card" content="summary" /><meta
name="twitter:description" content="For men, professing disgust at sexual assault allegations is not sufficient; male allies must also hold friends who harass women accountable." /><meta
name="twitter:title" content="To combat sexual harassment, men must hold peers accountable &bull; The Tulane Hullabaloo" /><link
rel='dns-prefetch' href='//cdn.jsdelivr.net' /><link
rel='dns-prefetch' href='//maxcdn.bootstrapcdn.com' /><link
rel='dns-prefetch' href='//fonts.googleapis.com' /><link
rel='dns-prefetch' href='//s.w.org' /><link

What should I do?

2 answers:

Answer 0: (score: 2)

You are probably looking for prettify(), described in the doc:

print(yoursoup.prettify())
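
As a minimal sketch of the call above in context, the snippet below parses a short fragment modeled on the output in the question and prints the re-indented result. The variable names (messy, soup, pretty), the shortened sample markup, and the choice of the built-in "html.parser" backend are assumptions made for illustration; the question does not say which parser was used.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: a shortened fragment in the same style as the
# markup in the question, with tags broken across lines.
messy = (
    '<head><meta property="article:tag" content="ally" /><meta\n'
    'property="article:section" content="No Photo" /><link\n'
    "rel='dns-prefetch' href='//s.w.org' /></head>"
)

# "html.parser" ships with Python; other parsers (lxml, html5lib) may
# produce slightly different trees.
soup = BeautifulSoup(messy, "html.parser")

# prettify() returns a string with one tag per line, indented by depth.
pretty = soup.prettify()
print(pretty)
```

Note that prettify() changes whitespace, so it is best suited for reading the markup rather than for producing output that must match the original byte for byte.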

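Answer 1: (score: 1)

If you want to "prettify" the HTML with Beautiful Soup, you can do something like what is shown here.

Keep in mind that the import has changed since that answer was written, and is now: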

from bs4 import BeautifulSoup

Some fields may have changed since then; you can find more examples in the documentation.