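How do I get specific text from an HTML tag in Python?

Date: 2019-04-17 15:55:06

Tags: python

I am building an MD5 decrypter in Python that works through an API, but the API sends its response back as HTML. How can I get the text between the <font color=green> tags?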

{"error":0,"msg":"<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"}

3 answers:

Answer 0 (score: 2)

I suggest using an HTML parser such as Beautiful Soup:

>>> from bs4 import BeautifulSoup
>>> d = {"error":0,"msg":"<font color=blue><b>Live</b></font><font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"}
>>> soup = BeautifulSoup(d['msg'], 'html.parser')
>>> soup.font.attrs
{'color': 'blue'}

This returns a dictionary of the first font tag's attributes as name/value pairs.

Update

To get the text "Jumpman#23":

>>> soup.findAll("font", {"color": "green"})[0].contents[0]
'Jumpman#23'
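
The steps above can be put together as a minimal sketch, assuming the API response arrives as a raw JSON string. The names raw_response and green_text are hypothetical, introduced only for illustration, and the sample body is copied from the question.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical raw response body, copied from the question.
raw_response = (
    '{"error":0,"msg":"<font color=blue><b>Live</b></font>'
    '<font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3"}'
)


def green_text(response_body):
    """Return the text of the first <font color=green> tag, or None."""
    data = json.loads(response_body)                  # JSON text -> dict
    soup = BeautifulSoup(data["msg"], "html.parser")  # parse the HTML fragment
    tag = soup.find("font", attrs={"color": "green"})
    return tag.get_text() if tag is not None else None


print(green_text(raw_response))  # Jumpman#23
```

Returning None when the tag is missing avoids the IndexError that indexing an empty result list with [0] would raise.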

Answer 1 (score: 0)

If you know the target text always comes directly after the opening tag shown below, you can use simple string operations:

<font color=green>
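
The answer's original code is not preserved here, so the following is only a sketch of what such string handling might look like, using str.partition from the standard library. The function name text_after_tag and its parameters are hypothetical.

```python
def text_after_tag(html, open_tag="<font color=green>", close_tag="</font>"):
    """Return the text between open_tag and the next close_tag, or None."""
    # Split once at the opening tag; `found` is empty if the tag is absent.
    _, found, rest = html.partition(open_tag)
    if not found:
        return None
    # Split the remainder once at the closing tag.
    text, found_close, _ = rest.partition(close_tag)
    return text if found_close else None


msg = ("<font color=blue><b>Live</b></font>"
       "<font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3")
print(text_after_tag(msg))  # Jumpman#23
```

This only works while the API keeps emitting the opening tag in exactly this form; any change in attribute order, quoting, or spacing would make the match fail, which is where a real parser is more robust.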

Answer 2 (score: 0)

You can use a CSS selector with the adjacent sibling combinator on the font tags.
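
The adjacent sibling combinator mentioned in this answer can be sketched as follows, assuming Beautiful Soup (version 4.7 or later, which provides CSS selectors through soupsieve) is the library in use. The selector "font + font" matches a font element that immediately follows another font element; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Sample message, copied from the question.
msg = ("<font color=blue><b>Live</b></font>"
       "<font color=green>Jumpman#23</font> | [MD5 Decrypt] .S/C0D3")

soup = BeautifulSoup(msg, "html.parser")

# "font + font": a <font> whose previous sibling element is also a <font>.
tag = soup.select_one("font + font")
result = tag.get_text() if tag is not None else None
print(result)  # Jumpman#23
```

If the position of the tag might change, an attribute selector such as "font[color=green]" is an alternative that targets the colour instead of the position.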