Question

I have an HTML tag like the one below:


<a href="http://cwe.mitre.org/data/definitions/134.html">CWE-134</a>


I want to keep only the href part of this tag.

Please suggest any steps for doing this.

Answer 1

Extract the value:

a_tag['href']

Save it to a file:

with open('output.txt', 'w') as f:
    f.write(a_tag['href'])

You can write it to a file such as TXT or CSV, or store it in a database.
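
The snippets in this answer take an existing a_tag for granted. A minimal end-to-end sketch, assuming the tag is available as a string and that Beautiful Soup (the bs4 package) is installed, might look like this. The variable names html and href are illustrative, not from the original answer:

```python
from bs4 import BeautifulSoup

# The tag from the question, held as a plain string (assumption).
html = '<a href="http://cwe.mitre.org/data/definitions/134.html">CWE-134</a>'

# Parse with the parser that ships with Python's standard library.
soup = BeautifulSoup(html, "html.parser")

# find() returns the first <a> tag, or None if the document has none.
a_tag = soup.find("a")

# Guard against a missing tag or a missing href attribute.
href = a_tag["href"] if a_tag is not None and a_tag.has_attr("href") else ""

# Save the extracted URL to a text file.
with open("output.txt", "w", encoding="utf-8") as f:
    f.write(href)
```

Indexing with a_tag["href"] raises KeyError when the attribute is absent, which is why the sketch checks has_attr first; a_tag.get("href") is another way to avoid the exception.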

Answer 2

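import re

# soup is assumed to be an existing BeautifulSoup object
for tag in soup.find_all('a'):
    print(tag)
    # Take the text after the last '="' and before the closing '">'
    text = re.split(r'">', re.split(r'="', str(tag))[-1])[0]
    print(text)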