I am using Python to extract data from a web page. The page contains many anchor tags with href attributes.
For example:
<a class="identifier" href="/ICD10CM/Codes/A00-B99/A15-A19/A18-/A18.17">A18.17</a>
I can extract these tags with:
for x in soup.find_all('a'):
    print(x)
However, I only want to extract the name of each link (A18.17 in the example). How can I do that?
Thanks.
Answer 0 (score: 2)
A similar question has already been answered here, and the documentation should help.
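As a rough illustration of what the question asks for, here is a minimal sketch that assumes soup is a BeautifulSoup object (from the bs4 package) built from the page's HTML. The html string below is just the sample tag from the question, and the variable names names and hrefs are chosen for illustration only. The text inside each anchor tag can be read with get_text(), and the href attribute with get():

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; a real page would be downloaded first.
html = '<a class="identifier" href="/ICD10CM/Codes/A00-B99/A15-A19/A18-/A18.17">A18.17</a>'

# "html.parser" is the parser bundled with Python's standard library.
soup = BeautifulSoup(html, "html.parser")

# get_text() returns the text between the opening and closing tags;
# strip=True trims surrounding whitespace.
names = [a.get_text(strip=True) for a in soup.find_all("a")]

# get("href") returns the attribute value, or None if the tag has no href.
hrefs = [a.get("href") for a in soup.find_all("a")]

print(names)
print(hrefs)
```

If only tags like the one in the example are wanted, passing class_="identifier" to find_all should narrow the results to anchors with that class.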