Question

I am using Python to extract data from a web page. The page has many anchor tags with href attributes.

For example:

<a class="identifier" href="/ICD10CM/Codes/A00-B99/A15-A19/A18-/A18.17">A18.17</a>

I can extract these specific tags using

for x in soup.find_all('a'):
    print(x)

However, I only want to extract the name of the link (A18.17 in the example). How can I do this?

Thanks.

Answer 1


A similar question has already been answered here, and the documentation should help.
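
As a minimal sketch of one way to get only the link name, assuming Beautiful Soup 4 (the `bs4` package) is what the `soup` object in the question comes from: each tag returned by `find_all` has a `get_text()` method that returns the text between the opening and closing tags. The helper name `link_names` is hypothetical and used only for illustration; the sample HTML is the anchor tag from the question.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question.
html = '<a class="identifier" href="/ICD10CM/Codes/A00-B99/A15-A19/A18-/A18.17">A18.17</a>'

# "html.parser" is Python's built-in parser, so no extra parser package is needed.
soup = BeautifulSoup(html, "html.parser")


def link_names(soup):
    """Return the visible text of every <a> tag (hypothetical helper)."""
    return [a.get_text(strip=True) for a in soup.find_all("a")]


names = link_names(soup)

# The href values themselves are available by indexing the tag like a dict;
# href=True skips anchors that have no href attribute.
hrefs = [a["href"] for a in soup.find_all("a", href=True)]

for name in names:
    print(name)
```

For a tag whose only child is a text node, `a.string` should give the same value; `get_text()` is used here because it also works when the link text is wrapped in nested tags.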

Python - Extracting the name of an href hyperlink
