Question

I'm trying to scrape some statistics, including match results and team names, from Gosugamers with this code:

from bs4 import BeautifulSoup
import requests

for i in range(411):
    try:
        i += 1
        print(i)
        url = 'http://www.gosugamers.net/counterstrike/gosubet?r-page={}'.format(i)
        r = requests.get(url)
        web = BeautifulSoup(r.content,"html.parser")
        table = web.findAll("table", attrs={"class":"simple matches"})
        table = table[1]
        links = table('a')
        for link in links:
            if 'matches' in link.get('href', None):
                if len(link.get('href', None)) != 0:
                    print(link.get('href', None))

    except:
        pass

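But link.get('href', None) seems to give me a string containing all the links on a single page, and I don't know how to convert it into a list of all the links. I'd be glad if someone could help me, thanks!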

Answer 1

To me, it looks like link.get('href', None) actually returns a single link. The documentation for the get method says:

get(self, key, default=None) method of bs4.element.Tag instance

Returns the value of the 'key' attribute for the tag, or
the value given for 'default' if it doesn't have that
attribute.
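As a small illustration of that documented behavior, here is a minimal sketch run against an inline HTML snippet. The markup and the variable names are made up for this example and are not taken from the real Gosugamers page.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; not copied from the real page.
html = (
    '<table><tr>'
    '<td><a href="/counterstrike/matches/1">A vs B</a></td>'
    '<td><a name="anchor">no href here</a></td>'
    '</tr></table>'
)
soup = BeautifulSoup(html, "html.parser")

# Calling the soup like a function is shorthand for find_all().
with_href, without_href = soup("a")

first = with_href.get("href")             # one string per tag, not all links
missing = without_href.get("href")        # None, because the attribute is absent
fallback = without_href.get("href", "")   # the supplied default instead of None

print(first)
print(missing)
```

Each tag yields at most one href value, so a list has to be built by collecting those values one tag at a time.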

So, when you get a link that contains 'matches', you can simply append it to a list.

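from bs4 import BeautifulSoup
import requests

all_links = []

# Result pages are numbered from 1, so iterate over 1..411 directly.
for i in range(1, 412):
    try:
        print(i)
        url = 'http://www.gosugamers.net/counterstrike/gosubet?r-page={}'.format(i)
        r = requests.get(url)
        web = BeautifulSoup(r.content, "html.parser")
        # Use the second table with class "simple matches", as in the question.
        table = web.findAll("table", attrs={"class": "simple matches"})
        table = table[1]
        links = table('a')

        for link in links:
            # get() returns None when the tag has no href attribute.
            href = link.get('href')
            if href is not None and 'matches' in href:
                all_links.append(href)
    except Exception:
        # Skip pages that fail to download or lack the expected table.
        pass

print("Here are all the links: ", all_links)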
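The filtering and accumulation step can also be tried without any network access. The sketch below applies the same condition as the loop above to plain Python lists; the helper name collect_match_links and the sample href values are hypothetical stand-ins for what link.get('href') would return on each page.

```python
def collect_match_links(hrefs_per_page):
    """Flatten per-page href values into one list, keeping only match links."""
    all_links = []
    for hrefs in hrefs_per_page:
        # Skip missing hrefs (None) and anything that is not a match link.
        all_links.extend(h for h in hrefs if h is not None and "matches" in h)
    return all_links


# Hypothetical sample values, one inner list per scraped page.
pages = [
    ["/counterstrike/matches/1", None, "/counterstrike/teams/9"],
    ["/counterstrike/matches/2"],
]
links = collect_match_links(pages)
print(links)
```

Checking for None before the 'in' test matters here: without it, a tag with no href raises a TypeError, and a broad except around the loop would silently drop the remaining links on that page.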
