I'm trying to scrape some statistics from Gosugamers, including match results and team names, with this code:
from bs4 import BeautifulSoup
import requests

for i in range(411):
    try:
        i += 1
        print(i)
        url = 'http://www.gosugamers.net/counterstrike/gosubet?r-page={}'.format(i)
        r = requests.get(url)
        web = BeautifulSoup(r.content,"html.parser")
        table = web.findAll("table", attrs={"class":"simple matches"})
        table = table[1]
        links = table('a')
        for link in links:
            if 'matches' in link.get('href', None):
                if len(link.get('href', None)) != 0:
                    print(link.get('href', None))
    except:
        pass
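However, on a single page link.get('href', None) gives me what looks like a string containing all the links, and I don't know how to turn that into a list of all the links. I'd be grateful if someone could help. Thanks!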
Answer 0 (score: 1)
It seems to me that link.get('href', None) actually returns a single link on each call. The documentation for the get method says:
get(self, key, default=None) method
Returns the value of the 'key' attribute for the tag, or
the value given for 'default' if it doesn't have that
attribute.
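A minimal sketch of that behaviour, assuming Python 3 with bs4 installed. The HTML snippet and the names `html`, `hrefs`, and `match_links` are made up for illustration and are not taken from the real site:

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration; not copied from gosugamers.net.
html = """
<table>
  <tr><td><a href="/counterstrike/matches/1-team-a-vs-team-b">A vs B</a></td></tr>
  <tr><td><a name="anchor-without-href">no href here</a></td></tr>
  <tr><td><a href="/counterstrike/teams/42">Team page</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
anchors = soup.find_all("a")

# get() is called once per tag and yields one value per call:
# the attribute string, or the default (None) when the attribute is absent.
hrefs = [a.get("href") for a in anchors]

# Keeping only the hrefs that contain 'matches' gives an ordinary Python list.
match_links = [h for h in hrefs if h is not None and "matches" in h]
print(match_links)
```

The second anchor has no href, so get() returns None for it. That is why the None check has to come before the `in` test.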
So whenever you get a link that contains 'matches', you can simply append it to a list.
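from bs4 import BeautifulSoup
import requests

all_links = []
# Request pages 1..411, as the loop in the question does.
for i in range(1, 412):
    try:
        print(i)
        url = 'http://www.gosugamers.net/counterstrike/gosubet?r-page={}'.format(i)
        r = requests.get(url)
        web = BeautifulSoup(r.content, "html.parser")
        # Take the second table with class "simple matches", as in the question.
        table = web.findAll("table", attrs={"class": "simple matches"})
        table = table[1]
        links = table('a')
        for link in links:
            # get() returns one href string, or None if the tag has no href.
            href = link.get('href')
            if href is not None and 'matches' in href:
                all_links.append(href)
    except Exception as exc:
        # Skip pages that fail to download or lack the expected table.
        print("Skipping page {}: {}".format(i, exc))

print("Here are all the links:", all_links)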
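If the collected href values turn out to be site-relative paths and some of them repeat (both are assumptions, since the source does not show the actual values), a small post-processing step can make them absolute and unique. The sketch below uses only the Python 3 standard library. `absolute_unique`, `BASE_URL`, and the sample values are hypothetical names and data introduced here for illustration:

```python
from urllib.parse import urljoin

# Site root, taken from the URL requested in the scraping loop.
BASE_URL = "http://www.gosugamers.net"


def absolute_unique(hrefs, base=BASE_URL):
    """Resolve each href against base and drop repeats, keeping first-seen order."""
    resolved = (urljoin(base, h) for h in hrefs)
    # dict.fromkeys keeps insertion order (Python 3.7+), so it removes
    # duplicates without reordering the links.
    return list(dict.fromkeys(resolved))


# Hypothetical sample values, only for illustration.
sample = [
    "/counterstrike/matches/1-a-vs-b",
    "/counterstrike/matches/2-c-vs-d",
    "/counterstrike/matches/1-a-vs-b",
]
print(absolute_unique(sample))
```

In the scraper this would be called as `absolute_unique(all_links)` once the loop has finished. Hrefs that are already absolute pass through urljoin unchanged.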