I'm trying to build a scraper that captures league names and odds. I've been able to grab the odds, but matching the two doesn't seem to work. My code is:
import requests
from bs4 import BeautifulSoup
r = requests.get("http://www.elitebetkenya.com/coupon.php")
soup = BeautifulSoup(r.content)
for i in soup.findAll("tr"):
    tds = i.findAll("td")
    fixture = soup.findAll("tr", { "class" : "fixture" })
    try:
        if len(tds[0].text) != 0 :
            print " Bet-type: %s, Choice: %s, Match code: %s, 1: %s, 0: %s" % \
                (tds[0].text, tds[1].text, tds[2].text,tds[3].text, tds[4].text)
    except:
        pass
Answer 0 (score: 0)
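Here is something to get you started.
The markup itself is not convenient to scrape, but we can rely on the league and fixture rows to tell leagues and fixtures apart, and use the find_next_siblings() method to reach the odds rows of the table.
Complete working example (adjust it to your needs):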
import requests
from bs4 import BeautifulSoup
r = requests.get("http://www.elitebetkenya.com/coupon.php")
soup = BeautifulSoup(r.content, "html.parser")
for league in soup.find_all("tr", class_="league"):
    # get basic league and fixture data
    league_name = league.get_text(strip=True)
    fixture = league.find_next_sibling("tr", class_="fixture")
    home = fixture.find("span", class_="home").get_text(strip=True)
    away = fixture.find("span", class_="away").get_text(strip=True)
    date = fixture.br.find_next_sibling(text=True).strip()
    print(league_name, home, away, date)
    # iterate over odds rows
    for odd in fixture.find_next_siblings("tr"):
        # stop once the next league is met
        if "league" in odd.get("class", []):
            break
        # skipping header rows
        if not odd.td.get_text(strip=True):
            continue
        cells = [td.get_text(strip=True) for td in odd.find_all("td")]
        print(cells)
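If a league holds more than one fixture, a single pass over all rows that remembers the current league and fixture may be easier to reason about than walking siblings. The sketch below is not the code above but an alternative grouping step, and it works on plain (row_classes, cells) pairs so it can be tried without hitting the site. The function name group_rows and the sample rows are hypothetical; only the class names league and fixture come from the page discussed here. With BeautifulSoup, each pair could be built as (tr.get("class", []), [td.get_text(strip=True) for td in tr.find_all("td")]) for every tr.

```python
def group_rows(rows):
    """Group (row_classes, cells) pairs into leagues -> fixtures -> odds rows.

    Assumes league rows carry the class "league", fixture rows the class
    "fixture", and every other row with a non-empty first cell is an odds row.
    """
    leagues = []
    league = None
    fixture = None
    for classes, cells in rows:
        if "league" in classes:
            # a new league starts; odds must wait for its first fixture
            league = {"name": cells[0] if cells else "", "fixtures": []}
            leagues.append(league)
            fixture = None
        elif "fixture" in classes:
            if league is None:
                # fixture seen before any league header: use an unnamed league
                league = {"name": "", "fixtures": []}
                leagues.append(league)
            fixture = {"name": cells[0] if cells else "", "odds": []}
            league["fixtures"].append(fixture)
        elif fixture is not None and cells and cells[0]:
            # odds row; header rows (empty first cell) are skipped
            fixture["odds"].append(cells)
    return leagues


# Hypothetical rows standing in for the parsed table (made-up values).
sample_rows = [
    (["league"], ["Premier League"]),
    (["fixture"], ["Arsenal vs Chelsea"]),
    ([], ["", "1", "X", "2"]),
    ([], ["1X2", "1001", "2.10", "3.20", "3.50"]),
    (["fixture"], ["Everton vs Fulham"]),
    ([], ["1X2", "1002", "1.90", "3.30", "4.00"]),
    (["league"], ["La Liga"]),
    (["fixture"], ["Sevilla vs Valencia"]),
    ([], ["1X2", "2001", "2.40", "3.10", "2.95"]),
]

for lg in group_rows(sample_rows):
    for fx in lg["fixtures"]:
        for odds in fx["odds"]:
            print(lg["name"], fx["name"], odds)
```

Each odds row ends up attached to the fixture and league that precede it, which is the matching the question asks about.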
Answer 1 (score: 0)
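import requests
from bs4 import BeautifulSoup
r = requests.get("http://www.elitebetkenya.com/coupon.php?d")
soup = BeautifulSoup(r.content)
# pretty-printed markup, handy for inspecting the page
docu = soup.prettify()
for i in soup.findAll("tr"):
    tds = i.findAll("td")
    try:
        home = i.find("span", {"class": "home"}).get_text(strip=True)
        away = i.find("span", class_="away").get_text(strip=True)
        string = tds[0].text
        g = [home, away, "vs"]
        # str.replace returns a new string, so assign the result back;
        # a separate loop variable avoids shadowing the row variable i
        for name in g:
            string = string.replace(name, "")
        print(string)
        # print "%s vs %s" %(home,away)
    except (AttributeError, IndexError):
        # row without home/away spans or without cells: not a fixture row
        pass
    try:
        if len(tds[0].text) != 0:
            print(" Type: %s Choice: %s, Match code: %s, 1: %s, 0: %s 2: %s" %
                  (tds[0].text, tds[1].text, tds[2].text, tds[3].text, tds[4].text, tds[5].text))
    except IndexError:
        # row with fewer than six cells: not an odds row
        pass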