Beautiful Soup: scraping a badly built table

Date: 2016-03-15 13:46:19

Tags: python html python-2.7 web-scraping beautifulsoup

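I am trying to build a scraper that captures league names together with their odds. I have been able to grab the odds, but matching them to the league names does not seem to work. My code is: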

import requests
from bs4 import BeautifulSoup

r = requests.get("http://www.elitebetkenya.com/coupon.php")
soup = BeautifulSoup(r.content)

for i in soup.findAll("tr"):
    tds =  i.findAll("td")
    fixture = soup.findAll("tr", { "class" : "fixture" })

    try:
        if len(tds[0].text) != 0:
            print(" Bet-type: %s, Choice: %s, Match code: %s, 1: %s, 0: %s" %
                  (tds[0].text, tds[1].text, tds[2].text, tds[3].text, tds[4].text))
    except:
        pass

2 Answers:

Answer 0: (score: 0)

Here is something to get you started.

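The markup itself is not convenient to scrape, but we can rely on the league and fixture rows to tell one fixture from the next, and use the find_next_siblings() method to get the odds rows of the table.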

Complete working example (adjust it to your needs):

import requests
from bs4 import BeautifulSoup

r = requests.get("http://www.elitebetkenya.com/coupon.php")
soup = BeautifulSoup(r.content, "html.parser")


for league in soup.find_all("tr", class_="league"):
    # get basic league and fixture data
    league_name = league.get_text(strip=True)

    fixture = league.find_next_sibling("tr", class_="fixture")
    home = fixture.find("span", class_="home").get_text(strip=True)
    away = fixture.find("span", class_="away").get_text(strip=True)

    date = fixture.br.find_next_sibling(text=True).strip()

    print(league_name, home, away, date)

    # iterate over odds rows
    for odd in fixture.find_next_siblings("tr"):
        # stop once the next league is met
        if "league" in odd.get("class", []):
            break

        # skipping header rows
        if not odd.td.get_text(strip=True):
            continue

        cells = [td.get_text(strip=True) for td in odd.find_all("td")]
        print(cells)
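
To match league names with their odds, the same sibling walk can collect rows into a dictionary keyed by league name. The following is a minimal offline sketch, not the page's real markup: SAMPLE_HTML, its rows and values, and the helper name group_odds_by_league are all made up for illustration, and only the class names (league, fixture, home, away) are taken from the example above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the class names come from the answer above;
# the rows and values are invented so the sketch runs without network access.
SAMPLE_HTML = """
<table>
  <tr class="league"><td>League A</td></tr>
  <tr class="fixture"><td><span class="home">Home1</span> vs <span class="away">Away1</span></td></tr>
  <tr><td></td><td>1</td><td>0</td></tr>
  <tr><td>Match result</td><td>1.50</td><td>3.20</td></tr>
  <tr class="league"><td>League B</td></tr>
  <tr class="fixture"><td><span class="home">Home2</span> vs <span class="away">Away2</span></td></tr>
  <tr><td>Match result</td><td>2.10</td><td>2.90</td></tr>
</table>
"""


def group_odds_by_league(html):
    """Map each league name to the odds rows that follow it."""
    soup = BeautifulSoup(html, "html.parser")
    grouped = {}
    for league in soup.find_all("tr", class_="league"):
        name = league.get_text(strip=True)
        rows = []
        for row in league.find_next_siblings("tr"):
            classes = row.get("class", [])
            # stop once the next league row is reached
            if "league" in classes:
                break
            # fixture rows carry team names, not odds
            if "fixture" in classes:
                continue
            cells = [td.get_text(strip=True) for td in row.find_all("td")]
            # skip header rows, whose first cell is empty
            if cells and cells[0]:
                rows.append(cells)
        grouped[name] = rows
    return grouped


result = group_odds_by_league(SAMPLE_HTML)
print(result)
```

Keying the rows by league name keeps each odds row attached to the league it belongs to, which is the pairing the question asks about. The real page may nest or order its rows differently, so the skip conditions would need adjusting.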

Answer 1: (score: 0)

import requests
from bs4 import BeautifulSoup

r = requests.get("http://www.elitebetkenya.com/coupon.php?d")
soup = BeautifulSoup(r.content, "html.parser")

docu = soup.prettify()

for row in soup.findAll("tr"):
    tds = row.findAll("td")

    # fixture rows: strip the team names and "vs" from the first cell
    try:
        home = row.find("span", {"class": "home"}).get_text(strip=True)
        away = row.find("span", class_="away").get_text(strip=True)
        string = tds[0].text
        for part in [home, away, "vs"]:
            # str.replace returns a new string, so reassign the result
            string = string.replace(part, "")
        print(string)
        # print("%s vs %s" % (home, away))
    except Exception:
        # not a fixture row
        pass

    # odds rows: print the first six cells
    try:
        if len(tds[0].text) != 0:
            print(" Type: %s Choice: %s, Match code: %s, 1: %s, 0: %s 2: %s" %
                  (tds[0].text, tds[1].text, tds[2].text, tds[3].text, tds[4].text, tds[5].text))
    except Exception:
        # row has fewer than six cells
        pass
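
One detail in this answer is worth spelling out: Python strings are immutable, so str.replace returns a new string and leaves the original untouched. The short sketch below shows the difference; the sample text and the helper name strip_names are hypothetical, chosen only to resemble a fixture cell.

```python
def strip_names(text, names):
    """Return text with every entry of names removed."""
    for name in names:
        # the result of replace must be assigned back to take effect
        text = text.replace(name, "")
    return text.strip()


label = "Home1 vs Away1 Match result"  # made-up cell text

# Calling replace without using its return value changes nothing.
label.replace("vs", "")
untouched = label

cleaned = strip_names(label, ["Home1", "Away1", "vs"])
print(repr(untouched))
print(repr(cleaned))
```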