Web scraping with bs4

Time: 2016-09-02 18:01:04

Tags: python-2.7 web-scraping bs4

import requests
from bs4 import BeautifulSoup

url = "http://bet.hkjc.com/football/index.aspx?lang=en"
r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")

div = soup.find("div", {"class": "footballmaincontent"})
tables = div.find_all("table")
my_table = tables[2]

for row in my_table.find_all('tr'):
    cols = row.find_all('td')

    odds_list = []
    if len(cols) >= 10:
        match_no = (cols[0].text.strip())
        teams = (cols[2].text.strip())
        match_time = (cols[4].text.strip())
        home_odds = (cols[7].text.strip())
        away_odds = (cols[8].text.strip())
        draw_odds = (cols[9].text.strip())

        odds_row = (match_no,teams,match_time,home_odds,away_odds,draw_odds)
        odds_list.append(odds_row)

# Write to csv file
import csv
with open("odds_file.csv", "wb") as file:
    writer = csv.writer(file)
    for row in odds_list:
        writer.writerow(row)

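I tried to write the columns to a csv file by appending them to "odds_list" inside the for loop, but nothing ends up being written to "odds_file".

I know something is wrong with

odds_row = (match_no,teams,match_time,home_odds,away_odds,draw_odds)

but how can I append the list I built to the csv file?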

1 answer:

Answer 0 (score: -1)

You already have my_table, so call find / find_all on my_table to get the <tr> rows and then the <td> cells in each row; you can then read the text from each <td>.

EDIT

from __future__ import print_function  # lets print(..., end='') also run on Python 2.7

import requests
from bs4 import BeautifulSoup

url = "http://bet.hkjc.com/football/index.aspx?lang=en"
r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")

div = soup.find("div", {"class": "footballmaincontent"})
tables = div.find_all("table")
my_table = tables[2]

for row in my_table.find_all('tr'):
    cols = row.find_all('td')
    if len(cols) >= 10:
        print(cols[0].text.strip(),'|',end='')
        print(cols[2].text.strip(),'|',end='')
        print(cols[4].text.strip(),'|',end='')
        print(cols[7].text.strip(),'|',end='')
        print(cols[8].text.strip(),'|',end='')
        print(cols[9].text.strip(),'|',end='')
        print()
        print('-'*40)

Result

Match No. |Teams(Home vs Away) |Expected StopSelling Time |Home/Away/Draw | | |
----------------------------------------
FRI 9 |Romania U21 vs Luxembourg U21 |03/09 01:30 |Accept In Play Betting Only | | |
----------------------------------------
FRI 13 |St. Vincent and Grenadines vs USA |03/09 03:30 |35.00 |13.00 |1.02 |
----------------------------------------
FRI 14 |Honduras vs Canada |03/09 05:06 |1.45 |3.55 |6.50 |
----------------------------------------
FRI 15 |Trinidad and Tobago vs Guatemala |03/09 07:00 |1.67 |3.20 |4.70 |
----------------------------------------
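
The code above prints the rows but does not write the csv file. In the question's code, odds_list = [] sits inside the for loop, so the list is reset on every row and whatever was appended is thrown away before the file is written. Below is a minimal sketch of one way to collect and write the rows, not a definitive fix. It assumes Python 3, where the file is opened with "w" and newline=""; on Python 2.7 the question's "wb" mode is the one to use. The names sample_rows, collect_odds and write_odds are hypothetical. sample_rows stands in for the cell texts that row.find_all('td') would give for each row, with values copied from the result printed above and the unused columns left empty.

```python
import csv

# Stand-in for the per-row cell texts, i.e. [td.text.strip() for td in cols].
# Values come from the printed result above; unused columns are empty strings.
sample_rows = [
    ["FRI 14", "", "Honduras vs Canada", "", "03/09 05:06", "", "", "1.45", "3.55", "6.50"],
    ["FRI 15", "", "Trinidad and Tobago vs Guatemala", "", "03/09 07:00", "", "", "1.67", "3.20", "4.70"],
    ["too", "short"],  # fewer than 10 cells, so this row is skipped
]


def collect_odds(rows):
    """Pick columns 0, 2, 4, 7, 8, 9 from every row that has at least 10 cells."""
    odds_list = []  # created once, before the loop, so earlier rows are kept
    for cols in rows:
        if len(cols) >= 10:
            odds_list.append((cols[0], cols[2], cols[4], cols[7], cols[8], cols[9]))
    return odds_list


def write_odds(path, odds_list):
    """Write one csv line per collected row (Python 3 file mode)."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(odds_list)


odds_list = collect_odds(sample_rows)
write_odds("odds_file.csv", odds_list)
```

In the scraping script, the same change amounts to moving odds_list = [] above the for row in my_table.find_all('tr'): line and leaving the rest of the loop as it is.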