Python code to fetch data from a website

Time: 2013-09-06 07:14:03

Tags: python beautifulsoup

I am using the following code, but it prints []. Please help me find the error.

from urllib import urlopen
optionsUrl = 'http://www.moneycontrol.com/commodity/'
optionsPage = urlopen(optionsUrl)
from bs4 import BeautifulSoup
soup = BeautifulSoup(optionsPage)
print soup.findAll(text='MCX')

1 answer:

Answer 0 (score: 2)

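This will get you the list of commodities (tested on Python 2.7). You need to isolate the commodity table, then work down its rows, reading each row and extracting the data from each column.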

import urllib2
import bs4

# Page with commodities
URL = "http://www.moneycontrol.com/commodity/"

# Download the page data and create a BeautifulSoup object
commodityPage = urllib2.urlopen(URL)
commodityPageText = commodityPage.read()
commodityPageSoup = bs4.BeautifulSoup(commodityPageText)

# Extract the div with the commodities table and find all the table rows
commodityTable = commodityPageSoup.find_all("div", "equity com_ne")
commodityTableRows = commodityTable[0].find_all("tr")

# Trim off the table header row
commodityTableRows = commodityTableRows[1:]

# Iterate over the table rows and print out the commodity name and price
for commodity in commodityTableRows:
    # Separate all the table columns
    columns = commodity.find_all("td")

    # -------------- Get the values from each column
    # COLUMN 1: Name and date
    nameAndDate = columns[0].text
    nameAndDate = nameAndDate.split('-')
    name = nameAndDate[0].strip()
    date = nameAndDate[1].strip()

    # COLUMN 2: Price
    price = float(columns[1].text)

    # COLUMN 3: Change
    change = columns[2].text.replace(',', '') # Remove commas from change value
    change = float(change)

    # COLUMN 4: Percentage change
    percentageChange = columns[3].text.replace('%', '') # Remove the percentage symbol
    percentageChange = float(percentageChange)

    # Print out the data
    print "%s on %s was %.2f - a change of %.2f (%.2f%%)" % (name, date, price, change, percentageChange)

This gives the result:

Gold on 5 Oct was 30068.00 - a change of 497.00 (1.68%)
Silver on 5 Dec was 50525.00 - a change of 1115.00 (2.26%)
Crudeoil on 19 Sep was 6924.00 - a change of 93.00 (1.36%)
Naturalgas on 25 Sep was 233.80 - a change of 0.30 (0.13%)
Aluminium on 30 Sep was 112.25 - a change of 0.55 (0.49%)
Copper on 29 Nov was 459.80 - a change of 3.40 (0.74%)
Nickel on 30 Sep was 882.20 - a change of 5.90 (0.67%)
Lead on 30 Sep was 131.80 - a change of 0.70 (0.53%)
Zinc on 30 Sep was 117.85 - a change of 0.75 (0.64%)
Menthaoil on 30 Sep was 871.90 - a change of 1.80 (0.21%)
Cotton on 31 Oct was 21350.00 - a change of 160.00 (0.76%)
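
As for why the code in the question printed []: a plain string passed as text= (spelled string= in newer Beautiful Soup releases) matches only text nodes that are exactly equal to that string. If "MCX" appears on the page only inside longer strings, which is an assumption here and has not been checked against the live page, nothing matches. A minimal sketch of the difference, assuming Python 3, the built-in "html.parser", and a made-up HTML snippet standing in for the real page:

```python
import re
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page (an assumption, not the site's markup).
html = """
<div class="equity com_ne">
  <table>
    <tr><th>Commodity</th><th>Price</th></tr>
    <tr><td>Gold - 5 Oct</td><td>30068.00</td></tr>
  </table>
  <p>MCX Gold futures</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string matches only text nodes that are exactly "MCX".
exact = soup.find_all(string="MCX")

# A compiled regular expression matches any text node containing "MCX".
partial = soup.find_all(string=re.compile("MCX"))

print(exact)
print([str(s) for s in partial])
```

On this snippet the exact match finds nothing, while the regular expression finds the paragraph text. Whether the same holds for the real page depends on its markup at the time of the request.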