Question

I am trying to extract the content of the <b> tags from this website.

I want to extract the content for different cities by entering an address.

Query Date: Wed Aug 09 2017
Latitude: 33.4484
Longitude: -112.0740

ASCE 7-10 Windspeeds 
(3-sec peak gust in mph*):

Risk Category I: 105
Risk Category II: 115
Risk Category III-IV: 120
MRI** 10-Year: 76
MRI** 25-Year: 84
MRI** 50-Year: 90
MRI** 100-Year: 96

ASCE 7-05 Windspeed:
  90 (3-sec peak gust in mph)
ASCE 7-93 Windspeed:
  72 (fastest mile in mph)

The code I have tried is shown below.

from bs4 import BeautifulSoup
from datetime import datetime
import dateutil.parser
import requests
import sys
import re
import csv
import pandas as pd
from selenium import webdriver

# Selenium 3 style API; the find_element_by_* helpers were removed in Selenium 4.
chrome_path = r"/usr/local/share/chromedriver"
driver = webdriver.Chrome(chrome_path)
driver.get("http://windspeed.atcouncil.org/")  # open the site
driver.find_element_by_xpath('//*[@id="address"]').click()  # select the address radio button
driver.find_element_by_xpath('//*[@id="google-map-address"]').click()  # focus the address text box
cities = ['phoenix']  # city list
for city in cities:
    # print(city)
    driver.find_element_by_xpath('//*[@id="google-map-address"]').send_keys(city)  # type the city name
    driver.find_element_by_xpath('//*[@id="searchform"]/div[1]/div[2]/button').click()
    driver.find_element_by_xpath('//*[@id="latt"]')  # locate the latitude field (result unused)
    driver.find_element_by_xpath('//*[@id="searchform"]/div[1]/div[7]/span/input').click()
    x = driver.current_url
print(x)


# Re-request the current URL with requests and parse the response with Beautiful Soup.
Data = {'optionCoordinate': '2', 'coordinate_address': cities}
page = requests.post(x, data=Data)
soup = BeautifulSoup(page.content, 'html.parser')
for b_tag in soup.find_all('b'):
    # print each <b> label followed by the node that comes right after it
    print(b_tag.text, b_tag.next_sibling)

If this can be done with Selenium and Python BS4, please help me find a solution.
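The final loop in the code above pairs each <b> label with the text that follows it. As a minimal sketch of that pairing on static HTML, the example below uses the standard library's `html.parser` instead of Beautiful Soup (a swapped-in technique, chosen only so the snippet runs without extra packages). The markup in `SAMPLE_HTML` and the names `BoldLabelParser` and `extract_bold_pairs` are hypothetical; the real page's HTML is not shown in the question and may be structured differently.

```python
from html.parser import HTMLParser

# Hypothetical markup: an assumption about how the labels and values might be laid out.
SAMPLE_HTML = (
    "<div><b>Latitude:</b> 33.4484<br>"
    "<b>Longitude:</b> -112.0740<br>"
    "<b>Risk Category II:</b> 115</div>"
)


class BoldLabelParser(HTMLParser):
    """Collect [label, following text] pairs for every <b> element."""

    def __init__(self):
        super().__init__()
        self.pairs = []         # one [label, value] list per <b> seen
        self._in_bold = False   # True while inside <b>...</b>

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._in_bold = True
            self.pairs.append(["", ""])

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        if not self.pairs:
            return  # text before the first <b> is ignored
        # Text inside <b> extends the label; text after it extends the value.
        index = 0 if self._in_bold else 1
        self.pairs[-1][index] += data


def extract_bold_pairs(html):
    """Return {label: text up to the next <b>} with whitespace and the trailing colon trimmed."""
    parser = BoldLabelParser()
    parser.feed(html)
    parser.close()
    return {
        label.strip().rstrip(":"): value.strip()
        for label, value in parser.pairs
    }


pairs = extract_bold_pairs(SAMPLE_HTML)
for label, value in pairs.items():
    print(label, value)
```

Unlike `b_tag.next_sibling`, which returns only the single node immediately after the tag, this sketch gathers all text up to the next <b>, so it tolerates a value that is split across several text nodes.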

Answer 1

You can extract this data using Selenium alone:

{{1}}

Output:

Query Date: Wed Aug 09 2017
Latitude: 33.4484
Longitude: -112.0740

ASCE 7-10 Windspeeds
(3-sec peak gust in mph*):

Risk Category I: 105
Risk Category II: 115
Risk Category III-IV: 120
MRI** 10-Year: 76
MRI** 25-Year: 84
MRI** 50-Year: 90
MRI** 100-Year: 96

ASCE 7-05 Windspeed:
90 (3-sec peak gust in mph)
ASCE 7-93 Windspeed:
72 (fastest mile in mph)

You can process this string data however you need.
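One way to process the string data is to split it into label/value pairs. The following is only a sketch, assuming the text always has the shape of the sample output in this post; `parse_report` and `SAMPLE_REPORT` are hypothetical names, not part of the original answer. Values are kept as strings so nothing is lost in conversion.

```python
import re

# Sample text copied from the output shown in this post.
SAMPLE_REPORT = """\
Query Date: Wed Aug 09 2017
Latitude: 33.4484
Longitude: -112.0740

ASCE 7-10 Windspeeds
(3-sec peak gust in mph*):

Risk Category I: 105
Risk Category II: 115
Risk Category III-IV: 120
MRI** 10-Year: 76
MRI** 25-Year: 84
MRI** 50-Year: 90
MRI** 100-Year: 96

ASCE 7-05 Windspeed:
90 (3-sec peak gust in mph)
ASCE 7-93 Windspeed:
72 (fastest mile in mph)
"""


def parse_report(text):
    """Turn 'Label: value' lines into a dict of strings.

    A label whose value sits on the next line (the ASCE 7-05 and 7-93
    entries) is paired with the leading number of that next line.
    """
    result = {}
    pending_label = None  # label seen with nothing after its colon
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if pending_label is not None:
            label, pending_label = pending_label, None
            match = re.match(r"-?\d+(?:\.\d+)?", line)
            if match:
                # Value line such as "90 (3-sec peak gust in mph)".
                result[label] = match.group(0)
                continue
            # Otherwise the pending label was a heading; parse this line normally.
        label, sep, value = line.partition(":")
        if not sep:
            continue  # heading without a colon
        value = value.strip()
        if value:
            result[label.strip()] = value
        else:
            pending_label = label.strip()
    return result


report = parse_report(SAMPLE_REPORT)
for label, value in report.items():
    print(label, "=", value)
```

From here the dict can be written out with `csv.DictWriter` or collected into one row per city, depending on what the downstream step needs.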

Getting data inside <b> tags with Beautiful Soup or Selenium
