Web scraping dynamic content-wrap data

Time: 2020-02-28 04:59:29

Tags: python selenium beautifulsoup

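I want to extract the rainfall table data from https://forecast.vassarlabs.com/. To see the table, you have to click the menu icon in the top-right corner. That data table is what I want to extract.

My code: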

from urllib.request import urlopen
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
import os

driver = webdriver.Chrome("C:/Users/DELL/Desktop/chromedriver.exe")
driver.get('https://forecast.vassarlabs.com/')
res = driver.execute_script("return document.documentElement.outerHTML")
driver.quit()

soup = BeautifulSoup(res, 'html.parser')

table = soup.find("div", class_='content-wrap')

ta = table.find('table', class_="table table-bordered table-striped responsive no-m ng-scope")
print(ta)

2 answers:

Answer 0 (score: 1)

You can also avoid Selenium and get the data directly from the API. All you need to do is parse the JSON response:

import requests
import pandas as pd


url = 'https://forecast.vassarlabs.com/api/commanddashboard/getdashboarddata/RAINFALL/24%20Hrs/District'

jsonData = requests.get(url).json()
jsonData = jsonData['Andhra Pradesh']

# Build one single-row frame per district, then concatenate once.
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.)
rows = []
for k, v in jsonData.items():
    row = pd.DataFrame(v['rainfallDataMap']).head(1)
    row['District'] = k
    rows.append(row)
results = pd.concat(rows, sort=False).reset_index(drop=True)

Output:

print (results.to_string())
    05:30-11:30  11:30-17:30  17:30-23:30  23:30-05:30  24 Hrs       District
0           0.0         0.44         0.00          0.0    0.44  East Godavari
1           0.0         0.00         0.00          0.0    0.00      Anantapur
2           0.0         0.00         0.00          0.0    0.00         Kadapa
3           0.0         0.00         0.00          0.0    0.00        Nellore
4           0.0         0.07         0.00          0.0    0.07  West Godavari
5           0.0         0.00         0.00          0.0    0.00     Srikakulam
6           0.0         0.00         0.00          0.0    0.00        Kurnool
7           0.0         0.00         0.00          0.0    0.00       Chittoor
8           0.0         0.00         0.00          0.0    0.00        Krishna
9           0.0         1.05         0.00          0.0    1.05  Visakhapatnam
10          0.0         0.00         0.00          0.0    0.00       Prakasam
11          0.0         0.12         0.00          0.0    0.12          Total
12          0.0         0.21         0.01          0.0    0.22   Vizianagaram
13          0.0         0.00         0.00          0.0    0.00         Guntur
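
The `24%20Hrs` segment in the URL above is just the percent-encoded form of `24 Hrs`. As a minimal sketch, the endpoint can be assembled from its three path segments with `urllib.parse.quote`. The helper name `dashboard_url` and its parameter names are made up for illustration, and nothing here confirms that the API accepts any segment values other than the ones already shown.

```python
from urllib.parse import quote

# Base path taken from the URL used in the answer above.
BASE = "https://forecast.vassarlabs.com/api/commanddashboard/getdashboarddata"


def dashboard_url(metric, period, level):
    """Hypothetical helper: join three path segments, percent-encoding each."""
    # safe="" also encodes "/" so a segment can never split the path.
    parts = [quote(part, safe="") for part in (metric, period, level)]
    return BASE + "/" + "/".join(parts)


url = dashboard_url("RAINFALL", "24 Hrs", "District")
print(url)
```

The resulting string can be passed to `requests.get` exactly as in the answer's code.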

Answer 1 (score: 0)

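from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options

options = Options()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)
driver.implicitly_wait(10)
driver.get("https://forecast.vassarlabs.com/")
# Click the list icon (.fa-list) before reading the page source.
# find_element_by_css_selector was removed in Selenium 4.3; use find_element with By.
driver.find_element(By.CSS_SELECTOR, ".fa-list").click()

# Parse every table in the rendered page and keep the second one.
# Wrapping the HTML in StringIO avoids the literal-string deprecation in newer pandas.
df = pd.read_html(StringIO(driver.page_source))[1]

df.to_csv("result.csv", index=False)

driver.quit()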

Output: Check-Online
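
The code above picks the table by position (`[1]`), which breaks if the page ever renders its tables in a different order. A rough sketch of choosing a table by a header cell instead, using only the standard library's `html.parser`, might look like this. The class `TableExtractor`, the function `find_table`, the sample HTML and the `District` header are all invented for illustration and are not taken from the actual page. Nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def find_table(html, header):
    """Return the first table whose first row contains `header`, else None."""
    parser = TableExtractor()
    parser.feed(html)
    for rows in parser.tables:
        if rows and header in rows[0]:
            return rows
    return None


# Made-up markup standing in for driver.page_source.
sample_html = """
<table><tr><th>Menu</th></tr><tr><td>Home</td></tr></table>
<table>
  <tr><th>District</th><th>24 Hrs</th></tr>
  <tr><td>Guntur</td><td>0.00</td></tr>
</table>
"""
rows = find_table(sample_html, "District")
print(rows)
```

In the Selenium script, `driver.page_source` would take the place of `sample_html`. pandas offers a similar filter through the `match` argument of `read_html`, which keeps only tables containing the given text.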