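The purpose of this code is to scrape a data table from a list of links and then convert it into a pandas DataFrame.
The problem is that the code only scrapes the first 7 rows, which are on the first page of the table, whereas I want to capture the whole table. When I try to loop over the table pages, I get an error.
Here is the code: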
from selenium import webdriver

urls = open(r"C:\Users\Sayed\Desktop\script\sample.txt").readlines()

for url in urls:
    driver = webdriver.Chrome(r"D:\Projects\Tutorial\Driver\chromedriver.exe")
    driver.get(url)
    for item in driver.find_element_by_xpath('//*[contains(@id,"showMoreHistory")]/a'):
        driver.execute_script("arguments[0].click();", item)
    for table in driver.find_elements_by_xpath('//*[contains(@id,"eventHistoryTable")]//tr'):
        data = [item.text for item in table.find_elements_by_xpath(".//*[self::td or self::th]")]
        print(data)
This is the error:
Traceback (most recent call last):
  File "D:/Projects/Tutorial/ff.py", line 8, in <module>
    for item in driver.find_element_by_xpath('//*[contains(@id,"showMoreHistory")]/a'):
TypeError: 'WebElement' object is not iterable
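For context on the error: find_element_by_xpath (singular) returns a single WebElement, which cannot be looped over, while the find_elements variant (plural) returns a list that may be empty. The sketch below shows the plural form inside a small helper. The name click_all and the XPATH constant are hypothetical and introduced only for illustration; the constant assumes the locator string "xpath", which is the value of Selenium's By.XPATH.

```python
# Hypothetical constant: the same string as selenium.webdriver.common.by.By.XPATH.
XPATH = "xpath"


def click_all(driver, xpath):
    """Click every element matching `xpath` and return how many were clicked."""
    # find_elements (plural) returns a list, possibly empty, so iterating is safe.
    links = driver.find_elements(XPATH, xpath)
    for link in links:
        # JavaScript click, as in the question, so overlapping elements do not block it.
        driver.execute_script("arguments[0].click();", link)
    return len(links)
```

With a real driver this would be called as click_all(driver, '//*[contains(@id,"showMoreHistory")]/a'). A return value of 0 means no matching link was found.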
Answer 0 (score: 1)
Check out the following script to get the whole table from that webpage. I've used a hardcoded delay in my script, which is not good practice. However, you can always define an Explicit Wait to make the code more robust:
import time
from selenium import webdriver

url = 'https://www.investing.com/economic-calendar/investing.com-eur-usd-index-1155'

driver = webdriver.Chrome()
driver.get(url)
# find_element_* (singular) returns one WebElement, so click it directly instead of looping over it
item = driver.find_element_by_xpath('//*[contains(@id,"showMoreHistory")]/a')
driver.execute_script("arguments[0].click();", item)
time.sleep(2)  # hardcoded delay so the extra rows have time to load
for table in driver.find_elements_by_xpath('//*[contains(@id,"eventHistoryTable")]//tr'):
    data = [item.text for item in table.find_elements_by_xpath(".//*[self::td or self::th]")]
    print(data)
driver.quit()
To get all the data, keep clicking the show more button until it is exhausted, and use an Explicit Wait in place of the fixed delay.
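As a rough illustration of what an explicit wait does in place of a fixed time.sleep(2): it polls a condition until the condition holds or a timeout expires. The helper wait_until below is hypothetical, plain Python written for this explanation, and is not part of Selenium. With a real driver, Selenium's WebDriverWait(driver, timeout).until(...) plays this role.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # Sleep only briefly between checks, so the wait ends as soon as the condition holds.
        time.sleep(poll)
```

A condition such as "the table now has more rows than before the click" returns as soon as the new rows appear, whereas a fixed delay always costs the full two seconds and can still be too short on a slow connection.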
Answer 1 (score: 0)
Based on your question and the URL https://www.investing.com/economic-calendar/investing.com-eur-usd-index-1155, you can scrape the whole table with the following solution:
Code block:
# -*- coding: UTF-8 -*-
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

table_rows = []
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://www.investing.com/economic-calendar/investing.com-eur-usd-index-1155")

# Wait for the history table header and scroll it into view
table_header = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr>th.left.symbol")))
driver.execute_script("arguments[0].scrollIntoView(true);", table_header)

# Number of rows visible before the first click on "show more"
myLength = len(WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr[event_attr_id='1155']"))))

while True:
    try:
        # Click "show more", then wait until the row count grows
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div#showMoreHistory1155>a"))).click()
        WebDriverWait(driver, 20).until(lambda driver: len(driver.find_elements_by_css_selector("table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr[event_attr_id='1155']")) > myLength)
        table_rows = driver.find_elements_by_css_selector("table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr[event_attr_id='1155']")
        myLength = len(table_rows)
    except TimeoutException:
        # No clickable button or no new rows within 20 seconds: the table is fully loaded
        break

for row in table_rows:
    print(row.text)
driver.quit()
Console output:
Sep 24, 2018 01:30
Sep 17, 2018 01:30 53.1% 55.3%
Sep 10, 2018 01:30 55.3% 49.0%
Sep 03, 2018 01:30 49.0% 43.3%
Aug 27, 2018 01:30 43.3% 49.7%
Aug 20, 2018 01:30 49.7% 52.5%
Aug 13, 2018 01:30 52.5% 59.9%
Aug 06, 2018 01:30 59.9% 62.6%
Jul 30, 2018 01:30 62.6% 52.8%
Jul 23, 2018 01:30 52.8% 52.7%
Jul 16, 2018 01:30 52.7% 46.2%
Jul 10, 2018 01:30 46.2% 55.3%
Jul 02, 2018 01:30 55.3% 53.1%
Jun 25, 2018 01:30 53.1% 66.2%
Jun 18, 2018 01:30 66.2% 65.2%
Jun 11, 2018 01:30 65.2% 61.2%
Jun 04, 2018 01:30 61.2% 63.9%
May 28, 2018 01:30 63.9% 67.0%
May 21, 2018 01:30 67.0% 63.2%
May 14, 2018 01:30 63.2% 61.3%
May 07, 2018 01:30 61.3% 57.6%
Apr 30, 2018 01:30 57.6% 64.8%
Apr 23, 2018 01:30 64.8% 65.2%
Apr 16, 2018 01:30 65.2% 60.4%
Apr 09, 2018 01:30 60.4% 63.3%
Apr 02, 2018 01:30 63.3% 62.1%
Mar 26, 2018 01:30 62.1% 65.7%
Mar 19, 2018 02:30 65.7% 56.0%
Mar 12, 2018 02:30 56.0% 62.3%
Mar 05, 2018 02:30 62.3% 59.1%
Feb 26, 2018 02:30 59.1% 52.8%
Feb 19, 2018 02:30 52.8% 55.8%
Feb 12, 2018 02:30 55.8% 51.7%
Feb 05, 2018 02:30 51.7% 56.8%
Jan 29, 2018 02:30 56.8% 52.2%
Jan 22, 2018 02:30 52.2% 56.1%
Jan 15, 2018 02:30 56.1% 60.2%
Jan 08, 2018 02:30 60.2% 54.6%
Jan 01, 2018 02:30 54.6% 48.4%
Dec 25, 2017 02:30 48.4% 66.4%
Dec 18, 2017 02:30 66.4% 58.9%
Dec 11, 2017 02:30 58.9% 53.8%
Dec 04, 2017 02:30 53.8% 55.9%
Nov 28, 2017 02:30 55.9% 53.7%
Nov 20, 2017 02:30 53.7% 58.6%
Nov 14, 2017 02:30 58.6% 52.8%
Nov 06, 2017 02:30 52.8% 57.6%
Oct 30, 2017 01:30 57.6% 54.7%
Oct 23, 2017 01:30 54.7% 58.9%
Oct 16, 2017 01:30 58.9% 57.3%
Oct 09, 2017 01:30 57.3% 64.0%
Oct 02, 2017 01:30 64.0% 47.5%
Sep 25, 2017 01:30 47.5% 52.2%
Sep 18, 2017 01:30 52.2% 55.5%
Sep 11, 2017 01:30 55.5% 54.3%
Sep 04, 2017 01:30 54.3% 54.2%
Aug 28, 2017 01:30 54.2% 51.4%
Aug 21, 2017 01:30 51.4% 57.4%
Aug 14, 2017 01:30 57.4% 51.2%
Aug 07, 2017 01:30 51.2% 51.3%
Jul 31, 2017 01:30 51.3% 52.8%
Jul 24, 2017 01:30 52.8% 53.3%
Jul 17, 2017 01:30 53.3% 54.1%
Jul 10, 2017 01:30 54.1% 51.9%
Jul 03, 2017 01:30 51.9% 40.6%
Jun 26, 2017 01:30 40.6% 52.6%
Jun 19, 2017 01:30 52.6% 51.0%
Jun 12, 2017 01:30 51.0% 52.1%
Jun 05, 2017 01:30 52.1% 59.1%
May 29, 2017 01:30 59.1% 46.9%
May 22, 2017 01:30 46.9% 53.0%
May 15, 2017 01:30 53.0% 44.9%
May 08, 2017 01:30 44.9% 37.0%
May 01, 2017 01:30 37.0% 43.0%
Apr 24, 2017 01:30 43.0% 52.4%
Apr 10, 2017 01:30 52.4% 55.1%
Apr 03, 2017 01:30 55.1% 43.5%
Mar 27, 2017 02:30 43.5% 36.0%
Mar 20, 2017 02:30 36.0% 32.3%
Mar 13, 2017 02:30 32.3% 42.8%
Mar 06, 2017 02:30 42.8% 39.1%
Feb 27, 2017 02:30 39.1% 41.7%
Feb 20, 2017 02:30 41.7% 43.2%
Feb 13, 2017 02:30 43.2% 36.6%
Feb 06, 2017 02:30 36.6% 39.7%
Jan 30, 2017 02:30 39.7% 33.5%
Jan 23, 2017 02:30 33.5% 36.8%
Jan 16, 2017 03:30 36.8% 37.0%
Jan 09, 2017 02:30 37.0% 41.6%
Jan 02, 2017 02:30 41.6% 35.8%
Dec 26, 2016 02:30 35.8% 42.3%
Dec 19, 2016 02:30 42.3% 39.7%
Dec 12, 2016 04:15 39.7% 33.8%
Dec 05, 2016 02:30 33.8% 37.1%
Nov 29, 2016 02:30 37.1% 41.9%
Nov 21, 2016 02:30 41.9% 39.1%
Nov 15, 2016 02:00 39.1% 20.5%
Nov 07, 2016 02:30 20.5% 27.4%
Oct 31, 2016 02:30 27.4% 33.4%
Oct 25, 2016 02:30 33.4% 30.8%
Oct 18, 2016 02:30 30.8% 26.6%
Oct 10, 2016 02:30 26.6% 28.6%
Oct 05, 2016 02:00 28.6% 26.2%
Sep 26, 2016 02:30 26.2% 34.8%
Sep 19, 2016 02:30 34.8% 21.2%
Sep 13, 2016 02:30 21.2% 27.0%
Sep 05, 2016 02:30 27.0% 32.7%
Aug 29, 2016 02:30 32.7% 23.9%
Aug 22, 2016 02:30 23.9% 28.8%
Aug 15, 2016 02:30 28.8% 30.8%
Aug 08, 2016 02:30 30.8% 20.3%
Aug 01, 2016 02:30 20.3% 30.2%
Jul 25, 2016 02:30 30.2% 29.5%
Jul 18, 2016 02:30 29.5% 26.2%
Jul 11, 2016 02:30 26.2% 27.5%
Jul 04, 2016 02:30 27.5% 26.8%
Jun 27, 2016 02:30 26.8% 35.1%
Jun 20, 2016 02:30 35.1% 22.8%
Jun 13, 2016 02:30 22.8% 32.5%
Jun 06, 2016 02:30 32.5% 35.6%
May 30, 2016 02:30 35.6% 39.5%
May 23, 2016 02:30 39.5% 37.8%
May 16, 2016 03:30 37.8% 39.5%
May 09, 2016 02:30 39.5% 30.3%
May 02, 2016 02:30 30.3% 32.9%
Apr 25, 2016 02:30 32.9% 29.6%
Apr 18, 2016 06:00 29.6% 30.5%
Apr 11, 2016 02:30 30.5% 22.7%
Apr 04, 2016 03:30 22.7% 32.1%
Mar 28, 2016 03:30 32.1% 23.2%
Mar 21, 2016 03:30 23.2% 26.7%
Mar 14, 2016 03:30 26.7% 22.6%
Mar 07, 2016 03:30 22.6% 33.7%
Feb 29, 2016 03:30 33.7% 34.8%
Feb 22, 2016 03:30 34.8% 33.3%
Feb 15, 2016 03:30 33.3% 33.3%
Feb 08, 2016 03:30 33.3% 34.3%
Feb 01, 2016 03:30 34.3% 33.2%
Jan 25, 2016 03:30 33.2% 27.0%
Jan 18, 2016 03:30 27.0% 27.2%
Jan 11, 2016 03:30 27.2% 30.0%
Jan 05, 2016 03:30 30.0% 24.0%
Dec 29, 2015 03:30 24.0% 33.3%
Dec 21, 2015 03:30 33.3% 31.2%
Dec 14, 2015 04:30 31.2% 27.1%
Dec 07, 2015 03:00 27.1% 29.8%
Dec 01, 2015 03:00 29.8% 27.5%
Nov 23, 2015 03:00 27.5% 33.1%
Nov 17, 2015 04:00 33.1% 26.8%
Nov 09, 2015 02:30 26.8% 24.3%
Nov 02, 2015 01:30 24.3% 36.4%
Oct 26, 2015 01:30 36.4% 28.6%
Oct 19, 2015 01:30 28.6% 25.5%
Oct 11, 2015 04:30 25.5% 29.6%
Oct 06, 2015 01:00 29.6% 28.5%
Sep 28, 2015 01:30 28.5% 29.1%
Sep 21, 2015 01:30 29.1% 21.2%
Sep 14, 2015 01:30 21.2% 29.8%
Sep 07, 2015 01:30 29.8% 36.3%
Aug 31, 2015 01:30 36.3% 35.6%
Aug 24, 2015 01:30 35.6% 26.4%
Aug 17, 2015 01:30 26.4% 24.8%
Aug 10, 2015 01:30 24.8% 29.7%
Aug 03, 2015 01:30 29.7% 24.8%
Jul 27, 2015 01:30 24.8% 30.7%
Jul 20, 2015 01:30 30.7% 27.9%
Jul 13, 2015 01:30 27.9% 27.4%
Jul 07, 2015 01:30 27.4% 26.8%
Jun 29, 2015 01:30 26.8% 33.1%
Jun 22, 2015 01:30 33.1% 33.6%
Jun 15, 2015 03:30 33.6% 28.9%
Jun 08, 2015 01:30 28.9% 23.0%
Jun 01, 2015 01:30 23.0% 34.0%
May 25, 2015 04:00 34.0% 28.9%
May 18, 2015 01:30 28.9% 28.8%
May 11, 2015 01:30 28.8% 28.3%
May 04, 2015 02:00 28.3% 23.7%
Apr 27, 2015 01:30 23.7% 27.2%
Apr 20, 2015 01:30 27.2% 33.7%
Apr 13, 2015 02:00 33.7% 23.2%
Apr 06, 2015 02:00 23.2% 19.8%
Mar 30, 2015 02:30 19.8% 24.1%
Mar 23, 2015 02:30 24.1% 27.2%
Mar 16, 2015 03:00 27.2% 35.6%
Mar 09, 2015 02:30 35.6% 34.4%
Mar 02, 2015 02:30 34.4% 30.2%
Feb 23, 2015 02:30 30.2% 26.6%
Feb 16, 2015 03:30 26.6% 23.8%
Feb 09, 2015 02:30 23.8% 26.4%
Feb 02, 2015 02:30 26.4% 23.9%
Jan 26, 2015 02:30 23.9% 28.9%
Jan 19, 2015 02:30 28.9% 35.5%
Jan 12, 2015 02:30 35.5% 38.1%
Jan 06, 2015 03:30 38.1% 40.6%
Jan 01, 2015 02:30 40.6% 45.2%
Dec 22, 2014 02:00 45.2% 39.8%
Dec 15, 2014 02:00 39.8% 41.7%
Dec 07, 2014 21:00 41.7% 33.8%
Dec 02, 2014 03:00 33.8% 38.6%
Nov 24, 2014 01:30 38.6% 39.2%
Nov 17, 2014 01:00 39.2% 33.1%
Nov 10, 2014 01:00 33.1% 35.4%
Nov 04, 2014 03:00 35.4% 37.3%
Oct 27, 2014 02:00 37.3% 33.7%
Oct 19, 2014 22:00 33.7% 36.2%
Oct 13, 2014 01:00 36.2% 44.5%
Oct 06, 2014 01:00 44.5% 41.3%
Sep 29, 2014 01:00 41.3% 50.3%
Sep 21, 2014 22:35 50.3% 39.5%
Sep 15, 2014 00:45 39.5% 39.9%
Sep 08, 2014 01:00 39.9% 42.8%
Sep 01, 2014 02:35 42.8% 41.9%
Aug 25, 2014 01:00 41.9% 38.9%
Aug 18, 2014 01:00 38.9% 34.0%
Aug 11, 2014 01:00 34.0% 38.2%
Aug 04, 2014 01:00 38.2% 38.4%
Jul 28, 2014 01:00 38.4% 42.3%
Jul 21, 2014 01:00 42.3% 37.2%
Jul 14, 2014 01:00 37.2% 39.6%
Jul 07, 2014 01:00 39.6% 39.8%
Jun 30, 2014 01:00 39.8% 36.1%
Jun 23, 2014 00:30 36.1% 37.6%
Jun 16, 2014 00:30 37.6% 36.5%
Jun 09, 2014 00:30 36.5% 44.1%
Jun 01, 2014 22:00 44.1% 49.4%
May 26, 2014 00:30 49.4% 41.0%
May 19, 2014 00:00 41.0% 55.0%
May 12, 2014 00:00 55.0% 41.1%
May 04, 2014 06:00 41.1% 43.5%
Apr 27, 2014 06:00 43.5% 40.3%
Apr 06, 2014 06:00 40.3%
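Since the question's goal is a DataFrame, the printed row.text strings above could be parsed into records first. The sketch below assumes every row has the shape shown in the console output: a date, a time, then zero or more percentages. The function name parse_history_row and the keys "released" and "values" are hypothetical. The output does not say what each percentage column means, so they are kept as a plain list.

```python
from datetime import datetime


def parse_history_row(text):
    """Split one printed row into a timestamp and a list of percentage values."""
    tokens = text.split()
    # The first four tokens look like: 'Sep', '17,', '2018', '01:30'
    released = datetime.strptime(" ".join(tokens[:4]), "%b %d, %Y %H:%M")
    # Any remaining tokens are percentages such as '53.1%'
    values = [float(token.rstrip("%")) for token in tokens[4:]]
    return {"released": released, "values": values}


# Three rows copied from the console output above
sample = [
    "Sep 24, 2018 01:30",
    "Sep 17, 2018 01:30 53.1% 55.3%",
    "Apr 06, 2014 06:00 40.3%",
]
records = [parse_history_row(line) for line in sample]
```

A list of dictionaries like records can then be passed to pandas.DataFrame(records) to obtain the DataFrame.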