I scrape several websites with Python 2.7 on Debian, but sometimes my script stops on its own (when a page fails to load in time (freezes) or the internet connection drops). Is there any solution to this, perhaps just skipping the problem and moving on to the next URL? Because as it stands, the script simply stops whenever it hits a problem like that..
Here is my code:
#!/usr/bin/python
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
from selenium import webdriver
import urllib2
import subprocess
import unicodecsv as csv
import os
import sys
import io
import time
import datetime
import pandas as pd
import MySQLdb
import re
import contextlib
import selenium.webdriver.support.ui as ui
import numpy as np
from datetime import datetime, timedelta
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pyautogui
from pykeyboard import PyKeyboard
reload(sys)
sys.setdefaultencoding('utf-8')
cols = ['MYCOLS..']
browser = webdriver.Firefox()
datatable=[]
browser.get('LINK1')
time.sleep(5)
browser.find_element_by_xpath('//button[contains(text(), "CLICK EVENT")]').click()
time.sleep(5)
browser.find_element_by_xpath('//button[contains(text(), "CLICK EVENT")]').click()
html = browser.page_source
soup=BeautifulSoup(html,"html.parser")
table = soup.find('table', { "class" : "table table-condensed table-hover data-table m-n-t-15" })
for record in table.find_all('tr', class_="hidden-xs hidden-sm ng-scope"):
    temp_data = []
    for data in record.find_all("td"):
        temp_data.append(data.text.encode('utf-8'))
    newlist = filter(None, temp_data)
    datatable.append(newlist)
time.sleep(10)
browser.close()
#HERE I INSERT MY DATA INTO MYSQL..IT IS NOT IMPORTANT, AND MY SECOND LINK STARTS HERE
browser = webdriver.Firefox()
datatable=[]
browser.get('LINK2')
browser.find_element_by_xpath('//button[contains(text(), "CLICK EVENT")]').click()
time.sleep(5)
html = browser.page_source
soup=BeautifulSoup(html,"html.parser")
table = soup.find('table', { "class" : "table table-condensed table-hover data-table m-n-t-15" })
for record in table.find_all('tr', class_="hidden-xs hidden-sm ng-scope"):
    temp_data = []
    for data in record.find_all("td"):
        temp_data.append(data.text.encode('utf-8'))
    newlist = filter(None, temp_data)
    datatable.append(newlist)
time.sleep(10)
browser.close()
#MYSQLDB PART AGAIN...AND THE NEXT LINK COMES HERE.
EDIT:
The script also stops when it cannot find this CLICK EVENT button. Why? How can I avoid this?
Answer (score: 0)
With Selenium, you can configure the driver (the browser object) to wait for a specific element or condition. You can then use a regular try/except to handle any error, such as TimeoutException or many others.
Selenium explains the wait system well in their documentation.
Here is a snippet of exception handling with Selenium:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
try:
    # Wait for any element / condition; you can even pass a lambda if you wish
    WebDriverWait(browser, 10).until(
        EC.visibility_of_all_elements_located((By.ID, 'my-item'))
    )
except TimeoutException:
    # Here I raise an error, but you can do whatever you want, like exiting properly or logging something
    raise RuntimeError('No Internet connection')
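To actually skip to the next URL (the original question), the per-page work can be wrapped in a try/except inside a loop. Below is a minimal sketch; `scrape_all` and `scrape_url` are hypothetical names, and in your real script you would pass Selenium's `(TimeoutException, NoSuchElementException, WebDriverException)` as `skip_errors` so that a frozen page, a missing button, or a dropped connection only skips that one URL instead of killing the script:

```python
def scrape_all(urls, scrape_url, skip_errors=(Exception,)):
    """Run scrape_url(url) for every URL, skipping the ones that fail.

    scrape_url is whatever per-page routine you already have (open the
    page, wait for the table, parse the rows).  Any exception listed in
    skip_errors aborts only that URL; the loop then moves on.
    """
    results = {}
    failed = []
    for url in urls:
        try:
            results[url] = scrape_url(url)
        except skip_errors:
            # Page froze, element missing, connection dropped: record and skip.
            failed.append(url)
    return results, failed
```

Each of your per-link blocks (browser.get, WebDriverWait, BeautifulSoup parsing, MySQL insert) would move into the `scrape_url` function, so a failure on LINK1 no longer prevents LINK2 from being scraped.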