What is the correct way to open a URL in Python?

Asked: 2017-04-12 04:19:56

Tags: python python-3.x csv selenium web-scraping

I have a CSV file and pass its data as parameters to my Python code. The CSV file contains URLs. What is the correct way to open a URL from Python? I get the error "Cannot navigate to invalid URL".

CSV file:

ID,category,link
sports_shoes,sports-shoes,https://www.flipkart.com/mens-footwear/sports-shoes/pr?otracker=categorytree&page=1&sid=osp%2Ccil%2C1cu

Code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
import time
import csv

with open('mydata.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        #print(row['ID'] ,row['category'],row['link'])
        url = row['link']
        print(url)
        chrome_path = r"C:\Users\Venkatesh\AppData\Local\Programs\Python\Python35\chromedriver.exe"
        driver = webdriver.Chrome(chrome_path)
        RegionIDArray = url
        data_list=[]
        data = []
        mobile_details_data = []
        delay = 30 # seconds
        for reg in RegionIDArray:
            driver.get(reg)
driver.quit()

Error:

Traceback (most recent call last):
  File ".\input_file.py", line 24, in <module>
    driver.get(reg)
  File "C:\Users\Venkatesh\AppData\Local\Programs\Python\Python35\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 250, in get
    self.execute(Command.GET, {'url': url})
  File "C:\Users\Venkatesh\AppData\Local\Programs\Python\Python35\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 238, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Venkatesh\AppData\Local\Programs\Python\Python35\Lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 193, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: unhandled inspector error: {"code":-32000,"message":"Cannot navigate to invalid URL"}
  (Session info: chrome=57.0.2987.133)
  (Driver info: chromedriver=2.28.455520 (cc17746adff54984afff480136733114c6b3704b),platform=Windows NT 6.2.9200 x86_64)

1 Answer:

Answer 0 (score: 2)

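Your url variable already holds the link you want to visit. Because RegionIDArray is just that same string, the inner for loop iterates over it one character at a time and calls driver.get() on each character. The first call is driver.get('h'), which is not a valid URL, and that is what raises the error.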
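To make the cause concrete, here is a small sketch that needs no browser. It reuses the link from the question's CSV and collects what the inner loop would have handed to driver.get(); the name pieces is only illustrative.

```python
# The link taken from the CSV row in the question.
url = "https://www.flipkart.com/mens-footwear/sports-shoes/pr?otracker=categorytree&page=1&sid=osp%2Ccil%2C1cu"

# Iterating over a str yields one-character strings, not whole URLs.
pieces = [reg for reg in url]

print(pieces[:5])               # ['h', 't', 't', 'p', 's']
print(len(pieces) == len(url))  # True: one loop iteration per character
```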

Since you are already iterating over the data with for row in reader:, you don't need the inner loop. Just call driver.get(url).
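A minimal sketch of the corrected script might look like the following. The helper names read_links, visit_links and main are hypothetical, not part of the original post. It also assumes that a single browser session is enough for all rows and that Selenium can find chromedriver on its own (for example because it is on PATH); if it cannot, keep passing the driver path the way the question's code does.

```python
import csv


def read_links(csvfile):
    """Return the 'link' column of every row in an open CSV file."""
    reader = csv.DictReader(csvfile)
    return [row['link'].strip() for row in reader]


def visit_links(driver, links):
    """Navigate to each link; every get() call receives a complete URL."""
    for url in links:
        driver.get(url)  # pass the whole string, no inner loop over it


def main():
    # Imported here so the helpers above can be used without Selenium.
    from selenium import webdriver

    with open('mydata.csv', newline='') as csvfile:
        links = read_links(csvfile)

    # Assumption: chromedriver is discoverable without an explicit path.
    driver = webdriver.Chrome()
    try:
        visit_links(driver, links)
    finally:
        driver.quit()  # close the browser even if a page fails to load
```

Calling main() runs it against a real browser. Creating the driver once, outside the row loop, avoids opening a new Chrome window for every CSV row.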