Selenium not returning the current URL

Date: 2019-02-03 21:15:24

Tags: python selenium

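I'm setting up a Python script that visits a freight company's website, enters a tracking number, and retrieves the shipment and delivery dates, but I'm having trouble with Selenium's current_url. My code opens the initial browser window, enters the tracking number, and reaches the shipment status page, but it cannot find the table on the results page. At first I thought the problem was how I was locating the table, but then I found that the URL reported for the results page is still the same as the initial URL I started from. I even added an implicit wait to make sure the page had loaded, and the URL still did not change. My code is below.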

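I suspect there is still a problem with how I locate the table element on the results page, but I can't tell until I know I'm searching the right URL, so I need to solve that first. Any help would be greatly appreciated.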

Thanks, Max

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Chrome('C:/Users/USER/chromedriver_win32/chromedriver.exe')
driver.get("http://www.dovelogistics.com/track-shipment/")

elem = driver.find_element_by_name("txtInputNo")
elem.clear()
elem.send_keys("224893")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source

driver.implicitly_wait(5)

resultsPage = driver.current_url
driver.get(resultsPage)

driver.get("http://206.50.6.81/WebtrakWT/shipinquiry/ShipInfo.aspx?OrderNo=26198&Back=ShipLookup&TrackType=HousebillNo&TrackNo=224893")

elem = driver.find_element_by_xpath("//*[@id='Table5']")
print (elem)

driver.close()   

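2 Answers:

Answer 0 (score: 2)

You should accept Jens Dibbern's solution, but I just wanted to point out that once you have the right URL, you can also use pandas to pull the table out for parsing: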

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd


driver = webdriver.Chrome()
driver.get("http://www.dovelogistics.com/track-shipment/")

# Enter the tracking number and submit the form.
elem = driver.find_element(By.NAME, "txtInputNo")
elem.clear()
elem.send_keys("224893")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source

driver.implicitly_wait(5)

# The results open in a second window, so switch to it before reading the URL.
driver.switch_to.window(driver.window_handles[1])
resultsPage = driver.current_url
driver.get(resultsPage)

html = driver.page_source

# read_html returns one DataFrame per <table>; the status history is index 5.
tables = pd.read_html(html)
table = tables[5]

driver.close()

Output:

print (table)
                        0                                                  1
0       Status Updated On                                               Note
1    12/4/2018 1:07:00 PM                  Shipment Status changed to: Rated
2    12/4/2018 1:07:00 PM  Signed for By: Delivered at KWA 1:07:00 PM 12/...
3    12/4/2018 9:37:43 AM                     Email Status Notification Sent
4   12/2/2018 11:50:00 AM      Shipment Status changed to: Shipment Departed
5    12/1/2018 2:12:00 PM       Shipment Status changed to: Shipment Arrived
6   12/1/2018 10:39:00 AM      Shipment Status changed to: Shipment Departed
7    12/1/2018 9:28:00 AM       Shipment Status changed to: Shipment Arrived
8   11/30/2018 2:53:55 PM  Shipment Status changed to: Shipment Departed ...
9   11/28/2018 8:42:23 PM  Shipment Status changed to: On-Hand At Origin ...
10  11/28/2018 5:53:47 PM  Shipment Status changed to: Dispatched for Pickup
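Since the question is after the shipment and delivery dates, here is a minimal sketch of pulling them out of rows like the ones printed above. It uses only the standard library. The rows are copied from the output, with notes left truncated ("...") where the printout truncated them. The helper name `first_event` is made up for illustration, and treating "Shipment Departed" as the ship event and "Delivered" as the delivery event is an assumption about what these statuses mean.

```python
from datetime import datetime

# (timestamp, note) pairs copied from the printed table; truncated notes are
# left truncated exactly as the printout shows them.
ROWS = [
    ("12/4/2018 1:07:00 PM", "Shipment Status changed to: Rated"),
    ("12/4/2018 1:07:00 PM", "Signed for By: Delivered at KWA 1:07:00 PM 12/..."),
    ("12/4/2018 9:37:43 AM", "Email Status Notification Sent"),
    ("12/2/2018 11:50:00 AM", "Shipment Status changed to: Shipment Departed"),
    ("12/1/2018 2:12:00 PM", "Shipment Status changed to: Shipment Arrived"),
    ("12/1/2018 10:39:00 AM", "Shipment Status changed to: Shipment Departed"),
    ("12/1/2018 9:28:00 AM", "Shipment Status changed to: Shipment Arrived"),
    ("11/30/2018 2:53:55 PM", "Shipment Status changed to: Shipment Departed ..."),
    ("11/28/2018 8:42:23 PM", "Shipment Status changed to: On-Hand At Origin ..."),
    ("11/28/2018 5:53:47 PM", "Shipment Status changed to: Dispatched for Pickup"),
]

# Month/day/year with a 12-hour clock, matching the site's timestamps.
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def first_event(rows, keyword):
    """Return the earliest timestamp whose note contains keyword, or None."""
    times = [
        datetime.strptime(stamp, TIMESTAMP_FORMAT)
        for stamp, note in rows
        if keyword in note
    ]
    return min(times) if times else None


# Assumption: the first departure is the ship date, "Delivered" the delivery.
shipped = first_event(ROWS, "Shipment Departed")
delivered = first_event(ROWS, "Delivered")
print("shipped:", shipped)
print("delivered:", delivered)
```

With the DataFrame from the code above, something like `table.iloc[1:].values.tolist()` should yield rows in this shape (the first row holds the column titles).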

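Answer 1 (score: 1)

The site opens the results in another window, so you have to switch to that window. They also use the same table id more than once. This should help: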

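from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get('http://www.dovelogistics.com/track-shipment/')
elem = driver.find_element(By.NAME, "txtInputNo")
elem.clear()
elem.send_keys("224893")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source

# The results open in a second window; switch to it before reading the URL.
driver.switch_to.window(driver.window_handles[1])
print(driver.current_url)

# The page reuses this id, so fetch every match (find_elements returns a list).
elems = driver.find_elements(By.ID, "Table1")
print(elems)

driver.close()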
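Indexing `driver.window_handles[1]` right after pressing RETURN can fail if the second window has not opened yet. A minimal sketch of a more defensive switch follows. `switch_to_new_window` is a hypothetical helper, not part of Selenium. It relies only on `driver.window_handles` and `driver.switch_to.window()`, and the timeout and poll values are arbitrary. Selenium's own `WebDriverWait` with `expected_conditions.number_of_windows_to_be(2)` should serve the same purpose.

```python
import time


def switch_to_new_window(driver, original_handle, timeout=10.0, poll=0.25):
    """Wait for a window other than original_handle, switch to it, return its handle.

    Hypothetical helper, not a Selenium API. It polls driver.window_handles
    until a handle different from original_handle shows up.
    """
    deadline = time.monotonic() + timeout
    while True:
        new_handles = [h for h in driver.window_handles if h != original_handle]
        if new_handles:
            # Assumption: the most recently listed handle is the results window.
            driver.switch_to.window(new_handles[-1])
            return new_handles[-1]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window appeared within %s seconds" % timeout)
        time.sleep(poll)
```

To use it, store `original = driver.current_window_handle` before sending the tracking number, then call `switch_to_new_window(driver, original)` in place of the indexed switch; `driver.current_url` should then report the results page.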

You will have to deal with their nested tables and duplicate id attributes.
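One way to sidestep the duplicate ids is to pick the table by its header row instead of its id. The sketch below uses only the standard library's `html.parser`. `TableCollector` and `find_tables_by_header` are hypothetical names, and `SAMPLE` is made-up markup that imitates a nested table with a repeated id; it is not copied from the real site. In practice `driver.page_source` would be passed in instead.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> as {"id": ..., "rows": [[cell text, ...], ...]}.

    Cell text is credited to the innermost open table only, so a layout
    table wrapping a data table does not swallow the inner rows.
    """

    def __init__(self):
        super().__init__()
        self.tables = []  # every table seen, in document order
        self._open = []   # stack of tables that are currently open

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            table = {"id": dict(attrs).get("id"), "rows": [], "cell": None}
            self.tables.append(table)
            self._open.append(table)
        elif self._open:
            table = self._open[-1]
            if tag == "tr":
                table["rows"].append([])
            elif tag in ("td", "th") and table["rows"]:
                table["cell"] = []  # start buffering this cell's text

    def handle_endtag(self, tag):
        if not self._open:
            return
        table = self._open[-1]
        if tag in ("td", "th") and table["cell"] is not None:
            # Join the buffered text and collapse runs of whitespace.
            text = " ".join("".join(table["cell"]).split())
            table["rows"][-1].append(text)
            table["cell"] = None
        elif tag == "table":
            self._open.pop()

    def handle_data(self, data):
        if self._open and self._open[-1]["cell"] is not None:
            self._open[-1]["cell"].append(data)


def find_tables_by_header(html, header):
    """Return the rows of every table whose first row equals header."""
    parser = TableCollector()
    parser.feed(html)
    return [t["rows"] for t in parser.tables
            if t["rows"] and t["rows"][0] == header]


# Made-up markup: an outer layout table and an inner data table share one id.
SAMPLE = """
<table id="Table1"><tr><td>
  <table id="Table1">
    <tr><td>Status Updated On</td><td>Note</td></tr>
    <tr><td>12/4/2018 1:07:00 PM</td><td>Shipment Status changed to: Rated</td></tr>
  </table>
</td></tr></table>
"""

matches = find_tables_by_header(SAMPLE, ["Status Updated On", "Note"])
print(matches[0][1])
```

Matching on the header text avoids depending on which of the same-id tables a lookup by id happens to return, at the cost of breaking if the site renames its columns.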