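I am setting up a Python script that opens a freight company's website, enters a tracking number, and retrieves the ship and delivery dates, and I am having trouble with Selenium's current_url property. My code opens the initial browser window, enters the tracking number, and gets to the shipment status page, but it cannot find the table element on the results page. At first I thought the problem was how I was locating the table, but then I found that the URL reported for the results page is still the initial URL I started from. I even added an implicit wait to make sure the page had loaded, and the URL still did not change.
I think there is still a problem with how I locate the table element on the results page, but I cannot tell until I know I am searching the right URL, so I need to solve that first. Any help would be greatly appreciated.
Thanks, Max
Here is my code: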
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome('C:/Users/USER/chromedriver_win32/chromedriver.exe')
driver.get("http://www.dovelogistics.com/track-shipment/")
elem = driver.find_element_by_name("txtInputNo")
elem.clear()
elem.send_keys("224893")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source
driver.implicitly_wait(5)
resultsPage = driver.current_url
driver.get(resultsPage)
driver.get("http://206.50.6.81/WebtrakWT/shipinquiry/ShipInfo.aspx?OrderNo=26198&Back=ShipLookup&TrackType=HousebillNo&TrackNo=224893")
elem = driver.find_element_by_xpath("//*[@id='Table5']")
print (elem)
driver.close()
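Answer 0 (score: 2)
You should accept Jens Dibbern's solution, but I just wanted to point out that once you are on the right URL, you can also use pandas to pull out the table for parsing: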
from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("http://www.dovelogistics.com/track-shipment/")

# Submit the tracking number; the site opens the results in a second window.
# find_element(By.NAME, ...) replaces the older find_element_by_name spelling,
# which is gone from recent Selenium 4 releases.
elem = driver.find_element(By.NAME, "txtInputNo")
elem.clear()
elem.send_keys("224893")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source

# Switch to the results window before reading its URL or HTML.
driver.implicitly_wait(5)
driver.switch_to.window(driver.window_handles[1])
resultsPage = driver.current_url
driver.get(resultsPage)

# read_html returns one DataFrame per <table>; index 5 is the status history.
html = driver.page_source
tables = pd.read_html(StringIO(html))
table = tables[5]
driver.close()
Output:
print(table)
    0                      1
0   Status Updated On      Note
1   12/4/2018 1:07:00 PM   Shipment Status changed to: Rated
2   12/4/2018 1:07:00 PM   Signed for By: Delivered at KWA 1:07:00 PM 12/...
3   12/4/2018 9:37:43 AM   Email Status Notification Sent
4   12/2/2018 11:50:00 AM  Shipment Status changed to: Shipment Departed
5   12/1/2018 2:12:00 PM   Shipment Status changed to: Shipment Arrived
6   12/1/2018 10:39:00 AM  Shipment Status changed to: Shipment Departed
7   12/1/2018 9:28:00 AM   Shipment Status changed to: Shipment Arrived
8   11/30/2018 2:53:55 PM  Shipment Status changed to: Shipment Departed ...
9   11/28/2018 8:42:23 PM  Shipment Status changed to: On-Hand At Origin ...
10  11/28/2018 5:53:47 PM  Shipment Status changed to: Dispatched for Pickup
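Since the original goal was the ship and delivery dates, the rows printed above could be reduced to those two timestamps. What follows is only a minimal sketch and not part of the answer's code: `find_event_time`, `TIMESTAMP_FORMAT`, and `sample_rows` are hypothetical names, and treating "Shipment Departed" and "Delivered" as the markers for shipping and delivery is an assumption based solely on the notes visible in this output.

```python
from datetime import datetime

# Timestamp layout seen in the printed table, e.g. "12/4/2018 1:07:00 PM".
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def find_event_time(rows, keyword, earliest=True):
    """Return the earliest (or latest) timestamp whose note contains keyword.

    rows is an iterable of (timestamp_text, note_text) pairs. Rows whose
    timestamp does not parse (such as the header row) are skipped.
    Returns None when nothing matches.
    """
    matches = []
    for timestamp_text, note_text in rows:
        if keyword.lower() not in str(note_text).lower():
            continue
        try:
            parsed = datetime.strptime(str(timestamp_text).strip(), TIMESTAMP_FORMAT)
        except ValueError:
            continue  # not a timestamp, e.g. "Status Updated On"
        matches.append(parsed)
    if not matches:
        return None
    return min(matches) if earliest else max(matches)


# A few rows copied from the printed output (notes shortened as displayed).
sample_rows = [
    ("Status Updated On", "Note"),
    ("12/4/2018 1:07:00 PM", "Shipment Status changed to: Rated"),
    ("12/4/2018 1:07:00 PM", "Signed for By: Delivered at KWA 1:07:00 PM 12/..."),
    ("12/2/2018 11:50:00 AM", "Shipment Status changed to: Shipment Departed"),
    ("11/30/2018 2:53:55 PM", "Shipment Status changed to: Shipment Departed ..."),
    ("11/28/2018 5:53:47 PM", "Shipment Status changed to: Dispatched for Pickup"),
]

# Assumption: the first departure is the ship date, the last delivery note the delivery date.
shipped_on = find_event_time(sample_rows, "Shipment Departed", earliest=True)
delivered_on = find_event_time(sample_rows, "Delivered", earliest=False)
print(shipped_on)    # 2018-11-30 14:53:55
print(delivered_on)  # 2018-12-04 13:07:00
```

With the DataFrame from the answer, `table.values.tolist()` should yield rows in the same two-column shape, though that has not been checked against the live page.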
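Answer 1 (score: 1)
The site opens the results in another window, so you have to switch to that window. They also use the same table id more than once. This should help: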
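from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get('http://www.dovelogistics.com/track-shipment/')
elem = driver.find_element(By.NAME, "txtInputNo")
elem.clear()
elem.send_keys("224893")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source

# The results open in a second window; switch to it before reading the URL.
driver.switch_to.window(driver.window_handles[1])
print(driver.current_url)

# The id "Table1" is used more than once, so fetch every match as a list.
elems = driver.find_elements(By.ID, "Table1")
print(elems)
driver.close()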
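Indexing `window_handles[1]` right after submitting the form can fail if the second window has not opened yet. One possible way around that is to poll until a handle other than the original appears. This is a sketch under assumptions: `switch_to_new_window` is a hypothetical helper, and it relies only on `driver.window_handles` and `driver.switch_to.window`, so it has not been tuned to this particular site.

```python
import time


def switch_to_new_window(driver, original_handle, timeout=10.0, poll=0.25):
    """Wait for a window other than original_handle, switch to it, return its handle.

    Raises TimeoutError if no additional window shows up within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Any handle that is not the starting window counts as "new".
        new_handles = [h for h in driver.window_handles if h != original_handle]
        if new_handles:
            driver.switch_to.window(new_handles[-1])
            return new_handles[-1]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window opened within %.1f s" % timeout)
        time.sleep(poll)


# Intended use with a real driver (not executed here):
#   original = driver.current_window_handle
#   ... submit the tracking form ...
#   switch_to_new_window(driver, original)
#   print(driver.current_url)  # should now report the results page
```

Comparing against the original handle, rather than relying on position 1 in the list, avoids assuming anything about the order in which handles are returned.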
You will still have to deal with the page's nested tables and duplicate id attributes.
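One way to cope with nested tables and repeated ids is to keep every table separate and pick the right one by its header text instead of its id. The sketch below uses only the standard library's `html.parser`. `TableCollector`, `find_table_by_header`, and the sample markup are hypothetical; the real page's structure is not known here, so the HTML is only an illustration of the nesting problem.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect each <table> separately, even when tables nest or share an id."""

    def __init__(self):
        super().__init__()
        self.tables = []   # every table in document order: {"id": ..., "rows": [...]}
        self._stack = []   # (table, suspended outer cell) pairs, innermost last
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            table = {"id": dict(attrs).get("id"), "rows": []}
            self.tables.append(table)
            # Suspend the enclosing cell so nested text does not leak into it.
            self._stack.append((table, self._cell))
            self._cell = None
        elif tag == "tr" and self._stack:
            self._stack[-1][0]["rows"].append([])
        elif tag in ("td", "th") and self._stack:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            rows = self._stack[-1][0]["rows"]
            if not rows:
                rows.append([])  # cell without an explicit <tr>
            rows[-1].append(" ".join(self._cell))
            self._cell = None
        elif tag == "table" and self._stack:
            _, outer_cell = self._stack.pop()
            self._cell = outer_cell  # resume the enclosing cell, if any

    def handle_data(self, data):
        text = data.strip()
        if text and self._cell is not None:
            self._cell.append(text)


def find_table_by_header(tables, header_text):
    """Return the first table whose first row contains header_text, else None."""
    for table in tables:
        if table["rows"] and header_text in table["rows"][0]:
            return table
    return None


# Illustrative markup only: an outer and an inner table sharing the same id.
sample_html = """
<table id="Table1">
  <tr><td>Outer cell
    <table id="Table1">
      <tr><th>Status Updated On</th><th>Note</th></tr>
      <tr><td>12/4/2018 1:07:00 PM</td><td>Shipment Status changed to: Rated</td></tr>
    </table>
  </td></tr>
</table>
"""

collector = TableCollector()
collector.feed(sample_html)
status_table = find_table_by_header(collector.tables, "Status Updated On")
print(status_table["rows"])
```

In a real run the parser would presumably be fed `driver.page_source` after switching to the results window; whether the header text "Status Updated On" is unique on that page is an assumption.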