Question

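I'm trying to click some image links, then retrieve each image's URL and save it as a jpg. Instead of the image URL I want, I keep getting the URL of the web page I was on before the click.

It saves jpg files named with a timestamp, but they contain no image because I'm getting the wrong URL.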

ts = time.time()
WebDriverWait(driver, 100).until(
    EC.presence_of_element_located((By.XPATH, "/html/head/meta")))

img_url = driver.current_url

print img_url

urllib.urlretrieve(img_url, "/home/ro/A_Python_Scripts/tumblrr_auto/Pics/test_pics/%d.jpg" % (ts))

When I click a link, I get HTML like this:

<html>
<head>
<meta name="viewport" content="width=device-width; height=device-height;">
<link rel="stylesheet" href="resource://gre/res/ImageDocument.css">
<link rel="stylesheet" href="resource://gre/res/TopLevelImageDocument.css">
<link rel="stylesheet" href="chrome://global/skin/media/TopLevelImageDocument.css">
<title>3760968-1135171246-cc%5B.jpg (JPEG Image, 704 × 400 pixels) - Scaled (91%)</title>

I can get it to work with an implicit wait, but I need an explicit wait.

Answer 1

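I think the problem is that you are waiting for a META tag to be present. One probably already exists on the page you start from, so the wait returns immediately and execution continues, which gets you the URL of the current (first) page.

In this situation I usually do one of two things:

Either wait for a specific (unique) element on the second page... something that exists on the second page but not on the first.

or

Wait for an element on the first page to go stale (which indicates the browser is changing pages), then grab the element I want (from the second page).

I don't know enough about your pages to provide specific code, but here are some examples: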

WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "some id that exists on page2 but not on page1")))
# once the line above passes, I know I'm on the second page... do stuff
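Applied to the question, the first option might look like the sketch below. It is written for Python 3 (the question's snippet is Python 2), and it assumes the image opens in Firefox's standalone image viewer, whose markup in the question links TopLevelImageDocument.css. The selector, the function names, and the timeout are illustrative assumptions, not part of the asker's script.

```python
import os
import time
import urllib.request

# Assumption: only the image viewer page (see the HTML in the question) links
# this stylesheet, so its presence means the second page has loaded.
IMAGE_PAGE_MARKER = "link[href*='TopLevelImageDocument.css']"


def timestamped_jpg_path(directory, ts):
    """Build '<directory>/<whole seconds>.jpg', like '%d.jpg' % ts in the question."""
    return os.path.join(directory, "%d.jpg" % ts)


def save_opened_image(driver, directory, timeout=10):
    """Wait for the image page to load, then download driver.current_url."""
    # Imported here so the path helper above can be used without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, IMAGE_PAGE_MARKER)))
    img_url = driver.current_url  # read only after the wait has passed
    path = timestamped_jpg_path(directory, time.time())
    urllib.request.urlretrieve(img_url, path)
    return path
```

If the marker never appears (for example, in a browser other than Firefox), the wait raises TimeoutException instead of silently saving the wrong URL.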

or

old_element = driver.find_element(By.ID, "some id that exists on page1 but not on page2")
# ... click the link here; the element must be located before the page changes ...
WebDriverWait(driver, 10).until(EC.staleness_of(old_element))
# once the line above passes, I know I'm transitioning to the second page... do stuff...
# may need to wait for an element on the 2nd page to exist, be clickable, etc.
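As a rough sketch of the staleness approach, the helper below grabs the first page's <html> element before clicking, since EC.staleness_of expects an element that was already located, not a locator tuple. The function name and the choice of the <html> element are assumptions made for illustration. It also assumes the link triggers a full navigation in the same tab.

```python
def click_and_wait_for_new_page(driver, link, timeout=10):
    """Click `link`, then block until the old page's <html> element goes stale."""
    # Imported inside the function so that defining it does not require Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Locate an element of the FIRST page before anything changes.
    old_page = driver.find_element(By.TAG_NAME, "html")
    link.click()
    # Passes once the old document has been discarded by the browser.
    WebDriverWait(driver, timeout).until(EC.staleness_of(old_page))
    return driver.current_url
```

If the click opens the image in an overlay rather than navigating, the old page never goes stale and the wait ends in a TimeoutException. As the comments above note, a further wait on something from the second page may still be needed before using the returned URL.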

The examples above use IDs, but those may not be available, so you can change them to whatever locator applies.
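If no convenient locator exists, one alternative (a swapped-in technique, not one of the two options above) is to wait on the URL itself. WebDriverWait.until accepts any callable that takes the driver and returns a truthy value once the condition holds, so a custom condition can be an ordinary function. The sketch below assumes image URLs end in .jpg or .jpeg and that the page holding the links does not. The names current_url_is_image and IMAGE_EXTENSIONS are hypothetical.

```python
import os
from urllib.parse import urlparse

# Assumption: the images being saved are JPEGs, as in the question.
IMAGE_EXTENSIONS = (".jpg", ".jpeg")


def current_url_is_image(driver):
    """Custom wait condition: return the current URL once it points at a JPEG.

    Returns False otherwise, which tells WebDriverWait to keep polling.
    """
    url = driver.current_url
    # Look only at the path, so query strings like '?size=large' are ignored.
    extension = os.path.splitext(urlparse(url).path)[1].lower()
    return url if extension in IMAGE_EXTENSIONS else False


# Usage with a live driver (requires Selenium):
#   from selenium.webdriver.support.ui import WebDriverWait
#   img_url = WebDriverWait(driver, 10).until(current_url_is_image)
```

Because until returns the condition's truthy result, the wait hands back the image URL directly. If the first page's own URL already ends in .jpg, this condition passes immediately and one of the element-based waits is the safer choice.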
