Iterating over multiple elements in a for loop with Selenium

Asked: 2019-03-31 09:25:48

Tags: python selenium for-loop

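I'm trying to scrape content from multiple containers on a website to check whether a certain item exists. I want to compare against a specific value, and if an item with that value is found, write that item's price and a link to where it can be bought to a CSV file.

I managed to write a for loop that iterates over the values I want to match, but I can't figure out how to use it to pull the other elements I need. It ends up returning the values from the first container on the page instead of the matching one.

I tried placing the lookups both inside and outside the for loop. I realize this doesn't work because they only find one element and aren't told which container to pull it from, but I've done something similar in other scripts and it worked fine there.

I also tried nesting loops inside each other, but for obvious reasons that didn't work either. What is the best way to handle this?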

values = WebDriverWait(driver, 2).until(EC.presence_of_all_elements_located((By.XPATH, "//*[contains(@class,'text-center') and contains(text(),'Wear:')]")))
price = driver.find_element_by_class_name("item-price-display").text
buy_link = driver.find_element_by_css_selector("a.btn-xs").get_attribute('href')
print(len(values))
for value in values:
    wear = value.text.replace("Wear: ", "")
    print(wear)
    if wear == condition:    
        print(buy_link,price)
        f.write(buy_link + "," + price)
        break

Full code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
profile = webdriver.FirefoxProfile()
profile.set_preference("permissions.default.image", 2) # Block all images to load websites faster.
driver = webdriver.Firefox(firefox_profile=profile)
f =  open("file.csv",'r+')
url = "http://bitskins.com"
driver.get(url)
elem = driver.find_element_by_name("market_hash_name")
key = "Dragon Lore"
condition = "0.11940288"
elem.send_keys(key,Keys.RETURN)
import time
time.sleep(3)
values = WebDriverWait(driver, 2).until(EC.presence_of_all_elements_located((By.XPATH, "//*[contains(@class,'text-center') and contains(text(),'Wear:')]")))
print(len(values))
for value in values:
    price = driver.find_element_by_class_name("item-price-display").text 
    buy_link = driver.find_element_by_css_selector("a.btn-xs").get_attribute('href')
    wear = value.text.replace("Wear: ", "")
    print(wear)
    if wear == condition:

        print(buy_link,price)
        f.write(buy_link + "," + price)
        break

Expected result: (Also, I'm trying to figure out how to select the fourth button rather than the first one, next to Add to Cart.)

https://bitskins.com/view_item?app_id=730&item_id=14983017710 $ 1,355.23

The result I get:

https://steamcommunity.com/profiles/76561198380422063/inventory/#730_2_15685089707 $ 1,350.00

1 Answer:

Answer 0: (score: 1)

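The problem is that price and buy_link are looked up on the whole page, so they are always the first matching elements in the page and are not related to the Wear you get from values. See the comments in the full code below.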
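
The difference between a page-wide lookup and a container-scoped lookup can be shown without a browser. The following is only a minimal sketch: it uses the standard library's xml.etree.ElementTree in place of Selenium, and the markup in PAGE is a made-up, simplified stand-in for the real listing page (only the class names are taken from the question).

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the real listing page.
PAGE = """
<div>
  <div class="item-solo">
    <span class="item-price-display">$1,350.00</span>
    <div class="text-center">Wear: 0.05000000</div>
  </div>
  <div class="item-solo">
    <span class="item-price-display">$1,355.23</span>
    <div class="text-center">Wear: 0.11940288</div>
  </div>
</div>
"""

root = ET.fromstring(PAGE)
condition = "0.11940288"

# Page-wide lookup: always returns the first price in the document,
# regardless of which item matched.
first_price = root.find(".//span[@class='item-price-display']").text

matched_price = None
for item in root.findall(".//div[@class='item-solo']"):
    wear = item.find("div[@class='text-center']").text.replace("Wear: ", "")
    if wear == condition:
        # Container-scoped lookup: searches inside this item only.
        matched_price = item.find("span[@class='item-price-display']").text
        break

print(first_price)
print(matched_price)
```

In Selenium the same idea applies: calling the find method on the container element, rather than on driver, restricts the search to that container's descendants.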

To get the 4th button, you can use the .item-solo a:nth-child(4) CSS selector. Inside the loop over items, use the following code:

shareable_link = item.find_element_by_css_selector("a:nth-child(4)")

Full code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import re

url = "http://bitskins.com"
key = "Dragon Lore"
condition = "0.11940288"

profile = webdriver.FirefoxProfile()
profile.set_preference("permissions.default.image", 2) # Block all images to load websites faster.
driver = webdriver.Firefox(firefox_profile=profile)
wait = WebDriverWait(driver, 10)

f = open("file.csv", 'r+')

driver.get(url)
wait.until(EC.element_to_be_clickable((By.NAME, "market_hash_name"))).send_keys(key, Keys.RETURN)

# get all sale item container elements
items = wait.until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "item-solo")))
print(len(items))

for item in items:
    # price, buy_link and wear elements are child of sale items
    price = item.find_element_by_class_name("item-price-display").text
    buy_link = item.find_element_by_css_selector("a.btn-xs").get_attribute('href')
    shareable_link = item.find_element_by_css_selector("a:nth-child(4)").get_attribute('href')

    wear = item.find_element_by_xpath("descendant::div[contains(@class,'text-center') and contains(text(),'Wear:')]").text
    # keep only the numeric part; the dot is escaped so it matches a literal "."
    wear = re.search(r"\d+\.\d+", wear)[0]
    print(wear)

    if wear == condition:
        print(buy_link, price)
        f.write(f"{buy_link},{price}")
        break
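
One thing to watch for when writing the row: the price shown in the question ("$ 1,355.23") contains a comma, so joining the fields with "," by hand produces three columns instead of two. A possible way around this, sketched here with the standard csv module, is to let csv.writer do the quoting. The helper name append_row is made up for illustration, and an in-memory buffer is used in place of file.csv.

```python
import csv
import io


def append_row(fileobj, buy_link, price):
    """Write one CSV row; csv.writer quotes fields that contain commas."""
    csv.writer(fileobj).writerow([buy_link, price])


# In-memory stand-in for the real CSV file.
buffer = io.StringIO()
link = "https://bitskins.com/view_item?app_id=730&item_id=14983017710"
price = "$ 1,355.23"
append_row(buffer, link, price)

# Reading the row back yields the two original fields.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows)
```

With a real file, the csv documentation recommends opening it with newline="", for example open("file.csv", "a", newline=""), so that line endings are not translated twice.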

For scraping, another library would be an easier, faster, and less resource-intensive solution.