Question

My dilemma is that if I use

a = driver.find_elements_by_class_name("card")
random.shuffle(a)
for card in a:
    nextup = str(card.text) + '\n' + "_" * 15
    # do a bunch of stuff that takes about 10 min

the first iteration works, but after that I get a StaleElementException, because the script clicks a link and goes to a different page. So I switched to this:

a = driver.find_elements_by_class_name("card")
i = 0
cardnum = len(a)
while i != cardnum:
    i += 1  # needed because the first element that's found doesn't work
    a = driver.find_elements_by_class_name("card")  # also needed to refresh the list
    random.shuffle(a)
    nextup = str(card.text) + '\n' + "_" * 15
    # do a bunch of stuff that takes about 10 min

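The problem with this one is the i variable: because the list is shuffled on every loop, the same card can be clicked more than once. I then added a check to see whether a card has already been clicked and to continue if it has. That sounds like it should work, but sadly the i variable counts those skipped iterations too, and eventually it runs past the index. I thought about periodically resetting i to 1, but I don't know whether that would work. Edit: that would produce an infinite loop, because once every card has been clicked, i would be reset again and the loop would never exit.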

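I know the code itself works; it has been tested extensively. However, bots get banned for not behaving like a human, that is, for not being random. The basis of this script is to go through a list of categories and then through all the cards in each category. I tried randomizing the categories, but ran into a similar dilemma: to refresh the list you have to rebuild the array on every loop, as in the block above, and then categories that are already done get clicked again... Any suggestions would be greatly appreciated.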

Answer 1

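What is happening here is that as you interact with the page, the DOM gets refreshed, which eventually makes the elements you stored go stale.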

Instead of keeping a list of elements, keep a reference to each element's path and re-fetch the element when you need it:

import random

# The base CSS path for all the cards
base_card_css_path = ".card"

# Get all the target elements. You are only doing this to
# get a count of the number of elements on the page
card_elems = driver.find_elements_by_css_selector(base_card_css_path)

# Convert this to a list of indexes, starting with 1
# (`:nth-child` is 1-based)
card_indexes = list(range(1, len(card_elems) + 1))

# Shuffle it
random.shuffle(card_indexes)

# Use `:nth-child(X)` syntax to get these elements on an as-needed basis.
# Note: `:nth-child` counts every sibling, so this assumes the cards are
# the only children of a single parent element.
for index in card_indexes:
    card_css = base_card_css_path + ":nth-child({0})".format(index)
    card = driver.find_element_by_css_selector(card_css)
    nextup = str(card.text) + '\n' + "_" * 15
    # do a bunch of stuff that takes about 10 min

(For obvious reasons, the above is untested.)
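
If the cards are not the only children of one parent, `:nth-child` may not line up with the list positions. A possible variation, sketched below, shuffles the list positions once and re-fetches the whole list before each card, picking the element by position. This is only a sketch under assumptions: it assumes the page returns the cards in the same order on every load, and the names `visit_in_random_order`, `fetch_all` and `process` are hypothetical helpers, not part of Selenium.

```python
import random


def visit_in_random_order(fetch_all, process, rng=random):
    """Call process() once per element, in shuffled order, re-fetching each time.

    fetch_all: hypothetical zero-argument callable returning the current
               list of elements (for example, a lambda wrapping the
               driver's find-elements call for the cards).
    process:   hypothetical callable doing the slow per-card work; it may
               navigate away and back, which makes old references stale.
    rng:       anything with a shuffle() method (the random module or a
               random.Random instance).
    """
    # Shuffle the positions once, up front, so no card is picked twice
    # and the loop always terminates.
    order = list(range(len(fetch_all())))
    rng.shuffle(order)

    visited = []
    for position in order:
        # Re-fetch so the element belongs to the current DOM.
        elements = fetch_all()
        if position >= len(elements):
            # Assumption: if the page now has fewer cards, skip the gap.
            continue
        process(elements[position])
        visited.append(position)
    return visited
```

With the driver from the question, this could be called as `visit_in_random_order(lambda: driver.find_elements_by_class_name("card"), handle_card)`, where `handle_card` stands for the ten-minute per-card work. Because the order is fixed before the loop starts, there is no counter to reset and no need to check whether a card was already clicked. The same helper could be used for the outer list of categories.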

Loop through a shuffled array of WebElements without them going stale
