Waiting for an element to be displayed and not displayed, using selenium, python and whatsapp-web

Time: 2019-01-18 00:23:40

Tags: python-3.x selenium selenium-webdriver whatsapp webdriverwait

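I am trying to use "whatsapp-web", "selenium" and "python 3" to find out when a WhatsApp user comes online or goes offline.

To explain further, this is how I want the script to work:

The script listens for a span (with title = online) to be displayed. When the span is displayed (meaning the user has come online), I want it to print the time at that moment. The script then keeps listening for the span to disappear, prints the time when it does, and so on.

Here is my code: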

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import datetime

driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')
driver.get('https://web.whatsapp.com/')

# do nothing until QR code scanned and whatsapp-web is accessed
input('Enter anything after scanning QR code')

# Input the name of the user to track
name = input('Enter the name of the user : ')

# find the whatsapp user to be tracked then a click to enter the conversation
user = driver.find_element_by_xpath("//span[@title = '{}']".format(name))
user.click()

while True:
   # in the conversation page, a span with title online is displayed when the user is online.
   # the web driver will wait 8hrs=28800s; if the user is not online in all this time, the script will be killed by WebDriverWait
   element = WebDriverWait(driver, 28800).until(
      EC.visibility_of_element_located(
         (By.XPATH, "//span[@title = 'online']")))

   #Moment the user came online
   now = datetime.datetime.now()
   print("online at : ")
   print(now.strftime("%H:%M:%S"))

   element = WebDriverWait(driver, 28800).until(
      EC.invisibility_of_element_located(
         (By.XPATH, "//span[@title = 'online']")))

   #Moment the user went offline
   now = datetime.datetime.now()
   print("offline at : ")
   print(now.strftime("%H:%M:%S"))
   print("************")

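My script works, but I want it to run for several hours, say 8 hours or more, and I have read that using WebDriverWait with a very high number of seconds (28800 s in my case) is bad practice.

So is there a better way to achieve this?

How can I write the output to a txt or Word file?

Any suggestions for improving my code?

How can I prevent heavy CPU load, or any other problems that might occur?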

2 answers:

Answer 0: (score: 1)

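WebDriverWait is nothing more than a (quite) fancy while/catch/sleep loop. In your particular case you might want to replicate it yourself for one simple reason: it polls every 500 ms by default, which is probably too fine-grained for this task. It also shields you from more granular control.

Here is how you could implement the logic yourself: keep a boolean variable for whether the user is online or not; depending on its value, check whether the element is visible (.is_displayed()), sleep for X amount of time, and repeat. The exceptions NoSuchElementException and StaleElementReferenceException count as the user being offline, i.e. the boolean is false.

In the end your code will be very close to the logic in WebDriverWait, but it will be your own, and more flexible if you need it to be.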
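The hand-rolled loop described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the names `watch_status` and `make_selenium_probe` are hypothetical, the user is assumed to start offline, and the Selenium imports are done lazily so that the polling logic itself can be tried without a browser.

```python
import datetime
import time


def watch_status(probe, poll_seconds=5.0, max_polls=None, sleep=time.sleep):
    """Yield (is_online, timestamp) each time probe() changes its answer.

    probe     -- hypothetical callable returning True while the "online" span is visible
    max_polls -- stop after this many polls (None means run until interrupted)
    """
    online = False  # assumption: the user starts offline
    polls = 0
    while max_polls is None or polls < max_polls:
        current = probe()
        if current != online:
            online = current
            yield online, datetime.datetime.now()
        polls += 1
        sleep(poll_seconds)


def make_selenium_probe(driver, xpath="//span[@title = 'online']"):
    """Build a probe for a Selenium driver; a missing or stale element counts as offline."""
    # Imported lazily so watch_status() can be exercised without Selenium installed.
    from selenium.common.exceptions import (
        NoSuchElementException,
        StaleElementReferenceException,
    )
    from selenium.webdriver.common.by import By

    def probe():
        try:
            return driver.find_element(By.XPATH, xpath).is_displayed()
        except (NoSuchElementException, StaleElementReferenceException):
            return False

    return probe
```

With a real driver this would be used as `for online, when in watch_status(make_selenium_probe(driver)): print(online, when)`. Because the loop never times out, there is no 8-hour limit to work around, and the polling interval is whatever you pass as `poll_seconds`.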


Alternatively, just pass a larger polling interval to WebDriverWait in your current code; that is the poll_frequency parameter of the call :)

WebDriverWait(driver, 28800, 5)  # the value is in seconds

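I don't know where you read that using WebDriverWait with a high number of seconds is bad practice, or what exactly it said; as you can see in its code, the timeout is simply how long the method is allowed to run.
I think the gist of that advice is "using WebDriverWait with a very high number of seconds is bad practice, because if the condition is not met within X seconds it will never be met, and your code will just spin". In your case that is actually the behavior you want :)

I also wouldn't worry about taxing the CPU; these checks are very lightweight and do no harm. For a run this long, what would worry me is memory leaks in the browser itself ;)


As for optimizing the code, what I would do is reduce the repetition of statements; the downside is that it becomes a bit less readable. My take: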

user_online = False

while True:
    # we'll be checking for the reverse of the last status of the user
    check_method = EC.visibility_of_element_located if not user_online else EC.invisibility_of_element_located

    # in the conversation page, a span with title online is displayed when the user is online.
    # the web driver will wait 8hrs=28800s for the user status to change;
    # the script will be killed by WebDriverWait if that doesn't happen
    element = WebDriverWait(driver, 28800, 5).until(
            check_method((By.XPATH, "//span[@title = 'online']")))

    # The moment the user changed status
    now = datetime.datetime.now().strftime("%H:%M:%S")
    print("{} at : {}".format('online' if not user_online else 'offline', now))   # if you're using python v3.6 or more, the fstrings are much more convenient for this
    print("************")

    user_online = not user_online   # switch, to wait for the other status in the next cycle
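As the comment in the code above mentions, f-strings (Python 3.6+) make the status line shorter. A minimal sketch, where `status_line` is a hypothetical helper name and the argument is the status before the change, as in the loop above:

```python
import datetime


def status_line(user_was_online, when):
    """Format the new status, given the status before the change."""
    new_status = "offline" if user_was_online else "online"
    # A datetime accepts strftime codes directly in the f-string format spec.
    return f"{new_status} at : {when:%H:%M:%S}"


print(status_line(False, datetime.datetime.now()))
```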

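Finally, code-wise, the script cannot run "endlessly". Why? Because if the user does not change status within 8 hours, WebDriverWait raises a TimeoutException and the script stops. To work around that, wrap the loop body in try/except: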

from selenium.common.exceptions import TimeoutException  # put this in the beginning of the file

while True:
    try:
        pass  # placeholder: the loop body from above goes here
    except TimeoutException:
        # the status did not change, repeat the cycle
        pass
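The retry-on-timeout idea can also be isolated in a small helper. This is a sketch, not part of Selenium: `wait_until_changed` is a hypothetical name, and the exception type is a parameter so the helper can be tried without a browser. With Selenium you would pass `timeout_error=TimeoutException` and a `wait_once` that calls `WebDriverWait(driver, 28800, 5).until(...)`.

```python
def wait_until_changed(wait_once, timeout_error=TimeoutError):
    """Call wait_once() until it returns without raising timeout_error.

    wait_once     -- hypothetical callable performing one bounded wait
    timeout_error -- exception type meaning "nothing changed yet, wait again"
    """
    while True:
        try:
            return wait_once()
        except timeout_error:
            # The status did not change within the timeout; start another wait.
            continue
```

Any other exception (for example, the browser having been closed) still propagates, so the script does not loop forever on a real failure.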

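Writing to a file

You might want to read a bit on how to do that; it is a very simple operation.

Here is an example: open a file for appending (so that previous logs are kept), wrapped around the while loop: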

with open("usermonitor.log", "a") as myfile:
    while True:
        # the other code is not repeated for brevity
        # ...
        output = "{} at : {}".format('online' if not user_online else 'offline', now)
        print(output)
        myfile.write(output + "\n")  # this will write (append as the last line) the same text in the file
        # write() does not append newlines by itself - you have to do it yourself
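A variation on the example above, as a sketch: for a script that runs for many hours, it may be preferable to open the file once per event, so that every line is flushed to disk as soon as it is written. The helper name `log_status` is hypothetical, and including the date in the timestamp is an assumption made here because a run of 8 hours or more can cross midnight.

```python
import datetime


def log_status(path, status, when=None):
    """Append one '<status> at : <timestamp>' line to path and return the line."""
    when = when or datetime.datetime.now()
    line = "{} at : {}".format(status, when.strftime("%Y-%m-%d %H:%M:%S"))
    # Opening in append mode for each event closes (and flushes) the file every time,
    # so the log stays intact even if the script is killed later.
    with open(path, "a", encoding="utf-8") as logfile:
        logfile.write(line + "\n")
    return line
```

In the monitoring loop this would replace the `myfile.write(...)` call, for example `print(log_status("usermonitor.log", "online"))`.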

Answer 1: (score: 0)

One thing I would suggest: with your program you have to scan the WhatsApp QR code every time you run it. Just replace this line

driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')

with this


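options = webdriver.ChromeOptions()  # options must be a ChromeOptions object, not a string
options.add_argument("user-data-dir=C:\\Users\\<username>\\AppData\\Local\\Google\\Chrome\\User Data\\whtsap")
driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe', options=options)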

This way you still need to scan the QR code, but only once.