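I am trying to use whatsapp-web, selenium and python 3 to find out when a WhatsApp user goes online or offline.
To explain further, this is how I want the script to work:
The script listens for a span (with title = online) to appear. When the span is displayed (which means the user came online), I want the current time printed. The script then keeps listening for the span to disappear, prints the time when it does, and so on.
Here is my code: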
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import datetime
driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')
driver.get('https://web.whatsapp.com/')
# do nothing until QR code scanned and whatsapp-web is accessed
input('Enter anything after scanning QR code')
# Input the name of the user to track
name = input('Enter the name of the user : ')
# find the whatsapp user to be tracked then a click to enter the conversation
user = driver.find_element_by_xpath("//span[@title = '{}']".format(name))
user.click()
while True:
    # in the conversation page, a span with title online is displayed when the user is online.
    # the web driver will wait 8hrs=28800s; if the user is not online all this time, the script will be killed by WebDriverWait
    element = WebDriverWait(driver, 28800).until(
        EC.visibility_of_element_located(
            (By.XPATH, "//span[@title = 'online']")))
    # Moment the user came online
    now = datetime.datetime.now()
    print("online at : ")
    print(now.strftime("%H:%M:%S"))
    element = WebDriverWait(driver, 28800).until(
        EC.invisibility_of_element_located(
            (By.XPATH, "//span[@title = 'online']")))
    # Moment the user went offline
    now = datetime.datetime.now()
    print("offline at : ")
    print(now.strftime("%H:%M:%S"))
    print("************")
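My script works, but I want it to run for hours, e.g. 8 hours or more, and I have read that using WebDriverWait with a very high number of seconds (28800s in my case) is bad practice.
So is there a better way to achieve this?
How can I write the output to a txt or Word file?
Any suggestions for improving my code?
How can I prevent hammering the CPU, or any other problems that might occur?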
Answer 0 (score: 1)
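WebDriverWait is nothing more than a (quite) fancy while/catch/sleep loop. In your particular case you may want to replicate it yourself, for one simple reason: by default it polls every 500 ms, which is probably finer-grained than this task needs. It also keeps you from having more granular control.
Here is how you could do the logic yourself: keep a boolean variable for whether the user is online or not; depending on its value, check whether the element is visible (.is_displayed()), sleep for X amount of time, and repeat. The exceptions NoSuchElementException and StaleElementReferenceException are to be treated as the user being offline, i.e. boolean false.
In the end your code will be very close to the logic in WebDriverWait, but it is your own and more flexible when needed.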
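A minimal sketch of the hand-rolled loop described above. It is not taken from Selenium; the names `watch_status`, `probe`, `offline_errors` and `max_polls` are hypothetical and chosen for illustration. The status check is passed in as a callable so the loop itself does not depend on a browser.

```python
import time
from datetime import datetime


def watch_status(probe, offline_errors=(Exception,), poll_seconds=5,
                 max_polls=None, sleep=time.sleep, report=print):
    """Poll `probe` and report each online/offline transition.

    probe          -- callable returning True while the "online" span is shown
    offline_errors -- exceptions raised by probe that count as "offline"
    max_polls      -- stop after this many checks (None means run forever)
    """
    user_online = False  # last known status
    transitions = []
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            seen = bool(probe())
        except offline_errors:
            seen = False  # a missing or stale element is treated as offline
        if seen != user_online:
            user_online = seen
            stamp = datetime.now().strftime("%H:%M:%S")
            status = "online" if user_online else "offline"
            transitions.append((status, stamp))
            report("{} at : {}".format(status, stamp))
        polls += 1
        sleep(poll_seconds)
    return transitions
```

With Selenium, `probe` could be something like `lambda: driver.find_element(By.XPATH, "//span[@title = 'online']").is_displayed()`, with `offline_errors=(NoSuchElementException, StaleElementReferenceException)`; both exceptions live in `selenium.common.exceptions`.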
Alternatively, just pass a larger polling interval to WebDriverWait in your current code; that is what the poll_frequency argument of the call is for :)
WebDriverWait(driver, 28800, 5) # the value is in seconds
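I don't know where you read that using WebDriverWait with a high number of seconds is bad practice; as you can see in its code, the value is only the maximum amount of time the method is given to run.
I think the advice was in the tone of "using WebDriverWait with a high number of seconds is bad practice, because if the condition isn't met within X seconds it will never be met, and your code will just spin". That is actually the behavior you want :)
I also wouldn't worry about taxing the CPU; these checks are very lightweight and do no harm. For such a long runtime, what would worry me is memory leaks in the browser itself ;)
On optimizing the code: what I would do is cut down the repetition of statements, with the downside of making it a bit less readable. My take: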
user_online = False
while True:
    # we'll be checking for the reverse of the last status of the user
    check_method = EC.visibility_of_element_located if not user_online else EC.invisibility_of_element_located
    # in the conversation page, a span with title online is displayed when the user is online.
    # the web driver will wait 8hrs=28800s for the user status to change;
    # the script will be killed by WebDriverWait if that doesn't happen
    element = WebDriverWait(driver, 28800, 5).until(
        check_method((By.XPATH, "//span[@title = 'online']")))
    # The moment the user changed status
    now = datetime.datetime.now().strftime("%H:%M:%S")
    print("{} at : {}".format('online' if not user_online else 'offline', now))  # if you're using python v3.6 or more, the fstrings are much more convenient for this
    print("************")
    user_online = not user_online  # switch, to wait for the other status in the next cycle
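The comment in the loop above mentions f-strings. As a small illustration, assuming Python 3.6 or newer, the same status line could be built like this (the names `status` and `line` are only for this example):

```python
import datetime

user_online = False  # last known status, as in the loop above
now = datetime.datetime.now().strftime("%H:%M:%S")

# The new status is the reverse of the last known one.
status = 'online' if not user_online else 'offline'
line = f"{status} at : {now}"
print(line)
```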
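Finally, code-wise: the script cannot run "endlessly". Why? Because if the user doesn't change status within 8 hours, WebDriverWait will time out and stop the script. To fix that, wrap the loop body in try/except: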
from selenium.common.exceptions import TimeoutException  # put this in the beginning of the file
while True:
    try:
        ...  # the code from above goes here
    except TimeoutException:
        # the status did not change, repeat the cycle
        pass
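As for writing the output to a file: you may want to read a bit about how to do that; it is quite a simple operation.
Here is a sample: the file is opened for appending (so previous logs are preserved), wrapping the while loop: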
with open("usermonitor.log", "a") as myfile:
    while True:
        # the other code is not repeated for brevity
        # ...
        output = "{} at : {}".format('online' if not user_online else 'offline', now)
        print(output)
        myfile.write(output + "\n")  # this will write (append as the last line) the same text in the file
        # write() does not append newlines by itself - you have to do it yourself
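A variation on the sample above, as a sketch only: a small helper (the name `log_status` and its parameters are hypothetical) that opens the log in append mode for each entry instead of keeping it open around the loop.

```python
from datetime import datetime


def log_status(path, status, when=None):
    """Append one 'status at : HH:MM:SS' line to the file at `path` and return it."""
    when = when or datetime.now()
    line = "{} at : {}".format(status, when.strftime("%H:%M:%S"))
    # Opening in append mode per call keeps earlier entries and closes
    # (and therefore flushes) the file after every write.
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(line + "\n")
    return line
```

Inside the loop this would be called as `print(log_status("usermonitor.log", "online"))`. Since status changes are rare, reopening the file each time costs little, and each entry is written out as soon as it is logged, which matters for a script meant to run for many hours.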
Answer 1 (score: 0)
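One thing I would suggest: your program requires scanning the WhatsApp QR code every time it runs. To avoid that, just replace this line
driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe')
with this
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=C:\\Users\\<username>\\AppData\\Local\\Google\\Chrome\\User Data\\whtsap")
driver = webdriver.Chrome('C:/webdrivers/chromedriver.exe', options=options)
This way you still need to scan the QR code, but only once.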