Seeing different results when retrieving network data with window.performance.getEntries()
Here is the code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument('--disable-gpu')
driver = webdriver.Chrome(chrome_options=options)

urls = ['https://stackoverflow.com/','https://www.google.com/']

for url in urls:
    driver.get(url)
    image_name = url.split(".")[1] + ".png"
    driver.save_screenshot(image_name)
    performance_data = driver.execute_script('return window.performance.getEntries();')
    for single_data in performance_data:
        file = open('Hero.txt', 'a')
        files = open('Heroes.txt', 'a')
        files.write(str(single_data["name"]))
        if "stack" in single_data["name"]:
            file.write(url + "stack_code 1")
            break
        if "stack" not in single_data["name"]:
            file.write(url + "stack_code 0")
            break
If I remove the last if statement, I get all the network call names in Heroes.txt, so the code hits the first if correctly. This is the second if:
if "stack" not in single_data["name"]:
    file.write(url + "stack_code 0")
    break
Without it, I get this in Heroes.txt:
https://stackoverflow.com/https://www.google.com/https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_120x44dp.pnghttps://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.pnghttps://consent.google.com/status?continue=https://www.google.com&pc=s&timestamp=1534340797&gl=GBhttps://ssl.gstatic.com/gb/images/i2_2ec824b0.pngfirst-paintfirst-contentful-painthttps://www.google.com/gen_204?s=webhp&t=aft&atyp=csi&ei=vS50W_aGHKSalwSCgrLQCQ&rt=wsrt.107,aft.119,prt.119https://www.google.com/images/nav_logo242.pnghttps://www.google.com/xjs/_/js/k=xjs.s.en_GB.LsN8oH7x4FY.O/m=sx,sb,cdos,elog,hsm,jsa,r,d,csi/am=YBZhP_4BJP-_YEBRsBWMsMAMCoZN/rt=j/d=1/dg=0/rs=ACT90oHhp5AGyfczYrjBNR_VenselWZSnAhttps://www.google.com/images/branding/product/ico/googleg_lodp.icohttps://www.google.com/xjs/_/js/k=xjs.s.en_GB.LsN8oH7x4FY.O/am=YBZhP_4BJP-_YEBRsBWMsMAMCoZN/rt=j/d=1/exm=sx,sb,cdos,elog,hsm,jsa,r,d,csi/ed=1/dg=0/rs=ACT90oHhp5AGyfczYrjBNR_VenselWZSnA/m=aa,abd,async,bgd,dvl,foot,ipv6,lu,m,mu,sf,sonic,spch,cbin,tnqaT,cbhb,xz7cCd,fEVMic,WgDvvc?xjs=s1https://www.google.com/gen_204?atyp=csi&ei=vS50W_aGHKSalwSCgrLQCQ&s=webhp&t=all&imc=3&imn=3&imp=0&adh=&conn=onchange&ima=1&ime=0&imeb=0&imeo=0&mem=ujhs.10,tjhs.14,jhsl.2330,dm.8&net=dl.10000,ect.4g,rtt.0&sys=hc.4&rt=aft.118,dcl.121,iml.118,ol.137,prt.118,xjs.297,xjsee.297,xjses.222,xjsls.138,wsrt.107,cst.15,dnst.0,rqst.81,rspt.9,sslt.13,rqstt.17,unt.1,cstt.2,dit.228&zx=1534340797851https://www.google.com/textinputassistant/tia.pnghttps://www.google.com/async/bgasy?ei=vS50W_aGHKSalwSCgrLQCQ&yv=3&async=_fmt:jspbhttps://www.google.com/xjs/_/js/k=xjs.s.en_GB.LsN8oH7x4FY.O/am=YBZhP_4BJP-_YEBRsBWMsMAMCoZN/rt=j/d=1/exm=sx,sb,cdos,elog,hsm,jsa,r,d,csi,aa,abd,async,bgd,dvl,foot,ipv6,lu,m,mu,sf,sonic,spch,cbin,tnqaT,cbhb,xz7cCd,fEVMic,WgDvvc/ed=1/dg=0/rs=ACT90oHhp5AGyfczYrjBNR_VenselWZSnA/m=RMhBfe?xjs=s2https://www.gstatic.com/og/_/js/k=og.og2.en_US.gQBLNoMk7Q0.O/rt=j/m=def/exm=in,fot/d=1/ed=1/rs=AA2YrTuPdnXARx6L0IfRJ8krP-HTrx9fsw
https://www.google.com/gen_204?atyp=i&ei=vS50W_aGHKSalwSCgrLQCQ&vet=10ahUKEwi22cnxmO_cAhUkzYUKHQKBDJoQsmQIDQ..s&zx=1534340797919https://adservice.google.com/adsid/google/uihttps://www.google.com/gen_204?atyp=i&ct=&cad=udla=3&ei=vS50W_aGHKSalwSCgrLQCQ&e=12&zx=1534340797933https://www.google.co.uk/domainless/read?igu=1https://www.google.com/js/bg/5KdFGiZjrMqKMsWhJOuJJel3qQCRBLUAy7GSORuI-sg.jshttps://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.yK0z3MKtgaU.O/m=gapi_iframes,googleapis_client,plusone/rt=j/sv=1/d=1/ed=1/rs=AHpOoo-SafOYj4n3budMysbWxppU-lxJeg/cb=gapi.loaded_0https://www.google.com/domainless/write?igu=1&data=&xsrf=ALAmJdGvY5TXvkklyYKZuaWBzGhopICz3A:1534340797490https://www.google.com/gen_204?atyp=i&ct=&cad=udla=3&ei=vS50W_aGHKSalwSCgrLQCQ&pd=105&e=2&zx=1534340798039https://www.google.com/gen_204?atyp=i&ct=&cad=udla=1&ei=vS50W_aGHKSalwSCgrLQCQ&act=p&ps=2&zx=1534340798039
After I add the second if, I get this in Heroes.txt:
https://stackoverflow.com/https://www.google.com/
Any ideas? Am I doing something silly?
Answer 0 (score: 0)
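The code writes different output once you insert the second if condition because one of the two conditions is then always met on the first entry, so the code breaks out of the for loop that iterates over performance_data. If you remove the second if condition, there is no break when 'stack' not in single_data["name"]; you move on to the next iteration of the for loop and keep writing to Heroes.txt.
When you add the second if statement, only 2 values are written to Heroes.txt, because a break from the for loop is guaranteed on the first entry and you are iterating over 2 URLs.
When you remove the second if statement, a break from the for loop is no longer guaranteed, which (usually) gives you more writes to Heroes.txt.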
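This is a logic problem, not a Selenium or Python problem. If you can tell us what output you expect, or what you want written to which file, we can help you structure the code to achieve it.
My guess is that you want something like this (I removed some code that is not relevant to this post):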
from selenium import webdriver

driver = webdriver.Chrome()

urls = ['https://stackoverflow.com/','https://www.google.com/']

for url in urls:
    driver.get(url)
    performance_data = driver.execute_script('return window.performance.getEntries();')
    pass_flag = False
    for single_data in performance_data:
        # As opposed to breaking or continuing, we're just going to pass over
        # to the next bit of code where we write to Heroes.txt after we've
        # written to Hero.txt once per URL
        if pass_flag:
            pass
        else:
            if 'stack' in single_data['name']:
                file = open('Hero.txt', 'a')
                file.write(url + 'stack_code 1')
            if 'stack' not in single_data['name']:
                file = open('Hero.txt', 'a')
                file.write(url + 'stack_code 0')
            pass_flag = True
        # Unlike the above, we're *always* going to write to Heroes.txt
        files = open('Heroes.txt', 'a')
        files.write(str(single_data['name']))
Result of Hero.txt:
https://stackoverflow.com/stack_code 1https://www.google.com/stack_code 0
In the solution above I kept the same structure you used in your program. Here is how I would handle it:
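A minimal sketch of one such restructuring, offered as an illustration rather than as the answerer's own code: it separates the decision from the file writing. The helpers summarize_entries and append_results are hypothetical, and the entry names are made up; in the real script they would come from driver.execute_script('return window.performance.getEntries();'). It also assumes that a URL should count as a match when any entry name contains "stack", whereas the loop above only inspects the first entry.

```python
def summarize_entries(url, entry_names, needle="stack"):
    """Return (hero_line, heroes_text) for one URL.

    Hypothetical helper. Assumption: a URL counts as a match when ANY
    entry name contains `needle`.
    """
    code = 1 if any(needle in name for name in entry_names) else 0
    hero_line = f"{url}stack_code {code}"
    heroes_text = "".join(entry_names)
    return hero_line, heroes_text


def append_results(url, entry_names, hero_path="Hero.txt", heroes_path="Heroes.txt"):
    """Append the summary for one URL; `with` closes each file after writing."""
    hero_line, heroes_text = summarize_entries(url, entry_names)
    with open(hero_path, "a") as hero_file:
        hero_file.write(hero_line)
    with open(heroes_path, "a") as heroes_file:
        heroes_file.write(heroes_text)


# Made-up entry names standing in for single_data["name"] values.
sample = {
    "https://stackoverflow.com/": ["https://stackoverflow.com/", "first-paint"],
    "https://www.google.com/": ["https://www.google.com/", "first-paint"],
}
for url, names in sample.items():
    print(summarize_entries(url, names)[0])
```

Opening each file once per URL inside a with block also avoids reopening Hero.txt and Heroes.txt on every iteration without closing them.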