I have been scraping a website that requires login, and it's not getting all the information I require. So I thought it would be best to go back to the start and show all the HTML it has pulled from the page.
How could I do this? Below is my initial idea, but what am I missing to allow me to debug?
browser.get('http://www.racingpost.com' + link)
tree = html.fromstring(browser.page_source)
print(tree)
Answer 0 (score: 2)
Well, you can print out browser.page_source once again:
print(browser.page_source)
If the browser is closed after getting .page_source, you can store it in a variable and print it out later:
browser.get('http://www.racingpost.com' + link)
# ...
source = browser.page_source
browser.close()
print(source)
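For a long page, it can be easier to write the saved source to a file and open it in an editor than to read it in the console. Here is a minimal sketch of that idea, assuming the source is already held in a string. The helper `save_source` and the file name `page_dump.html` are hypothetical names chosen for illustration, not part of Selenium, and the hard-coded string stands in for browser.page_source so the sketch runs without a browser.

```python
import os
import tempfile


def save_source(source, path):
    """Write the page source to `path` as UTF-8 and return the character count."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)
    return len(source)


# Stand-in for browser.page_source (assumption: the real value is a str).
source = "<html><body><p>Example</p></body></html>"

# Hypothetical dump location in the system temp directory.
dump_path = os.path.join(tempfile.gettempdir(), "page_dump.html")

written = save_source(source, dump_path)
print("Wrote", written, "characters to", dump_path)
```

In the real script you would pass the variable holding browser.page_source instead of the stand-in string.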
Or, you can dump the tree back to a string via .tostring():
print(html.tostring(tree))
It also supports pretty-printing:
print(html.tostring(tree, pretty_print=True))
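One thing to watch for on Python 3: html.tostring() returns bytes by default, so print() shows a b'...' literal with escaped newlines and the pretty-printing is hard to read. Passing encoding='unicode' returns a str instead. Here is a small sketch of the round trip, assuming lxml is installed. The sample markup stands in for browser.page_source.

```python
from lxml import html

# Stand-in for browser.page_source so the sketch runs without a browser.
page_source = "<html><body><div><p>Hello</p></div></body></html>"

# Parse the markup into an element tree, as in the question.
tree = html.fromstring(page_source)

# Default call: returns bytes on Python 3.
raw = html.tostring(tree)

# encoding='unicode' returns a str, so print() shows real line breaks.
pretty = html.tostring(tree, pretty_print=True, encoding="unicode")

print(type(raw).__name__)
print(pretty)
```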