Question

I have been scraping a website that requires login, and it's not getting all the information I require. So I thought it would be best to go back to the start and show all the HTML it has pulled from the page.

How could I do this? Below is my initial idea, but what am I missing to allow me to debug?

browser.get('http://www.racingpost.com' + link)
tree = html.fromstring(browser.page_source)
print(tree)

Answer 1

print(tree) only shows the repr of the root Element object, not the markup. To see the HTML itself, you can print browser.page_source directly:

print(browser.page_source)

If the browser is closed after getting the .page_source, you can store it in a variable and print it later:

browser.get('http://www.racingpost.com' + link)
# ...
source = browser.page_source
browser.close()

print(source)
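
For a long page, terminal output can be hard to read, so it may help to write the source to a file and open it in an editor or browser. This is a minimal sketch, not part of the original answer: the helper name dump_html and the file name debug_page.html are made up for illustration, and the source string stands in for browser.page_source so the snippet runs without a browser.

```python
import tempfile
from pathlib import Path


def dump_html(source, path=None):
    """Write an HTML string to a file and return the file's path.

    Hypothetical helper; by default it writes debug_page.html
    into the system temp directory.
    """
    out = Path(path) if path else Path(tempfile.gettempdir()) / "debug_page.html"
    out.write_text(source, encoding="utf-8")
    return out


# Stand-in for browser.page_source (assumption: the real value is a str).
source = "<html><body><p>Hello</p></body></html>"

saved = dump_html(source)
print("wrote", len(source), "characters to", saved)
```

In the real script you would call dump_html(browser.page_source) right after browser.get(...), before closing the browser.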

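Or, you can serialize the tree back to a string via html.tostring(). Note that on Python 3 it returns bytes unless you pass encoding='unicode':

print(html.tostring(tree, encoding='unicode'))

It also supports pretty-printing:

print(html.tostring(tree, pretty_print=True, encoding='unicode'))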
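
To tie the pieces together, here is a small self-contained sketch of the parse and serialize round trip. The HTML string, the id 'card' and the XPath expression are invented for illustration and stand in for browser.page_source and whatever you are actually trying to extract.

```python
from lxml import html

# Stand-in for browser.page_source so the sketch runs without a browser.
source = (
    "<html><body><div id='card'>"
    "<p>Runner 1</p><p>Runner 2</p>"
    "</div></body></html>"
)

tree = html.fromstring(source)

# Printing the tree only shows the element's repr, not the markup.
print(repr(tree))

# Serialize the whole tree back to text.
# encoding="unicode" makes tostring() return str instead of bytes.
markup = html.tostring(tree, pretty_print=True, encoding="unicode")
print(markup)

# Check whether the data you expect is present in the parsed page.
paragraphs = tree.xpath("//div[@id='card']/p/text()")
print(paragraphs)
```

If the serialized markup is missing the data you expect, the problem is likely upstream of lxml (for example, the login did not succeed or the content is loaded later by JavaScript) rather than in the parsing step.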

Python: show all html in html.fromstring
