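Currently, if I send the headers, the code runs but the printed list is empty. If I don't send the headers, the printed list contains data, but the request fails with 503 Server Error: Service Unavailable.
I don't understand why the list is empty when I use the headers.
Thanks for your help.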
import bs4
import requests
headers = {"User-Agent":"Mozilla/5.0 (Windows NT 7.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0"}
res = requests.get('https://www.amazon.com/dp/B07GMXQN8X', headers=headers)
soup = bs4.BeautifulSoup(res.content,'html.parser')
a = soup.find_all('p')
print(a)
Answer 0 (score: -1)
Use res.text instead of res.content. You can also try changing the parser:
import bs4
import requests
headers = {"User-Agent":"Mozilla/5.0 (Windows NT 7.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0"}
res = requests.get('https://www.amazon.com/dp/B07GMXQN8X', headers=headers)
soup = bs4.BeautifulSoup(res.text, 'html5lib')
# soup = bs4.BeautifulSoup(res.text, 'lxml')
a = soup.find_all('p')
print(a)
Output:
[<p>Sponsored Products are advertisements for products sold by merchants on Amazon.com. When you click on a Sponsored Product ad, you will be taken to an Amazon detail page where you can learn more about the product and purchase it.</p>, <p> To learn more about Amazon Sponsored Products,<span class="a-letter-space"></span> <a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp" rel="noopener" target="_blank" title="click here">click here</a>. </p>, <p class="a-spacing-small a-size-small a-color-secondary">
Find answers in product info, Q&As, reviews
</p>, <p class="a-spacing-base a-spacing-top-base a-color-error askError askBadQuestionError">
Please make sure that you are posting in the form of a question.
</p>, <p>Sponsored Products are advertisements for products sold by merchants on Amazon.com. When you click on a Sponsored Product ad, you will be taken to an Amazon detail page where you can learn more about the product and purchase it.</p>, <p> To learn more about Amazon Sponsored Products,<span class="a-letter-space"></span> <a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp" rel="noopener" target="_blank" title="click here">click here</a>. </p>, <p class="nav_p nav-bold">There's a problem loading this menu right now.</p>, <p class="nav_p"><a class="nav_a" href="/gp/prime/ref=nav_prime_ajax_err/145-9045450-0196650">Learn more about Amazon Prime.</a></p>]
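Since the answer suggests switching parsers, here is a minimal sketch of one way to try several parsers in order and use the first one that is installed. The helper name find_paragraphs and the inline sample markup are hypothetical and only for illustration; html5lib and lxml are separate packages that may not be present, which is why the sketch falls back to the built-in html.parser.

```python
import bs4


def find_paragraphs(markup, parsers=("html5lib", "lxml", "html.parser")):
    """Return (parser_name, <p> tags) using the first parser that is installed.

    Hypothetical helper: the name and the fallback order are assumptions.
    """
    for name in parsers:
        try:
            soup = bs4.BeautifulSoup(markup, name)
        except bs4.FeatureNotFound:
            # This parser library is not installed; try the next one.
            continue
        return name, soup.find_all("p")
    raise RuntimeError("none of the requested parsers is available")


# Inline sample markup so the sketch runs without a network request.
sample = "<html><body><p>first</p><p>second</p></body></html>"
parser_used, paragraphs = find_paragraphs(sample)
texts = [p.get_text() for p in paragraphs]
print(parser_used, texts)
```

With a live response, it may help to call res.raise_for_status() before parsing, so that a 503 shows up as an exception instead of an unexpectedly empty list, and then pass res.text to the helper.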