How do I scrape this page with Beautiful Soup?

Time: 2019-07-03 18:02:49

Tags: python web web-scraping beautifulsoup

I am trying to get all of the associations from this word association website, but I don't know which path or selector to use.

https://wordassociations.net/en/words-associated-with/hello?button=Search

from urllib.request import urlopen
from bs4 import BeautifulSoup

url = 'https://wordassociations.net/en/words-associated-with/hello?button=Search'
page = urlopen(url)  # fetch the raw HTML of the results page
bs = BeautifulSoup(page, "lxml")  # parse it with the lxml parser (the lxml package must be installed)

1 answer:

Answer 0 (score: 0)

You can search the results section by section. This approach also scrapes every page of results for each association.
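A minimal sketch of the section-by-section idea, not a tested scraper for this site. The selectors (`div.section`, `h3`, `li`) and the `start` pagination parameter are assumptions, not taken from the page itself; inspect the live HTML in your browser's developer tools and adjust them. The helper names `parse_associations` and `scrape_all_pages` are hypothetical.

```python
from collections import defaultdict
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Assumed selectors: verify each one against the real page before relying on them.
SECTION_SELECTOR = "div.section"  # assumed: one container per group of associations
HEADING_SELECTOR = "h3"           # assumed: the group's title (e.g. nouns, verbs)
WORD_SELECTOR = "li"              # assumed: one list item per associated word


def parse_associations(html):
    """Return {section heading: [words]} for a single page of results."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for section in soup.select(SECTION_SELECTOR):
        heading = section.select_one(HEADING_SELECTOR)
        name = heading.get_text(strip=True) if heading else "unknown"
        result[name] = [li.get_text(strip=True) for li in section.select(WORD_SELECTOR)]
    return result


def scrape_all_pages(base_url, page_size=100, max_pages=20):
    """Request successive pages until one contains no words.

    Assumes pagination through a `start` query parameter, which is a guess.
    """
    merged = defaultdict(list)
    for page in range(max_pages):
        with urlopen(f"{base_url}?start={page * page_size}") as resp:
            found = parse_associations(resp.read())
        if not any(found.values()):
            break  # empty page: assume there are no further results
        for name, words in found.items():
            merged[name].extend(words)
    return dict(merged)
```

If the assumptions hold, calling `scrape_all_pages('https://wordassociations.net/en/words-associated-with/hello')` would return a dictionary keyed by section heading. Keeping the parsing in `parse_associations` separate from the fetching makes it easy to check the selectors against a saved copy of the page first.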
