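I am trying to fetch data from a site, but the response only shows an error.
I tried changing the URL to http, which resulted in a 405 error, and I tried changing data = json.dumps(data) to data = data, but neither worked.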
import requests
import json
from bs4 import BeautifulSoup
request_url = 'https://www.kinds.or.kr/news/newsResult.do'
data = {"jsonSearchParam": {"indexName": "news", "searchKey": "sky", "searchKeys": [{}], "byLine": "", "searchFilterType": "1", "searchScopeType": "1", "mainTodayPersonYn": "", "startDate": "2019-05-06", "endDate": "2019-08-06", "newsIds": [], "categoryCodes": [], "incidentCodes": [], "networkNodeType": "", "topicOrigin": ""}, "index-name": "news", "N": "", "search-keyword": "sky", "search-index-type": "news", "dict-type": "texanomy", "dict-concat": "OR"}
response = requests.post(request_url, data=json.dumps(data))
html = response.text
soup = BeautifulSoup(html, 'html.parser')
flist = soup.find_all('span')
print(response)
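I expect to get a proper response.
Answer 0 (score: 0)
Your URL appears to be incorrect. According to the Firefox developer tools, the correct URL for the search is 'https://www.kinds.or.kr/v2/news/search.do'. The "jsonSearchParam" parameter must be a JSON string, so we call json.dumps() on it: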
import json
import requests
from bs4 import BeautifulSoup
# request_url = 'https://www.kinds.or.kr/news/newsResult.do'
request_url = 'https://www.kinds.or.kr/v2/news/search.do' # <-- correct URL
d = {"indexName": "news", "searchKey": "sky", "searchKeys": [{}], "byLine": "", "searchFilterType": "1", "searchScopeType": "1", "mainTodayPersonYn": "", "startDate": "2019-05-06", "endDate": "2019-08-06", "newsIds": [], "categoryCodes": [], "incidentCodes": [], "networkNodeType": "", "topicOrigin": ""}
data = {"jsonSearchParam": json.dumps(d), "index-name": "news", "N": "", "search-keyword": "sky", "search-index-type": "news", "dict-type": "texanomy", "dict-concat": "+OR+"}
response = requests.post(request_url, data=data)
print(response)
soup = BeautifulSoup(response.text, 'lxml')
flist = soup.find_all('span')
print(flist)
Prints:
<Response [200]>
[<span class="sr-only">Toggle navigation</span>, <span class="icon-bar"></span>, <span class="icon-bar"></span>, <span class="icon-bar"></span>, <span aria-hidden="true">×</span>, <span class="input-group-addon">
<i class="fal fa-envelope"></i>
</span>, <span class="input-group-addon">
...and so on.
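
To see why the placement of json.dumps() matters, here is a minimal offline sketch that compares the two request bodies. It uses the standard library's urlencode as a stand-in for the form encoding that requests applies to a dict passed as data=; that equivalence is an assumption made for illustration, and the `search` dict is a shortened, hypothetical version of the real parameters.

```python
import json
from urllib.parse import urlencode, parse_qs

# Shortened, hypothetical stand-in for the nested search parameters.
search = {"indexName": "news", "searchKey": "sky", "newsIds": []}

# Approach from the question: the whole payload is serialized into a single
# JSON string, so the body contains no form fields at all.
whole_body = json.dumps({"jsonSearchParam": search, "search-keyword": "sky"})

# Approach from the answer: only the nested value is JSON-encoded, and the
# outer dict is form-encoded into key=value pairs.
form_body = urlencode({"jsonSearchParam": json.dumps(search), "search-keyword": "sky"})

# Decode the form body the way a server-side form parser would.
fields = parse_qs(form_body)

print(whole_body)
print(form_body)
print(json.loads(fields["jsonSearchParam"][0]) == search)
```

In the second body the server can read "jsonSearchParam" as an ordinary form field and parse its value as JSON, which appears to be what this endpoint expects.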