Using Python, I have this code:
import requests
from bs4 import BeautifulSoup
import json

links = [
    'https://www.ncaa.com/scoreboard/volleyball-women/d1/2018/09/17/all-conf',
    'https://www.ncaa.com/scoreboard/volleyball-women/d1/2018/12/15/all-conf'
]

data = []
for link in links:
    req_data = requests.get(link)
    soup = BeautifulSoup(req_data.text, 'html.parser')
    # Each game is wrapped in an <a> tag; collect the team names inside it.
    for a in soup.find_all('a'):
        values = [span.text for span in a.find_all('span', {'class': 'gamePod-game-team-name'})]
        if len(values) > 0:
            data.append(values)

print(*data, sep="\n")

# Save the collected team names to a JSON file.
with open('test.json', 'w') as f:
    json.dump(data, f)
which gives me this result:
['James Madison', 'VCU']
['Nebraska', 'Stanford']
I would also like to scrape currentDate from the same pages. It lives on the page inside this element:
<script type="application/json" data-drupal-selector="drupal-settings-json">
I am stuck on scraping currentDate correctly. This is what I have so far:
import requests
from bs4 import BeautifulSoup
import re
import json
res = requests.get('https://www.ncaa.com/scoreboard/volleyball-women/d1/2018/09/17')
soup = BeautifulSoup(res.text, 'html.parser')
script = soup.find_all("script", type="application/json", path="currentDate")
Ideally, I would like to put the currentDate result alongside my data results. Any suggestions are appreciated.
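
For reference, here is a rough sketch of the direction I think this could take, not a confirmed solution. It assumes the script tag can be matched on its data-drupal-selector attribute and that its body is plain JSON. Since I do not know where currentDate sits inside that JSON, a hypothetical helper find_key searches the nested structure for it. The helper names scrape_page and find_key, the sample HTML, the "scoreboard" wrapper key, and the date format are all made up for illustration.

```python
import json
from bs4 import BeautifulSoup


def find_key(obj, key):
    """Return the first value stored under `key` anywhere in nested JSON data, else None."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def scrape_page(html):
    """Return (current_date, games) for the HTML of one scoreboard page."""
    soup = BeautifulSoup(html, 'html.parser')

    # The attribute name contains hyphens, so it is passed through attrs
    # rather than as a keyword argument.
    script = soup.find('script', attrs={'data-drupal-selector': 'drupal-settings-json'})
    current_date = None
    if script is not None and script.string:
        settings = json.loads(script.string)
        current_date = find_key(settings, 'currentDate')

    # Same team-name extraction as in the original loop.
    games = []
    for a in soup.find_all('a'):
        names = [span.text for span in a.find_all('span', {'class': 'gamePod-game-team-name'})]
        if names:
            games.append(names)
    return current_date, games


# Made-up HTML standing in for a real page; the JSON layout is an assumption.
sample_html = """
<script type="application/json" data-drupal-selector="drupal-settings-json">
{"scoreboard": {"currentDate": "09-17-2018"}}
</script>
<a href="/game/1"><span class="gamePod-game-team-name">James Madison</span>
<span class="gamePod-game-team-name">VCU</span></a>
"""

current_date, games = scrape_page(sample_html)
# Put the date in front of each pair of team names.
data = [[current_date] + names for names in games]
print(data)
```

Inside the loop over links, the same function could presumably be called as scrape_page(requests.get(link).text), with each page's rows appended to data before dumping to JSON.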