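Please complete the following task: scrape 10 pages (the URL of the last (10th) page will be https://www.indeed.com/jobs?q=data+scientist&l=CO&start=90) and build a pandas DataFrame containing the following information: job title, company name, location, job description summary, and indicator columns (True/False values) for the keywords Python, SQL, AWS, RESTFUL, Machine Learning, Deep Learning, Text Mining, NLP, SAS, Tableau, Sagemaker, TensorFlow, Spark.
Here is my code: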
import pandas as pd
import requests
from bs4 import BeautifulSoup as bsoup

base_url = 'https://www.indeed.com/jobs?q=data+scientist&l=CO'
jobTitle = []
companyName = []
companyLocation = []
summary = []

# start=0, 10, ..., 90 covers all 10 pages; range(10, 100, 10) would skip the first page
for start in range(0, 100, 10):
    url = base_url + '&start=' + str(start)
    page = requests.get(url)
    soup = bsoup(page.text, 'html.parser')
    cards = soup.find_all('div', 'slider_container')
    for card in cards:
        # Look up each field once per card; a missing field becomes None
        title = card.find('span', title=True)
        cn = card.find('span', 'companyName')
        cl = card.find('div', 'companyLocation')
        s = card.find('div', 'job-snippet')
        # Append exactly one value per card to every list so all columns keep the same length
        jobTitle.append(title['title'] if title else None)
        companyName.append(cn.text if cn else None)
        companyLocation.append(cl.text if cl else None)
        summary.append(s.text if s else None)

DATA = {'Job': jobTitle, 'Company': companyName, 'Location': companyLocation, 'Summary': summary}
df = pd.DataFrame(DATA)
df
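I have already extracted the job title, company name, location, and job description summary, but I don't know how to create the indicator columns.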
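
For reference, here is a minimal sketch of one way the indicator columns might be built. It assumes the keywords should be searched for in the Summary column, case-insensitively and as whole words (so that "SAS" does not match inside a word like "passes"). The helper name add_keyword_indicators and the small sample DataFrame are hypothetical and only there for illustration; the same call should work on the df built above.

```python
import re
import pandas as pd

# Keywords from the task description
KEYWORDS = ['Python', 'SQL', 'AWS', 'RESTFUL', 'Machine Learning', 'Deep Learning',
            'Text Mining', 'NLP', 'SAS', 'Tableau', 'Sagemaker', 'TensorFlow', 'Spark']

def add_keyword_indicators(df, text_col='Summary', keywords=KEYWORDS):
    """Return a copy of df with one True/False column per keyword (hypothetical helper)."""
    out = df.copy()
    # Treat missing summaries as empty text so every indicator is a plain boolean
    text = out[text_col].fillna('')
    for kw in keywords:
        # \b word boundaries keep short keywords from matching inside longer words
        pattern = r'\b' + re.escape(kw) + r'\b'
        out[kw] = text.str.contains(pattern, case=False, regex=True)
    return out

# Made-up sample rows standing in for the scraped data
sample = pd.DataFrame({'Summary': ['Experience with Python, SQL and AWS.',
                                   'Build NLP and deep learning models in TensorFlow.',
                                   'Passes data between teams.',
                                   None]})
result = add_keyword_indicators(sample)
print(result[['Python', 'SQL', 'NLP', 'Deep Learning', 'SAS']])
```

With the scraped data this would be called as df = add_keyword_indicators(df). Since the summary on the results page is only a short snippet, a keyword that appears only in the full job description would still come out as False under this assumption.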