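I am trying to write a Python script that uses HTTP POST requests to upload this pdf file to a web page. I have tried the following, but unfortunately the script fails to upload the file.
This is the log-in link. The username is SmthShift_123 and the password is 7/B!yzRd8wuK!N2, for your consideration. Now go to this page and click the last tab, Anhang, where you will find the upload option.
To help you visualize it, this is what the page looks like.
This is my attempt so far: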
import requests
from bs4 import BeautifulSoup

login_url = 'https://jobs.commerzbank.com/index.php?ac=login'
application_link = 'https://jobs.commerzbank.com/index.php?ac=application&jobad_id=30670'
target_link = 'https://jobs.commerzbank.com/index.php?ac=application&page=6'
upload_link = 'https://jobs.commerzbank.com/inc/candidate_attachments.php'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'
    res = s.get(login_url)
    sauce = BeautifulSoup(res.text,"lxml")
    elem = {i['name']:i.get('value','') for i in sauce.select('input[name]')}
    elem['username'] = 'SmthShift_123'
    elem['password'] = '7/B!yzRd8wuK!N2'
    s.post(login_url,data=elem)
    s.get(application_link)
    resp = s.get(target_link)
    soup = BeautifulSoup(resp.text,"lxml")
    payload = {i['name']:i.get('value','') for i in soup.select('input[name]')}
    payload['form-control'] = 'Anschreiben'
    payload['upload'] = 'Datei hochladen'
    payload['save'] = ''
    files = {
        'searchButton': open('CV.pdf','rb')
    }
    s.post(upload_link,files=files,data=payload)
When I run the script above, it neither saves the file nor raises any error.
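One way to see what a POST like the one above actually sends is to build the request without sending it. The sketch below reuses the URL and field names from the script; the in-memory bytes standing in for CV.pdf are an assumption, made so that it runs without the file and without touching the network.

```python
import requests

# Hypothetical stand-in for the real PDF so the sketch runs without a file on disk.
fake_pdf = b"%PDF-1.4 placeholder"

# Same URL and field names as the script above; nothing is sent over the network.
upload_link = "https://jobs.commerzbank.com/inc/candidate_attachments.php"
payload = {"form-control": "Anschreiben", "upload": "Datei hochladen", "save": ""}
files = {"searchButton": ("CV.pdf", fake_pdf, "application/pdf")}

# Request(...).prepare() builds the headers and body that a POST would carry.
prepared = requests.Request("POST", upload_link, data=payload, files=files).prepare()

print(prepared.headers["Content-Type"])  # multipart/form-data; boundary=...
print(prepared.body.decode("latin-1"))   # each part lists its form field name
```

Comparing the printed field names with the request the browser makes in its network tab during a manual upload is a reasonable first check when an upload fails silently.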
I also tried it this way (using Selenium only for the upload), but that script cannot select and upload the file either:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

# Same session setup and login as in the first script, then:
s.post(login_url,data=elem)
s.get(application_link)
resp = s.get(target_link)

driver = webdriver.Chrome()
driver.get(resp.url)
driver.delete_all_cookies()
for cookie in s.cookies.items():
    driver.add_cookie({"name": cookie[0], "value": cookie[1]})
driver.get(resp.url)

select = Select(WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "select#upload_category"))))
select.select_by_visible_text("Lebenslauf")
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "input#upload_file"))).send_keys("C://Users/WCS/Desktop/CV.pdf")
How can I select and upload the pdf file using requests?
Answer 0 (score: 1)
I was able to upload it using Selenium. This site is tricky: it has a hidden input, which is shown only when you hover over the upload button.
Try this:
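A minimal sketch of that idea, not the answerer's own code: rather than hovering, it forces the hidden file input visible with JavaScript and then passes it the file path. The selector input#upload_file is taken from the question's script and is assumed to still match the page; the function name upload_via_hidden_input is hypothetical.

```python
def upload_via_hidden_input(driver, file_path, selector="input#upload_file"):
    """Reveal a hidden <input type="file"> and hand it a local file path.

    `driver` is expected to be a Selenium WebDriver that has already
    loaded the upload page; the default selector is an assumption.
    """
    # "css selector" is the string value behind Selenium's By.CSS_SELECTOR.
    file_input = driver.find_element("css selector", selector)
    # Force the element visible so Selenium treats it as interactable.
    driver.execute_script(
        "arguments[0].style.display = 'block';"
        "arguments[0].style.visibility = 'visible';",
        file_input,
    )
    # For file inputs, send_keys selects the file instead of typing text.
    file_input.send_keys(file_path)
    return file_input
```

With a real browser this would be called as upload_via_hidden_input(driver, "C:/Users/WCS/Desktop/CV.pdf") once the Anhang tab has loaded; Selenium generally expects an absolute path here.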
Hope this works for you too. Good luck!
Answer 1 (score: 1)
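Solution 1
The site's frontend.min.js file has an attachFfwAjaxUpload() function, which is used for uploading attachment files.
find(name, attrs, recursive, text, **kwargs) matches and returns the first object.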
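To illustrate the "first object" behaviour of find(), here is a small sketch on made-up markup. The HTML string and the token values are hypothetical and only loosely modelled on a form with a hidden token field; html.parser is used so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on a form with a hidden token field.
html = """
<form action="index.php">
  <input type="hidden" name="application_token" value="abc123">
  <input type="hidden" name="application_token" value="second">
</form>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None when nothing matches.
form = soup.find("form", attrs={"action": "index.php"})
token_input = form.find("input", attrs={"name": "application_token"})

print(token_input.get("value", ""))              # abc123 (first match only)
print(soup.find("form", attrs={"action": "x"}))  # None (no such form)
```

Because find() returns None on a miss, chaining .find(...).get(...) directly raises AttributeError if the page did not load as expected, for instance when the login failed.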
For example, for this site:

import requests
from bs4 import BeautifulSoup

login_url = 'https://jobs.commerzbank.com/index.php?ac=login'
application_link = 'https://jobs.commerzbank.com/index.php?ac=application&jobad_id=30670'
target_link = 'https://jobs.commerzbank.com/index.php?ac=application&page=6'
upload_link = 'https://jobs.commerzbank.com/inc/candidate_attachments.php'

with requests.Session() as sessionObj:
    sessionObj.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'
    res = sessionObj.get(login_url)
    sauce = BeautifulSoup(res.text,"lxml")
    elem = {i['name']:i.get('value','') for i in sauce.select('input[name]')}
    elem['username'] = 'SmthShift_123'
    elem['password'] = '7/B!yzRd8wuK!N2'
    sessionObj.post(login_url,data=elem)
    sessionObj.get(application_link)
    resp = sessionObj.get(target_link)
    soup = BeautifulSoup(resp.text,"lxml")

    # get the attachment form tag object
    form = soup.find("form", attrs={'action':'index.php'})
    payload = dict()
    # set the upload category: there are four options,
    # with values 2, 1, 4 and 12; pick one of them
    payload['category'] = '12'
    payload['application_token'] = form.find('input',
        attrs={'name':'application_token'}).get('value','')
    payload['action'] = 'upload'
    # the attachFfwAjaxUpload() function that uploads the attachment is in
    # frontend.min.js (browser Sources tab), between lines 38878 and 38903
    print(payload)

    with open('CV.pdf', 'rb') as f:
        file = {"attachment": f}
        attachment_response = sessionObj.post(upload_link, files=file, data=payload)

    # print the POST response status code and body
    print(attachment_response.status_code)
    print(attachment_response.text)

Solution 2 - upload the file via Selenium
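If the Selenium route is combined with a requests login like the one above, the session cookies have to be copied into the browser before the upload page is opened. The helper below is a sketch of that step only; to_selenium_cookies is a hypothetical name, and the PHPSESSID cookie is a made-up stand-in for whatever the login response really sets.

```python
import requests

def to_selenium_cookies(session):
    """Convert a requests session's cookie jar into dicts for driver.add_cookie()."""
    cookies = []
    # Iterating the jar yields http.cookiejar.Cookie objects.
    for c in session.cookies:
        cookies.append({
            "name": c.name,
            "value": c.value,
            "path": c.path,
            "secure": c.secure,
        })
    return cookies

session = requests.Session()
# Hypothetical cookie standing in for whatever the login response sets.
session.cookies.set("PHPSESSID", "abc123", domain="jobs.commerzbank.com", path="/")

print(to_selenium_cookies(session))
```

Each dict can then be passed to driver.add_cookie(...) after the driver has first loaded a page on the same domain, as the question's script already does. The domain key is left out on purpose so the browser assigns the current one.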