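I have an HTML page that I retrieved with a Python script, and I need to create a new HTML document from some of its existing HTML elements. Is there a way to do this? I have researched it and found ways to add to an existing document, but not a way to create a new HTML document/page from one. Below is the code snippet; the part in blue is what I want to turn into a new HTML page.
Any help would be greatly appreciated.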
Answer 0 (score: 0)
Below is the code I used to create the new HTML document.
import datetime
import os
import urllib.request

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
# Selenium 4 style: the driver path is passed through Service, and options= replaces chrome_options=
driver = webdriver.Chrome(service=Service(r'C:\chromedriver_win32\chromedriver.exe'), options=options)
driver.implicitly_wait(10)
driver.get("https://mylink")

# Find the web links (HTML docs) that I need to edit and turn into new HTML docs
elems = driver.find_elements(By.CSS_SELECTOR, "[href*=PublicInfoServlet]")
for elem in elems:  # iterate through all the HTML web links found on the main webpage
    abc = elem.get_attribute("href")
    print(abc)
    page = urllib.request.urlopen(abc)
    soup = BeautifulSoup(page, 'html.parser')
    # Identify the HTML tag that needs to be used to create the required HTML document
    a = soup.find("div", {"id": "SpanPrint"})
    # The text between the first '=' and the next '&' in the link becomes the file name
    efg = abc.split("=", 1)[1]
    hig = efg.split('&', 1)[0]
    # Change path to reflect file location
    filename = 'mypath' + str(hig) + '.html'
    with open(filename, 'w') as f:
        f.write(str(a))
    # Timestamp in the form YYYY-MM-DDHHMMSS
    e = datetime.datetime.now().strftime('%Y-%m-%d%H%M%S')
    # Save the new doc at the required location, with the timestamp appended to its name
    os.rename(filename, 'mypath' + str(hig) + e + '.html')
driver.quit()
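The code above writes only the extracted div to disk, so the saved file is an HTML fragment rather than a complete page. If a standalone document is wanted, one option is to wrap the fragment in a minimal page skeleton before writing it. The following is only a sketch using the standard library: `build_document` and `output_name` are hypothetical helper names, and the link, title, and fragment are made-up values for illustration. Note that `parse_qsl` also percent-decodes the value, which the plain `split` calls above do not.

```python
import datetime
import html
from urllib.parse import parse_qsl, urlsplit


def build_document(fragment, title="Extracted content"):
    """Wrap an HTML fragment (for example str(a) from the loop above) in a full page."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{fragment}\n"
        "</body>\n"
        "</html>\n"
    )


def output_name(url, now):
    """Return '<first query value><YYYY-MM-DDHHMMSS>.html' for a link."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    if not pairs:
        raise ValueError(f"no query string in {url!r}")
    first_value = pairs[0][1]
    return first_value + now.strftime("%Y-%m-%d%H%M%S") + ".html"


# Hypothetical link and fragment, for illustration only
link = "https://example.com/PublicInfoServlet?id=12345&lang=en"
fragment = '<div id="SpanPrint"><p>Hello</p></div>'

name = output_name(link, datetime.datetime(2020, 1, 2, 3, 4, 5))
document = build_document(fragment, title="Record 12345")
print(name)
print(document)
```

In the loop above, `document` would be written in place of `str(a)`, for example with `open(name, 'w', encoding='utf-8')`, and passing `datetime.datetime.now()` to `output_name` would remove the need for the separate rename step.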