Our website has a link that points to a zip file. The corresponding line in the HTML file looks like this:
<p><a href="Data/WillCounty_AddressPoint.zip">Address Points</a> (updated weekly)</p>
The zip file will soon be renamed to include the current date, so that it looks like this:
WillCounty_AddressPoint_02212018.zip
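How can I change the corresponding line in the HTML?
Using this answer, I put together a script. It runs without errors, but it does not change anything in the HTML file.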
import bs4
from bs4 import BeautifulSoup
import re
import time
data = r'\\gisfile\GISstaff\Jared\data.html' #html file location
current_time = time.strftime("_%m%d%Y") #date
#load the file
with open(data) as inf:
    txt = inf.read()
soup = bs4.BeautifulSoup(txt)
#create new link
new_link = soup.new_tag('link', href="Data/WillCounty_AddressPoint_%m%d%Y.zip")
#insert it into the document
soup.head.append(new_link)
#save the file again
with open(data, "w") as outf:
    outf.write(str(soup))
Answer 0 (score: 0)
Here is how to replace the href attribute using BeautifulSoup.
from bs4 import BeautifulSoup
import time
data = r'data.html' #html file location
#load the file
current_time = time.strftime("_%m%d%Y")
with open(data) as inf:
    txt = inf.read()
soup = BeautifulSoup(txt, 'html.parser')
a = soup.find('a')
a['href'] = ("WillCounty_AddressPoint%s.zip" % current_time)
print (soup)
#save the file again
with open(data, "w") as outf:
    outf.write(str(soup))
Output:
<p><a href="WillCounty_AddressPoint_02212018.zip">Address Points</a> (updated weekly)</p>
The same result is written back to the file.
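Note that the href written above no longer carries the Data/ directory from the original link. If that prefix should be kept, the new href can be derived from the old one instead of being hard-coded. The following is only a minimal sketch under that assumption: `dated_href` is a hypothetical helper name, not part of BeautifulSoup, and it also assumes that any earlier stamp has the form of an underscore followed by eight digits, so the script can be rerun each week without stacking dates.

```python
import posixpath
import re
import time


def dated_href(old_href, stamp=None):
    """Return old_href with a _MMDDYYYY stamp before the extension.

    Hypothetical helper: keeps the directory part (e.g. "Data/") and
    replaces an existing 8-digit stamp instead of appending a second one.
    """
    if stamp is None:
        stamp = time.strftime("%m%d%Y")  # e.g. "02212018"
    directory, filename = posixpath.split(old_href)
    stem, ext = posixpath.splitext(filename)
    # Drop a stamp left over from a previous run (assumed format: _DDDDDDDD).
    stem = re.sub(r"_\d{8}$", "", stem)
    return posixpath.join(directory, "%s_%s%s" % (stem, stamp, ext))


print(dated_href("Data/WillCounty_AddressPoint.zip", "02212018"))
# Data/WillCounty_AddressPoint_02212018.zip
```

With BeautifulSoup this would be used as `a['href'] = dated_href(a['href'])`.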
Updated to use the data from the file you provided.
from bs4 import BeautifulSoup
import time
data = r'data.html' #html file location
#load the file
current_time = time.strftime("_%m%d%Y")
with open(data) as inf:
    txt = inf.read()
soup = BeautifulSoup(txt, 'html.parser')
# Find the <a> element you want to change by matching its text and selecting the parent.
a = soup.find(text="Address Points").parent
a['href'] = ("WillCounty_AddressPoint%s.zip" % current_time)
print (soup)
#save the file again
with open(data, "w") as outf:
    outf.write(str(soup))
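Note, however, that this removes blank lines from the file; apart from that, your HTML is preserved.
Use a diff tool to see the differences between the original file and the modified one: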
diff data\ \(copy\).html data.html
77c77
< <p><a href="Data/WillCounty_AddressPoint.zip">Address Points</a> (updated weekly)</p>
---
> <p><a href="WillCounty_AddressPoint_02222018.zip">Address Points</a> (updated weekly)</p>
116,120d115
<
<
<
<
<
154d148
<
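If the dropped blank lines matter, one alternative is to skip re-serializing the document and patch the raw text with a regular expression instead. This is a different technique from the BeautifulSoup approach above, shown only as a sketch: `update_href` and `update_file` are hypothetical names, the pattern assumes the link is written exactly as `href="...WillCounty_AddressPoint[_MMDDYYYY].zip"` with double quotes, and Python 3 is assumed for the `newline` argument.

```python
import re
import time

# Matches the href up to the base name, an optional old 8-digit stamp,
# and the closing .zip" so that only the stamp is inserted or replaced.
HREF_PATTERN = re.compile(
    r'(href="(?:[^"]*/)?WillCounty_AddressPoint)(?:_\d{8})?(\.zip")'
)


def update_href(html_text, stamp):
    """Return html_text with the stamp inserted; everything else is untouched."""
    return HREF_PATTERN.sub(
        lambda m: "%s_%s%s" % (m.group(1), stamp, m.group(2)), html_text
    )


def update_file(path):
    """Rewrite the file in place, keeping blank lines and line endings."""
    with open(path, newline="") as inf:
        txt = inf.read()
    with open(path, "w", newline="") as outf:
        outf.write(update_href(txt, time.strftime("%m%d%Y")))


sample = (
    '<p><a href="Data/WillCounty_AddressPoint.zip">Address Points</a>'
    " (updated weekly)</p>\n\n\n<p>Other</p>\n"
)
print(update_href(sample, "02212018"))
```

Because only the matched href is rewritten, a diff of the file before and after should show a single changed line. The trade-off is that a regular expression is more brittle than a parser if the markup of the link changes.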