I have the following Python script, in which I want to make some changes to the image src paths and then write them back to the same file:
#!/usr/local/bin/python
# -*- coding: utf-8 -*-
import os
import sys
import re
from operator import itemgetter
import csv
from BeautifulSoup import BeautifulSoup as BSHTML

azure = ['cdn.core.windows.net', 'blob.core.windows.net']
walk_dir = ["www", "sites"]
image_paths = []

for x in walk_dir:
    for root, dirs, files in os.walk(x, topdown=False):
        for filename in files:
            file_path = os.path.join(root, filename)
            with open(file_path, 'rb') as f:
                f_content = f.read()
                if filename.endswith('.html'):
                    soup = BSHTML(f_content)
                    images = soup.findAll('img')
                    print(filename, file_path)
                    for image in images:
                        try:
                            image_src = image['src'].split('?')
                            print(image_src)
                            image_paths.append(image_src[0])
                        except:
                            pass
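What I am not sure about is the best way to proceed: should I take image_src at this point and rewrite it, then save the file once at the end, after all of its images have been updated?
Any suggestions are appreciated.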
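
To make the idea concrete, here is a minimal sketch of the "rewrite every src, then save the file once" approach. It is only an illustration under some assumptions: it targets Python 3 and UTF-8 files, it uses a regular expression from the standard library in place of the BeautifulSoup parse so that it runs without extra packages, and the names IMG_SRC, strip_query, rewrite_img_srcs and update_file are hypothetical helpers, not part of the script above. The rewrite rule shown (dropping the query string) simply mirrors the split('?') in the script; the real rule would go in its place.

```python
import re

# Matches the src attribute of an <img> tag. Group 1 is everything up to the
# opening quote, group 2 is the quote character, group 3 is the URL itself.
# A regex is a simplification: it will not cope with every form of HTML.
IMG_SRC = re.compile(r'(<img\b[^>]*?\ssrc\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def strip_query(src):
    """Hypothetical rewrite rule: drop everything from the first '?' on."""
    return src.split('?')[0]


def rewrite_img_srcs(html, rewrite=strip_query):
    """Return html with every <img> src passed through rewrite()."""
    def replace(match):
        prefix, quote, src = match.groups()
        return prefix + quote + rewrite(src) + quote
    return IMG_SRC.sub(replace, html)


def update_file(file_path, rewrite=strip_query, encoding='utf-8'):
    """Read a file, rewrite its image paths, and save it once if it changed.

    Returns True when the file was rewritten, False when it was left alone.
    """
    # newline='' keeps the file's original line endings untouched.
    with open(file_path, 'r', encoding=encoding, newline='') as f:
        original = f.read()
    updated = rewrite_img_srcs(original, rewrite)
    if updated == original:
        return False
    with open(file_path, 'w', encoding=encoding, newline='') as f:
        f.write(updated)
    return True
```

Inside the os.walk loop, update_file(file_path) would be called for each file ending in '.html'. Reading the whole file, building the new content in memory, and writing only when something changed means each file is saved at most once and unaffected files are never touched. If the BeautifulSoup route is kept instead, the equivalent would presumably be assigning the new value to image['src'] and writing str(soup) back, with the caveat that this re-serializes the whole document rather than just the changed attributes.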