我需要修改html文件的一部分。我设法用BeautifulSoup这样做:
def ineedhelp(path):
from bs4 import BeautifulSoup
#Retrieve htmlFiles
pages = find_files(path, '.html') #as a list
for page in pages:
stream = open(page, "rw")
soup = BeautifulSoup(stream, "lxml")
formsoup = soup.find('form', attrs={"method":u"post"})
if formsoup is not None:
action = formsoup['action']
phpScript = init_php(path, page, action) #Function that return URL as a string
##### HERE I TRY TO DO THIS ####
Something like: -formsoup['action'] = phpScript
-Save the result
stream.close()
我做的方式是我发现的唯一可行的方法,如果我尝试:
for form in soup.find('form', attrs={whatever})
我有一条错误消息,例如“无法对象无法迭代”。
答案 0 :(得分:0)
您正在迭代标记而不是数组。 find
会返回一个标记,findAll
会返回一个标记数组。
您可以更改:
for form in soup.find('form', attrs={whatever})
为:
form = soup.find('form', attrs={whatever})
或者
您可以更改:
for form in soup.find('form', attrs={whatever})
为:
form form in soup.findAll('form', attrs={whatever})
取决于有多少表格。