I want to pass a Beautiful Soup attribute name (for example class_, href, id) through a variable, so that I can use it in the following function:
Script:
from bs4 import BeautifulSoup
data='<p class="story">xxx </p> <p id="2">yyy</p> <p class="story"> zzz</p>'
def removeAttrib(data, **kwarg):
    soup = BeautifulSoup(data, "html.parser")
    for x in soup.findAll(tag, kwargs):
        del x[???] # should be an equivalent of: del x["class"]
kwargs= {"class":"story"}
removeAttrib(data,"p",**kwargs )
print(soup)
Expected result:
<p>xxx </p> <p id="2">yyy</p> <p> zzz</p>
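MYGz solved the first problem by using a dictionary as the function argument (tag, argdict). I then found **kwargs in this question, which passes the dictionary keys and values.
But I can't find a way to write the equivalent of del x["class"]. How do I pass the "class" key? I tried ckey=kwargs.keys() and then del x[ckey], but it does not work.
PS1: Any idea why removeAttrib(data, "p", {"class": "story"}) does not work?
PS2: This is a different topic from this question (it is not a duplicate).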
Answer 0 (score: 1)
You can pass a dictionary instead:
from bs4 import BeautifulSoup
data='<p class="story">xxx </p> <p id="2">yyy</p> <p class="story"> zzz</p>'
soup = BeautifulSoup(data, "html.parser")
def removeAttrib(soup, tag, argdict):
    for x in soup.findAll(tag, argdict):
        x.decompose()
removeAttrib(soup, "p", {"class": "story"})
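One thing to watch with the snippet above: decompose() removes each matching element from the tree, whereas the expected result in the question keeps the <p> elements and drops only their class attribute. A minimal sketch contrasting the two, assuming bs4 is installed (the variable names soup_a, soup_b, removed_elements and removed_attrs are made up for illustration):

```python
from bs4 import BeautifulSoup

data = '<p class="story">xxx </p> <p id="2">yyy</p> <p class="story"> zzz</p>'

# Variant A: decompose() deletes the matching <p> elements themselves.
soup_a = BeautifulSoup(data, "html.parser")
for p in soup_a.find_all("p", {"class": "story"}):
    p.decompose()
removed_elements = str(soup_a)

# Variant B: del on the attribute keeps the elements and drops only "class".
soup_b = BeautifulSoup(data, "html.parser")
for p in soup_b.find_all("p", {"class": "story"}):
    del p["class"]
removed_attrs = str(soup_b)

print(removed_elements)
print(removed_attrs)
```

Variant B is the one that matches the expected result stated in the question.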
Answer 1 (score: 1)
Credit to MYGz and commandlineluser:
from bs4 import BeautifulSoup
data='<p class="story">xxx </p> <p id="2">yyy</p> <p class="story"> zzz</p>'
def removeAttrib(data, tag, kwargs):
    soup = BeautifulSoup(data, "html.parser")
    for x in soup.findAll(tag, kwargs):
        for key in kwargs:
            # print(key) #>>class
            x.attrs.pop(key, None) # attrs: to access the actual dict
            #del x[key] would work also but will throw a KeyError if no key
    print(soup)
    return soup
data=removeAttrib(data,"p",{"class":"story"})
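On the two side questions (the ckey=kwargs.keys() attempt and PS1), the cause seems to be plain Python argument handling rather than anything in Beautiful Soup. A small sketch without bs4; remove_attrib_kw is a hypothetical stand-in that only mirrors a **kwargs signature like the one in the question:

```python
def remove_attrib_kw(tag, **kwargs):
    # **kwargs only collects keyword arguments and hands them over as a dict.
    return tag, kwargs

filters = {"class": "story"}

# Unpacking with ** turns each dict entry into a keyword argument.
tag, received = remove_attrib_kw("p", **filters)

# Passing the dict positionally does not bind to **kwargs: Python sees an
# extra positional argument and raises TypeError.
try:
    remove_attrib_kw("p", filters)
    positional_error = None
except TypeError as exc:
    positional_error = exc

# keys() returns a view over all keys, not the string "class" itself,
# so it cannot serve as a single attribute name. Iterate to get each key.
keys_view = filters.keys()
key_names = [key for key in keys_view]

print(received)
print(type(positional_error).__name__)
print(key_names)
```

So a function declared with **kwargs has to be called with ** unpacking, while a function that takes the dict as an ordinary parameter (as in the code above) can be called with the dict directly. In both cases, looping over the dict yields the attribute names one at a time.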