How do I convert a String to a BeautifulSoup object?

Time: 2016-06-23 17:01:19

Tags: python beautifulsoup web-crawler html-parsing

I am trying to scrape a news website and need to change one parameter. I replaced it using the following code:

while i < len(links):
    conn = urllib.urlopen(links[i])
    html = conn.read()
    soup = BeautifulSoup(html)
    t = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
    n = str(t.find("div", attrs={'class':'entry cuerpo-noticias'}))
    print(p)

The problem is that "t" is a string, and find with attributes only works on BeautifulSoup objects. Do you know how to convert "t" to that type?

1 Answer:

Answer 0: (score: 7)

Just do the replacement before parsing:

html = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
soup = BeautifulSoup(html, "html.parser")
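To connect this back to the question, here is a minimal sketch of the whole flow: replace in the raw string, parse the result, then call find on the parsed object. The sample markup is a hypothetical stand-in for one downloaded page, not the real site's HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for one downloaded page.
html = """
<div class="row bigbox container mi-df-local locked-single">
  <div class="entry cuerpo-noticias">Article text</div>
</div>
"""

# Plain string replacement first; the result is still a str.
t = html.replace(
    'class="row bigbox container mi-df-local locked-single"',
    'class="row bigbox container mi-df-local single-local"',
)

# Passing the string to BeautifulSoup gives an object that supports find().
soup = BeautifulSoup(t, "html.parser")
entry = soup.find("div", attrs={"class": "entry cuerpo-noticias"})
print(entry.get_text(strip=True))
```

In other words, "converting" the string is just a matter of passing it to the BeautifulSoup constructor.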

Note that it is also possible to parse the HTML first, locate the elements, and modify the attributes of the Tag instances, for example:

soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]

Note that class is a special multi-valued attribute, which is why we set the value to a list of individual classes.

Demo:

from bs4 import BeautifulSoup

html = """
<div class="row bigbox container mi-df-local locked-single">test</div>
"""

soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]

print(soup.prettify())

Now see how the div element's classes were updated:

<div class="row bigbox container mi-df-local single-local">
 test
</div>
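Since Beautiful Soup exposes the class attribute as a list of strings, a single class can also be swapped without retyping the others. This is a variation on the approach above rather than part of the original answer, and the sample markup is only illustrative:

```python
from bs4 import BeautifulSoup

# Illustrative markup; only the "locked-single" class should change.
html = '<div class="row bigbox container mi-df-local locked-single">test</div>'

soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".locked-single"):
    # elm["class"] is a list of class names, so rebuild it with one name swapped.
    elm["class"] = [
        "single-local" if name == "locked-single" else name
        for name in elm["class"]
    ]

print(soup.div["class"])
```

This keeps the remaining classes intact even if the page adds or reorders them later.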