The HTML is:
<div class="_3u1 _gli _uvb" data-bt='{"id":xxxx,"rank":11,"abtest_version":null,"abtest_params":{"abtest_version":null,"origin":"A","ranker":null},"section":"main_column","owner_id":null,"sub_id":null,"browse_location":null,"query_data":[],"is_headline":false}'>
My code is:
for profileid in soup.find_all("div", "_3u1 _gli _uvb"):
    for fbid in profileid.find_all("data-bt"):
        worksheet.write(row, 0, fbid.get("id"))
        print(fbid.get("id"))
        row += 1
The output I get is:
{"id":xxxxxx,"rank":1,"abtest_version":null,"abtest_params":{"abtest_version":null,"origin":"A","ranker":null},"section":"main_column","owner_id":null,"sub_id":null,"browse_location":null,"query_data":[],"is_headline":false}
How can I get just the xxxxx value returned? Thanks in advance.
Answer 0 (score: 1)
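You can parse data-bt, since it contains valid JSON. Note that data-bt is an attribute of the div, not a child tag, so find_all("data-bt") will not match it; read it from the tag's attributes instead.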
import json

found = soup.find_all("div", "_3u1 _gli _uvb")
for fbid in found:
    ...
    # data-bt holds a JSON string; decode it into a dict
    bt_json = json.loads(fbid.attrs['data-bt'])
    print(bt_json['id'])
    ...
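
If some of the matched divs might lack data-bt, or carry a value that is not valid JSON, the parsing step can be pulled into a small helper that returns None instead of raising. This is only a sketch: the helper name extract_bt_id is hypothetical, and the id 12345 in the sample string is made up, since the real value is elided in the question.

```python
import json


def extract_bt_id(raw):
    """Return the "id" field of a data-bt attribute string, or None.

    Hypothetical helper, not part of BeautifulSoup or the original answer.
    """
    if not raw:
        return None  # attribute missing or empty on this element
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # attribute present but not valid JSON
    if not isinstance(data, dict):
        return None  # valid JSON, but not an object with keys
    return data.get("id")


# Made-up sample value; the question shows the id only as a placeholder.
sample = '{"id":12345,"rank":11,"section":"main_column","is_headline":false}'
print(extract_bt_id(sample))
```

Inside the loop from the answer this would be called as extract_bt_id(fbid.get("data-bt")), and the result could then be passed to worksheet.write(row, 0, ...) as in the question.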