How to add dictionary elements from BeautifulSoup to a JSON file

Date: 2017-06-14 23:49:48

Tags: json python-3.x dictionary beautifulsoup

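Can you help me write a dictionary to a JSON file? I have already scraped all the tags from the page, but I am still confused about how to save all of them. Here is my code: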

array= []
data = {}
for divdata in soup.findAll('div', {"class": "ratio9_8 box_img fl mr10"}):
    for div in divdata.findAll('div', {'class': 'img_con lqd'}):
        for getatag in div.findAll('a', {'data-category': 'WP Kanal Berita'},href = True):
            for getimgtag in getatag.findAll('img',title=True,src=True):
                array.append(getimgtag['title'])
                array.append(getimgtag['src'])
                array.append(getatag['href'])
                data['title'] = array[0]
                data['image'] = array[1]
                data['link'] = array[2]
with open('data.json', 'w') as outfile:
    json.dump(data, outfile)

When I run the program, I only get a single dictionary:

{"title": "......", "image": ".....", "link": "...."}

1 Answer:

Answer 0: (score: 0)

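Put the output statement inside the loop where you assign the data. As written, you overwrite data on every iteration and only write it once after the loops finish. If you change your code to: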

import json

# `soup` is the BeautifulSoup object built earlier from the page HTML.
for divdata in soup.findAll('div', {"class": "ratio9_8 box_img fl mr10"}):
    for div in divdata.findAll('div', {'class': 'img_con lqd'}):
        for getatag in div.findAll('a', {'data-category': 'WP Kanal Berita'}, href=True):
            for getimgtag in getatag.findAll('img', title=True, src=True):
                # Build a fresh dict per image. Indexing a shared, growing
                # list with [0], [1], [2] would always return the first match.
                data = {
                    'title': getimgtag['title'],
                    'image': getimgtag['src'],
                    'link': getatag['href'],
                }
                # Append one JSON object per line so the file stays parseable.
                with open('data.json', 'a') as outfile:
                    json.dump(data, outfile)
                    outfile.write('\n')

That should give you what you want.
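
One caveat with appending inside a loop: the file ends up holding several JSON objects rather than one JSON document, so it helps to write one object per line and read it back line by line. Here is a minimal sketch of that round trip. The sample records and the temporary file path are made up for illustration and stand in for the scraped values.

```python
import json
import os
import tempfile

# Hypothetical sample records standing in for the scraped values.
records = [
    {"title": "Title A", "image": "a.jpg", "link": "/a"},
    {"title": "Title B", "image": "b.jpg", "link": "/b"},
]

# Write to a fresh temporary directory so reruns do not append to old output.
path = os.path.join(tempfile.mkdtemp(), "data.json")

# Append each record as its own line, as a loop over scraped tags would.
for record in records:
    with open(path, "a") as outfile:
        json.dump(record, outfile)
        outfile.write("\n")

# Calling json.load() on the whole file would raise an "Extra data" error,
# because the file holds several documents. Parse it line by line instead.
with open(path) as infile:
    loaded = [json.loads(line) for line in infile if line.strip()]

print(loaded)
```

Keep in mind that 'a' mode also keeps whatever a previous run wrote, so delete or truncate the file first if you want only the latest results.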

Or you could collect everything in a list and write it once:

import json

# `soup` is the BeautifulSoup object built earlier from the page HTML.
data_list = []
for divdata in soup.findAll('div', {"class": "ratio9_8 box_img fl mr10"}):
    for div in divdata.findAll('div', {'class': 'img_con lqd'}):
        for getatag in div.findAll('a', {'data-category': 'WP Kanal Berita'}, href=True):
            for getimgtag in getatag.findAll('img', title=True, src=True):
                # Append a new dict each time. Appending one shared dict
                # would make every list entry refer to the same object.
                data_list.append({
                    'title': getimgtag['title'],
                    'image': getimgtag['src'],
                    'link': getatag['href'],
                })
# Wrap the list in a single object and write it as one valid JSON document.
data = {'data_list': data_list}
with open('data.json', 'w') as outfile:
    json.dump(data, outfile)
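
A point worth checking with the list approach is that each appended item is a newly created dict. The sketch below compares reusing one dict with building a fresh one per item. The `rows` values and both helper functions are hypothetical, used only to show the difference without needing a real page.

```python
import json

# Hypothetical sample values standing in for the scraped title/src/href.
rows = [
    ("Title A", "a.jpg", "/a"),
    ("Title B", "b.jpg", "/b"),
]


def collect_shared(rows):
    """Reuse one dict: every list entry ends up being the same object."""
    data = {}
    out = []
    for title, image, link in rows:
        data['title'] = title
        data['image'] = image
        data['link'] = link
        out.append(data)  # appends a reference, not a copy
    return out


def collect_fresh(rows):
    """Build a new dict per row, so each entry keeps its own values."""
    return [{'title': t, 'image': i, 'link': l} for t, i, l in rows]


shared = collect_shared(rows)
fresh = collect_fresh(rows)

print(json.dumps(shared))  # both entries show the last row's values
print(json.dumps(fresh))   # each entry shows its own row
```

With the shared dict, the JSON file contains the last scraped item repeated once per match, which is easy to mistake for a scraping problem.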