Creating a pandas DataFrame from multiple lists

Time: 2019-12-02 14:37:30

Tags: python pandas

I am new to pandas and Python.

I am trying to create a pandas DataFrame from 7 lists. Each of the 7 lists has the following structure:

[{'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzru7hs6V5gIVC', 'account_id': 85194250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - German', 'adgroup_name': 'bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsL0mdmW5gIVjLT', 'account_id': 85994250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - French', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2056'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsfm8-qW5gIVibTtCh2Jx__D_BwE', 'account_id': 8593250, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - Italian', 'adgroup_name': 'vpn gratis | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2380'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsj0o7GW5gID_BwE', 'account_id': 85931250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - Swedish', 'adgroup_name': 'exact/bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2752'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobYASAAEgKx6fD_BwE', 'account_id': 854250, 'account_name': 'T2', 'campaign_name': 'Exact/BMM - Dutch', 'adgroup_name': 'vpn verbinding | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2528'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobCVSrTtCh009QTtEAAYASAAEgLx9PD_BwE', 'account_id': 859350, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - German', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}]
[{'date': '2019-12-02', 'gclid': 'EAIaIQobwefwefwfChMIzru7hs6V5gIVC', 'account_id': 85194250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - German', 'adgroup_name': 'bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsL0mdmW5gIVjLT', 'account_id': 85994250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - French', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2056'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsfm8-qW5gIVibTtCh2Jx__D_BwE', 'account_id': 8593250, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - Italian', 'adgroup_name': 'vpn gratis | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2380'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsj0o7GW5gID_BwE', 'account_id': 85931250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - Swedish', 'adgroup_name': 'exact/bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2752'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobYASAAEgKx6fD_BwE', 'account_id': 854250, 'account_name': 'T2', 'campaign_name': 'Exact/BMM - Dutch', 'adgroup_name': 'vpn verbinding | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2528'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobCVSrTtCh009QTtEAAYASAAEgLx9PD_BwE', 'account_id': 859350, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - German', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}]
[{'date': '2019-12-02', 'gclid': 'EAIaIqdfwefwfChMIzru7hs6V5gIVC', 'account_id': 85194250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - German', 'adgroup_name': 'bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsL0mdmW5gIVjLT', 'account_id': 85994250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - French', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2056'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsfm8-qW5gIVibTtCh2Jx__D_BwE', 'account_id': 8593250, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - Italian', 'adgroup_name': 'vpn gratis | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2380'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsj0o7GW5gID_BwE', 'account_id': 85931250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - Swedish', 'adgroup_name': 'exact/bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2752'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobYASAAEgKx6fD_BwE', 'account_id': 854250, 'account_name': 'T2', 'campaign_name': 'Exact/BMM - Dutch', 'adgroup_name': 'vpn verbinding | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2528'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobCVSrTtCh009QTtEAAYASAAEgLx9PD_BwE', 'account_id': 859350, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - German', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}]
...

Each item in a list has 9 keys:

'date'
'gclid'
'account_id'
'account_name'
'campaign_name'
'adgroup_name'
'source'
'clicks'
'criteria_id_country'

I am trying to create a DataFrame that has these columns and holds the values from these lists:

date   gclid   account_id   account_name   adgroup_name   source   clicks   criteria_id_country

I am using this function to collect the data:

client_accounts = [1,2,3,4,5,6,7]

def get_full_click_list(account, date):
    full_list = []
    for item in client_accounts:
        full_list.append(get_adwords_clicks(item, date))
    print(full_list)
get_full_click_list(client_accounts, '2019-12-02')

The resulting full_list has the following structure:

[[items from 1st query],[items from 2nd query]...[items from 7th query]]

Each query list has the following structure:

[items from query 1..7] =

[{'date': '2019-12-02', 'gclid': 'EAIaIqdfwefwfChMIzru7hs6V5gIVC', 'account_id': 85194250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - German', 'adgroup_name': 'bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsL0mdmW5gIVjLT', 'account_id': 85994250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - French', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2056'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsfm8-qW5gIVibTtCh2Jx__D_BwE', 'account_id': 8593250, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - Italian', 'adgroup_name': 'vpn gratis | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2380'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobChMIzsj0o7GW5gID_BwE', 'account_id': 85931250, 'account_name': 'T2', 'campaign_name': 'Generic - Exact/Bmm - Swedish', 'adgroup_name': 'exact/bmm', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2752'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobYASAAEgKx6fD_BwE', 'account_id': 854250, 'account_name': 'T2', 'campaign_name': 'Exact/BMM - Dutch', 'adgroup_name': 'vpn verbinding | exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2528'}, {'date': '2019-12-02', 'gclid': 'EAIaIQobCVSrTtCh009QTtEAAYASAAEgLx9PD_BwE', 'account_id': 859350, 'account_name': 'T2', 'campaign_name': 'Exact/Bmm - German', 'adgroup_name': 'exact', 'source': 'adwords', 'clicks': 1, 'criteria_id_country': 'geoTargetConstants/2276'}

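How would I go about extracting the information from my full_list? Or, alternatively, how can I add the information to a pandas DataFrame each time I query a list?

Thanks for your suggestions.

3 Answers: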

Answer 0: (score: 2)

You can use concat:

ret_df = pd.concat(pd.DataFrame(lst) for lst in [lst1, lst2, lst3, ...])
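A minimal sketch of the concat approach, assuming `full_list` is shaped like the nested list in the question (one inner list of dicts per account query). The sample rows are made up and cut down to three keys. `ignore_index=True` is an optional addition here: it renumbers the rows from 0 instead of repeating each sub-frame's own index.

```python
import pandas as pd

# Hypothetical stand-in for full_list: one inner list per account query.
full_list = [
    [
        {"date": "2019-12-02", "account_id": 1, "clicks": 1},
        {"date": "2019-12-02", "account_id": 1, "clicks": 2},
    ],
    [
        {"date": "2019-12-02", "account_id": 2, "clicks": 3},
    ],
]

# Build one DataFrame per inner list, then stack them row-wise.
ret_df = pd.concat(
    (pd.DataFrame(lst) for lst in full_list),
    ignore_index=True,
)
print(ret_df)
```

Each dict becomes one row and each key becomes one column, so no manual extraction of the values is needed.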

Update: if you create the lists one at a time, you can append to a given DataFrame:

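ret_df = None
for client in client_list:
    # Fetch the list of dicts for a single client.
    lst = get_adwords_clicks(client, date)
    if ret_df is None:
        ret_df = pd.DataFrame(lst)
    else:
        # DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
        ret_df = pd.concat([ret_df, pd.DataFrame(lst)])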
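When the lists arrive one at a time, another option is to collect the per-client frames in a plain Python list and concatenate once after the loop, which avoids copying the growing frame on every iteration. In this sketch `fetch_clicks` is a hypothetical stand-in for the real query function and returns made-up rows.

```python
import pandas as pd


def fetch_clicks(client, date):
    """Hypothetical stand-in for the real per-client query; returns a list of dicts."""
    return [{"date": date, "account_id": client, "clicks": 1}]


client_list = [1, 2, 3]

frames = []
for client in client_list:
    lst = fetch_clicks(client, "2019-12-02")
    frames.append(pd.DataFrame(lst))

# One concat at the end instead of growing the frame inside the loop.
ret_df = pd.concat(frames, ignore_index=True)
print(ret_df)
```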

Answer 1: (score: 0)

Just do the following:

pd.DataFrame(list_of_dicts)

where list_of_dicts is a single list containing every dict you have (basically the sum of the lists you wrote above):

list_a = [{"key1":"value1", "key2":"value2"}]
list_b = [{"key1":"value1", "key2":"value2"}]
list_c = [{"key1":"value1", "key2":"value2"}]

list_of_dicts = list_a + list_b + list_c

pd.DataFrame(list_of_dicts)
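If the data is already in the nested `full_list` shape from the question, the per-query lists can be flattened into one list of dicts before the single `pd.DataFrame` call. A sketch with made-up rows; `itertools.chain.from_iterable` is part of the standard library.

```python
from itertools import chain

import pandas as pd

# Hypothetical stand-in for full_list: a list of per-query lists of dicts.
full_list = [
    [{"gclid": "a", "clicks": 1}, {"gclid": "b", "clicks": 1}],
    [{"gclid": "c", "clicks": 2}],
]

# Flatten one level: [[d1, d2], [d3]] -> [d1, d2, d3]
list_of_dicts = list(chain.from_iterable(full_list))

df = pd.DataFrame(list_of_dicts)
print(df)
```

This works for any number of inner lists, so the 7 lists do not need to be named and added by hand.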

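Answer 2: (score: 0)

# output = your full_list (a list of lists of dicts)

# Pass each inner list (not i[0], which is a single dict) to pd.DataFrame.
df = pd.concat([pd.DataFrame(i) for i in output], axis=0, sort=False)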
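To illustrate what `axis=0` and `sort=False` do in this call, here is a small sketch with made-up rows in which the second list carries one extra key. The assumption that some queries return different keys is only for illustration; in the question every item has the same 9 keys.

```python
import pandas as pd

# Hypothetical output: the second inner list has an extra 'campaign_name' key.
output = [
    [{"date": "2019-12-02", "clicks": 1}],
    [{"date": "2019-12-02", "clicks": 2, "campaign_name": "Exact/Bmm - German"}],
]

# axis=0 stacks the frames row-wise; sort=False keeps the columns in order of
# first appearance instead of sorting them alphabetically.
df = pd.concat([pd.DataFrame(i) for i in output], axis=0, sort=False)
print(df)
```

Rows that lack a column are filled with NaN for that column.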