How do I populate the rows of a DataFrame with values from a dictionary?

Time: 2016-11-06 16:56:48

Tags: python dictionary dataframe xlwings quandl

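I am downloading metadata for financial datasets from quandl.com. The data arrives from quandl.com already in dictionary form. I want to take this data, organize it into a DataFrame, and then import it into Excel.

This is the text file ('Indicator_list.txt') that lists the financial datasets I download from quandl.com. I want the metadata for each of these symbols organized into one DataFrame.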

COM/OIL_WTI
BOE/XUDLADS
BOE/XUDLADD
BOE/XUDLB8KL
BOE/XUDLCDS
BOE/XUDLCDD

This is the code I am running:

import quandl
import pandas as pd

# This adjusts the layout in the command
# prompt so columns are displayed side by side
pd.set_option('expand_frame_repr', False)

# This "with open" statement opens the text file that
# has the symbols I want to get the metadata on.
# Blank lines (e.g. a trailing newline) are skipped.
with open('Indicator_list.txt') as file_object:
    tickers = [line.strip() for line in file_object if line.strip()]

# quandlmetadata is an empty dictionary that I am
# adding the metadata to
quandlmetadata = {}

# This loops through all the values in
# Indicator_list.txt
for i in tickers:

    # metadata represents one set of metadata
    metadata = quandl.Dataset(i).data().meta

This is the metadata output from quandl.com:

{'start_date': datetime.date(1975, 1, 2), 'column_names': ['Date', 'Value'], 'limit': None, 'collapse': None, 'order': 'asc', 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'column_index': None, 'frequency': 'daily'}

Next, still inside the loop, I add this to the quandlmetadata dictionary, using the current symbol from Indicator_list.txt (i) as the dictionary key.

    quandlmetadata[i] = metadata

This is the output of quandlmetadata:

{'BOE/XUDLADS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLB8KL': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(2011, 8, 1), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'COM/OIL_WTI': {'column_names': ['date', 'value'], 'end_date': datetime.date(2016, 11, 4), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1983, 3, 30), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLADD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}}

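Finally, I want to convert the quandlmetadata dictionary into a DataFrame (or something else, if there is a better way).

This is the last part of the code:

df = pd.DataFrame(
    index=list(quandlmetadata.keys()),
    columns=['transform', 'frequency', 'limit', 'end_date', 'collapse',
             'column_names', 'start_date', 'order', 'column_index'],
)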

Output of df:

             transform frequency limit end_date collapse column_names start_date order column_index
BOE/XUDLB8KL       NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLADS        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLADD        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLCDS        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
COM/OIL_WTI        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN
BOE/XUDLCDD        NaN       NaN   NaN      NaN      NaN          NaN        NaN   NaN          NaN

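The layout of df is exactly what I want: the tickers from Indicator_list.txt are the index, and the columns are metadata.keys(). The only thing I can't get working is populating the rows of the DataFrame with the values from the quandlmetadata dictionary. The end goal is to import this list into Excel, so if there is a way to do that without a DataFrame, I am very open to it.

1 Answer:

Answer 0: (score: 1)

Maybe you can use DataFrame.from_dict: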

In [15]: pd.DataFrame.from_dict(quandlmetadata, orient='index')
Out[15]: 
             column_index    end_date order   column_names  start_date collapse transform limit frequency
BOE/XUDLADD          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
BOE/XUDLADS          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
BOE/XUDLB8KL         None  2016-11-03   asc  [Date, Value]  2011-08-01     None      None  None     daily
BOE/XUDLCDD          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
BOE/XUDLCDS          None  2016-11-03   asc  [Date, Value]  1975-01-02     None      None  None     daily
COM/OIL_WTI          None  2016-11-04   asc  [date, value]  1983-03-30     None      None  None     daily
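
As a self-contained sketch of the call above, the snippet below rebuilds two entries of quandlmetadata by hand (copied from the question's output, so no Quandl request is needed) and passes them to DataFrame.from_dict. The `wanted` list is an assumption taken from the question's own DataFrame constructor; reordering the columns is optional.

```python
import datetime

import pandas as pd

# Two entries copied from the question's output. The real dictionary
# would be filled by the Quandl loop instead.
quandlmetadata = {
    'BOE/XUDLADS': {
        'column_names': ['Date', 'Value'],
        'end_date': datetime.date(2016, 11, 3),
        'transform': None, 'collapse': None, 'order': 'asc',
        'start_date': datetime.date(1975, 1, 2),
        'limit': None, 'column_index': None, 'frequency': 'daily',
    },
    'COM/OIL_WTI': {
        'column_names': ['date', 'value'],
        'end_date': datetime.date(2016, 11, 4),
        'transform': None, 'collapse': None, 'order': 'asc',
        'start_date': datetime.date(1983, 3, 30),
        'limit': None, 'column_index': None, 'frequency': 'daily',
    },
}

# orient='index' makes each outer key a row label and each
# inner key a column.
df = pd.DataFrame.from_dict(quandlmetadata, orient='index')

# Optional: put the columns in the order used in the question.
wanted = ['transform', 'frequency', 'limit', 'end_date', 'collapse',
          'column_names', 'start_date', 'order', 'column_index']
df = df[wanted]

print(df)
```

Because the rows are built straight from the dictionary, there is no need to create an empty DataFrame first and fill it afterwards.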

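I don't think the column_names column is very useful. You will also want to call pd.to_datetime on the date columns yourself, so that they become datetime64 columns rather than object columns holding datetime.date values.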
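
The two clean-up steps suggested above might look like the following. This is only a sketch: a single hand-written entry with a reduced set of keys stands in for the real quandlmetadata dictionary.

```python
import datetime

import pandas as pd

# One hand-written entry standing in for the real quandlmetadata dict.
quandlmetadata = {
    'BOE/XUDLB8KL': {
        'column_names': ['Date', 'Value'],
        'start_date': datetime.date(2011, 8, 1),
        'end_date': datetime.date(2016, 11, 3),
        'frequency': 'daily',
    },
}

df = pd.DataFrame.from_dict(quandlmetadata, orient='index')

# Drop the list-valued column that the answer considers not very useful.
df = df.drop(columns='column_names')

# Convert the date columns explicitly so they get a datetime64 dtype.
for col in ['start_date', 'end_date']:
    df[col] = pd.to_datetime(df[col])

print(df.dtypes)
```

From there, df.to_excel('quandl_metadata.xlsx') (a hypothetical file name; it needs an Excel writer such as openpyxl installed) or xlwings should be able to move the table into a workbook.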