I want to download data from the public source https://fred.stlouisfed.org/categories/10/downloaddata for an academic assignment.
The source file contains more than 10K series IDs, and I want to get the data for all of them. The file is available at https://fred.stlouisfed.org/categories/10/downloaddata/EP_csv_2.zip
The Python client "fredapi" returns data for one series ID per call, and each fetch-and-process step takes at least 2-3 seconds. With 10K+ series IDs, the whole run will take more than 5 hours.
We cannot fetch data for multiple series IDs in a single request. Is there any way to apply multithreading here, so that I can fetch data simultaneously and reduce the processing time?
The script below fetches the data one by one in a for loop. Is there a better way to reduce the run time by sending multiple requests at once?
import time
import pandas as pd
from fredapi import Fred

fred = Fred(api_key='QWERTYUIOPASDFGHJKLZXCVBNM')
Series_List = pd.read_excel(r'D:\Sunil_Work\temp8\Series_List.xlsx', sheet_name='Source', dtype=object)
Frames = []  # one DataFrame per series; combined once after the loop

for I, Series_Id in enumerate(Series_List['Series_Id'], start=1):
    Data = None
    Started = time.time()
    # Retry the request until it succeeds or 1 minute has passed
    while Data is None:
        try:
            Data = fred.get_series(Series_Id)
        except Exception:
            if (time.time() - Started) / 60 > 1:
                break
    if isinstance(Data, pd.Series):
        Data = pd.DataFrame({'Date': Data.index, 'Value': Data.values})
        Data['Series_Id'] = Series_Id
        Frames.append(Data)
    else:
        print('\n>> Error, NO valid Data Retrieved')

# DataFrame.append was removed in pandas 2.0, so concatenate once instead
Data_All = pd.concat(Frames, sort=False) if Frames else pd.DataFrame()
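
For reference, this is roughly the kind of multithreaded approach I have in mind. It is only a minimal sketch using the standard-library `concurrent.futures.ThreadPoolExecutor`. The helper names `fetch_with_retry` and `fetch_all`, the `max_workers=8` default, and the `attempts=3` retry count are my own assumptions, not part of fredapi. I have also not confirmed that `fred.get_series` is safe to call from several threads, or how many parallel requests FRED tolerates before rate limiting.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_with_retry(fetch_one, series_id, attempts=3):
    """Call fetch_one(series_id), retrying on any exception up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch_one(series_id)
        except Exception as exc:  # broad on purpose: network errors vary
            last_error = exc
    raise last_error


def fetch_all(series_ids, fetch_one, max_workers=8, attempts=3):
    """Fetch many series concurrently.

    Returns (results, failed): results maps series_id -> fetched data,
    failed maps series_id -> the last exception raised for that id.
    """
    results, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per series id and remember which future is which
        futures = {
            pool.submit(fetch_with_retry, fetch_one, sid, attempts): sid
            for sid in series_ids
        }
        # Collect results as soon as each request finishes
        for future in as_completed(futures):
            sid = futures[future]
            try:
                results[sid] = future.result()
            except Exception as exc:
                failed[sid] = exc
    return results, failed


# Hypothetical usage with the objects from my script (not run here):
# results, failed = fetch_all(Series_List['Series_Id'], fred.get_series)
# Data_All = (pd.concat(results, names=['Series_Id', 'Date'])
#               .rename('Value').reset_index())
```

Because the work is dominated by waiting on the network, threads should overlap the requests even with the GIL. I would probably start with a small `max_workers` and increase it only if the server does not start rejecting requests.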