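After retrieving the business ID for each restaurant in my list, I am trying to pull their hours of operation from the Yelp API.
The function I originally defined is: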
def is_clocked(business_id):
    # import pdb; pdb.set_trace()
    try:
        clocked_ind = get_business(API_KEY, business_id)
        clocked_ind1 = clocked_ind['hours']
    except Exception:
        # Fall back to the string 'None' when the request fails or 'hours' is missing
        clocked_ind1 = 'None'
    return clocked_ind1

clocked_ind = is_clocked(b_id)
print(clocked_ind)
However, this function gives me the data in long format rather than wide format:
bad_format:
Querying https://api.yelp.com/v3/businesses/9GzjKeifGJ6KzWkaPftYHg ...
[{'open': [{'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 0}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 1}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 2}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 3}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 4}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 5}, {'is_overnight': False, 'start': '1100', 'end': '2100', 'day': 6}], 'hours_type': 'REGULAR', 'is_open_now': True}]
I would like my final output in the CSV to look like this:
(Input)
day = [0, 1, 2, 3, 4, 5, 6]
start = [1100, 1100, 1100, 1100, 1100, 1100, 1100]
end = [2200, 2200, 2200, 2200, 2200, 2200, 2100]
day1 = []
for i in day:
    day1.append("start" + str(i))
for i in range(len(day1)):
    # merge_HOO is not defined in this snippet; it is assumed to exist already
    merge_HOO[day1[i]] = start[i]
pd.DataFrame(merge_HOO, index=[0])
#Desired Output[115]:
item day start end end0 ... start2 start3 start4 start5 start6
0 0 0 1100 2200 2200 ... 1100 1100 1100 1100 1100
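For reference, the same wide row can be built from the three lists without the intermediate name lists. This is only a minimal sketch: `wide_row` is a hypothetical helper name that is not part of my original code, and it assumes `day`, `start`, and `end` all have the same length.

```python
# Hand-coded input for one business, copied from the example above
day = [0, 1, 2, 3, 4, 5, 6]
start = [1100, 1100, 1100, 1100, 1100, 1100, 1100]
end = [2200, 2200, 2200, 2200, 2200, 2200, 2100]


def wide_row(day, start, end):
    """Build one wide row: start0..startN followed by end0..endN.

    Hypothetical helper; assumes the three lists are aligned by position.
    """
    row = {f"start{d}": s for d, s in zip(day, start)}
    row.update({f"end{d}": e for d, e in zip(day, end)})
    return row


row = wide_row(day, start, end)
print(row)
```

Passing `[row]` to `pd.DataFrame` should then give a one-row frame with one column per key.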
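Note, however, that I hard-coded the input by hand for one specific business. I would like to write a loop that returns the desired output in the CSV for every business_id. I also wrote the code below, but I feel there must be a better way to do this looping. The code below needs to be a function: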
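import pandas as pd

# day_open, day_start and day_end are my own helpers that return one list per business
day = day_open(b_id)
start = day_start(b_id)
end = day_end(b_id)

# Build the start0 ... start6 columns
day1 = []
for i in day:
    day1.append("start" + str(i))
dict1 = {}
for i in range(len(day1)):
    dict1[day1[i]] = start[i]
start_df = pd.DataFrame(dict1, index=[0])

# Build the end0 ... end6 columns
day2 = []
for i in day:
    day2.append("end" + str(i))
dict2 = {}
for i in range(len(day2)):
    dict2[day2[i]] = end[i]
end_df = pd.DataFrame(dict2, index=[0])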
# Copy every end column into start_df; repeating the 'end0' assignment only ever copies one column
for col in end_df.columns:
    start_df[col] = end_df[col]
This is a really tricky problem!
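One possible direction, shown only as a rough sketch: flatten the `hours` payload straight into a single dict per business, then write all the dicts out as CSV rows. The names `hours_to_wide_row` and `rows_to_csv` are hypothetical, the sample payload mirrors the response printed above, and `"missing-hours-id"` is a made-up ID standing in for a business whose lookup failed. The sketch assumes at most one opening slot per day and uses the standard-library `csv` module so it runs without the API.

```python
import csv
import io


def hours_to_wide_row(business_id, hours):
    """Flatten a Yelp-style 'hours' list into one dict with startN/endN keys."""
    row = {"business_id": business_id}
    # is_clocked returns the string 'None' on failure, so treat that as "no hours"
    if not hours or hours == "None":
        return row
    for slot in hours[0].get("open", []):
        day = slot["day"]
        # Assumption: one slot per day; a second slot for the same day overwrites the first
        row[f"start{day}"] = slot["start"]
        row[f"end{day}"] = slot["end"]
    return row


def rows_to_csv(rows):
    """Render the rows as CSV text with a fixed start0..start6, end0..end6 layout."""
    columns = ["business_id"] + [
        f"{kind}{d}" for kind in ("start", "end") for d in range(7)
    ]
    buffer = io.StringIO()
    # restval="" leaves a blank cell for any day a business has no hours for
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample payload shaped like the API response shown in the question
sample_hours = [{
    "open": [
        {"is_overnight": False, "start": "1100",
         "end": "2100" if d == 6 else "2200", "day": d}
        for d in range(7)
    ],
    "hours_type": "REGULAR",
    "is_open_now": True,
}]

rows = [
    hours_to_wide_row("9GzjKeifGJ6KzWkaPftYHg", sample_hours),
    hours_to_wide_row("missing-hours-id", "None"),  # hypothetical failed lookup
]
csv_text = rows_to_csv(rows)
print(csv_text)
```

In the real loop, the second argument would be `is_clocked(b_id)` for each ID. With pandas, `pd.DataFrame(rows).to_csv(...)` should give a similar file, although the column order then follows the keys of the rows rather than the fixed layout used here.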