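After retrieving their respective business IDs, I am trying to extract the opening hours for a list of restaurants by querying the Yelp API.
The function I originally defined is: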
def is_clocked(business_id):
    #import pdb; pdb.set_trace()
    try:
        clocked_ind = get_business(API_KEY, business_id)
        clocked_ind1 = clocked_ind['hours']
    except:
        clocked_ind1 = 'None'
    return clocked_ind1

clocked_ind = is_clocked(b_id)
print(clocked_ind)
However, this function returns the data in long format rather than wide format:
bad_format:
Querying https://api.yelp.com/v3/businesses/9GzjKeifGJ6KzWkaPftYHg ...
[{'open': [{'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 0}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 1}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 2}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 3}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 4}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 5}, {'is_overnight': False, 'start': '1100', 'end': '2100', 'day': 6}], 'hours_type': 'REGULAR', 'is_open_now': True}]
I would like my final output to look like this in the CSV:
(Input)
day = [0, 1, 2, 3, 4, 5, 6]
start = [1100, 1100, 1100, 1100, 1100, 1100, 1100]
end = [2200, 2200, 2200, 2200, 2200, 2200, 2100]
day1 = []
for i in day:
    day1.append("start"+str(i))
for i in range(len(day1)):
    merge_HOO[day1[i]]=start[i]
pd.DataFrame(merge_HOO, index=[0])
#Desired Output[115]:
item day start end end0 ... start2 start3 start4 start5 start6
0 0 0 1100 2200 2200 ... 1100 1100 1100 1100 1100
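Note, however, that I hand-coded the input for one specific business. I would like to create a loop that returns the desired output in the CSV for every business_id. I also wrote the code below, but I feel there must be a better way to do this loop. The code below needs to be a function: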
day = day_open(b_id)
start = day_start(b_id)
end = day_end(b_id)
day1 = []
for i in day:
    day1.append("start"+str(i))
dict1 = {}
for i in range(len(day1)):
    dict1[day1[i]]=start[i]
start_df = pd.DataFrame(dict1, index=[0])
day2 = []
for i in day:
    day2.append("end"+str(i))
dict2 = {}
for i in range(len(day2)):
    dict2[day2[i]]=end[i]
end_df = pd.DataFrame(dict2, index=[0])
# Copy each end column (end0 ... end6) into start_df
for col in end_df.columns:
    start_df[col] = end_df[col]
I would like to adapt it to a loop like the one below:
def id_loop(a):
    empty = []
    for i in input_range:
        review_count_ind = is_review_count(a[i])
        empty.append(review_count_ind)
    return empty

c = id_loop(a)
Answer 0 (score: 0)
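Your expected output is inconsistent with your data. Are item and is_overnight the same thing? You have both end and end0; what does that mean? So I am assuming that item is is_overnight and that every column name should have the day appended. Because your result is a list, I am also assuming that its length can vary.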
import pandas as pd
clocked_ind = [{'open': [{'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 0}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 1}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 2}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 3}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 4}, {'is_overnight': False, 'start': '1100', 'end': '2200', 'day': 5}, {'is_overnight': False, 'start': '1100', 'end': '2100', 'day': 6}], 'hours_type': 'REGULAR', 'is_open_now': True}]
# new_list[0] collects the column names; each following entry is one row of values
new_list = [[] for _ in range(len(clocked_ind)+1)]
for i in range(len(clocked_ind)):
    new_list[i+1] = []
    for day_dict in clocked_ind[i]['open']:
        suffix = str(day_dict['day'])
        for k, v in day_dict.items():
            key = k if k != 'is_overnight' else 'item'
            column_name = key + suffix
            if column_name not in new_list[0]:
                new_list[0].append(column_name)
            new_list[i+1].append(v)  # if you want item to store int instead of bool replace with append(int(v))
df = pd.DataFrame(new_list[1:], columns=new_list[0])
print(df)
Output
item0 day0 end0 start0 item1 day1 end1 start1 item2 day2 ... \
0 False 0 2200 1100 False 1 2200 1100 False 2 ...
end4 start4 item5 day5 end5 start5 item6 day6 end6 start6
0 2200 1100 False 5 2200 1100 False 6 2100 1100
[1 rows x 28 columns]
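
Since the question asks for a function that can be looped over every business_id, the same flattening could be wrapped up roughly as follows. This is only a sketch under a few assumptions: the helper names hours_to_row and build_hours_table are made up for illustration, the second business ID is hypothetical, the day value is dropped because it is already encoded in the column suffix, and the input is taken to be a dict mapping each business_id to whatever is_clocked returned for it (including the 'None' string on failure).

```python
import pandas as pd

def hours_to_row(business_id, hours):
    """Flatten one 'hours' payload into a single wide-format dict (hypothetical helper)."""
    row = {"business_id": business_id}
    # is_clocked returns the string 'None' on failure, so anything that
    # is not a list is treated as "no hours available".
    if not isinstance(hours, list):
        return row
    for block in hours:
        for day_dict in block.get("open", []):
            suffix = str(day_dict["day"])
            for key, value in day_dict.items():
                if key == "day":
                    continue  # the day number already lives in the column suffix
                row[key + suffix] = value
    return row

def build_hours_table(hours_by_id):
    """Build one row per business; columns missing for a business become NaN."""
    rows = [hours_to_row(b_id, hours) for b_id, hours in hours_by_id.items()]
    return pd.DataFrame(rows)

# Sample data mirroring the response shown in the question.
sample_open = [
    {"is_overnight": False, "start": "1100", "end": "2200", "day": d}
    for d in range(6)
]
sample_open.append({"is_overnight": False, "start": "1100", "end": "2100", "day": 6})

hours_by_id = {
    "9GzjKeifGJ6KzWkaPftYHg": [
        {"open": sample_open, "hours_type": "REGULAR", "is_open_now": True}
    ],
    "hypothetical-id-without-hours": "None",  # assumed failure case
}

df = build_hours_table(hours_by_id)
print(df)
```

From there, df.to_csv("hours.csv", index=False) would write one line per business. Note that if a business lists more than one interval for the same day, this sketch keeps only the last one, so it would need an extra counter in the column name to handle that case.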