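The code below reads a CSV summary file for each city, counts the subscribers, customers, and other users, and computes their average trip duration in each city. Is there a way to simplify the if/elif statements inside the for loop?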
import csv

new_file = {'Washington': './data/Washington-2016-Summary.csv',
            'Chicago': './data/Chicago-2016-Summary.csv',
            'NYC': './data/NYC-2016-Summary.csv'}

for city, filename in new_file.items():
    with open(filename, 'r') as fil_1:
        # Running totals of trip duration and trip counts per user type.
        t_subscribers = 0
        t_customers = 0
        t_others = 0
        cnt_subscribers = 0
        cnt_customers = 0
        other_customers = 0
        data_reader = csv.DictReader(fil_1)
        for row in data_reader:
            if row['user_type'] == 'Subscriber':
                cnt_subscribers += 1
                t_subscribers += float(row['duration'])
            elif row['user_type'] == 'Customer':
                cnt_customers += 1
                t_customers += float(row['duration'])
            elif row['user_type'] == '':
                # Kept in a separate total so the customer average is not skewed.
                other_customers += 1
                t_others += float(row['duration'])
    # Totals are divided by 60 to report minutes.
    tripaverage_duration = ((t_subscribers + t_customers + t_others) / 60) / (cnt_subscribers + cnt_customers + other_customers)
    tripaverage_subscribers = (t_subscribers / 60) / cnt_subscribers
    tripaverage_customers = (t_customers / 60) / cnt_customers
    print('Average trip duration in', city, '-',
          tripaverage_duration, 'minutes')
    print('Average trip duration for subscribers in', city, '-',
          tripaverage_subscribers, 'minutes')
    print('Average trip duration for customers in', city, '-',
          tripaverage_customers, 'minutes')
    print('\n')
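Answer 0 (score: 0)
I recommend pandas DataFrames for this kind of task. You can easily subset a DataFrame based on the values in another column, then sum values, count rows, and so on. Here is an example of how to apply this to your problem: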
import pandas as pd

new_file = {'Washington': './data/Washington-2016-Summary.csv',
            'Chicago': './data/Chicago-2016-Summary.csv',
            'NYC': './data/NYC-2016-Summary.csv'}

for city, filename in new_file.items():
    data = pd.read_csv(filename)
    # Select the 'duration' column, average it, and divide by 60
    # to report minutes, as in the question.
    tripaverage_duration = data['duration'].mean() / 60
    # Boolean masks pick out the rows for each user type.
    tripaverage_subscribers = data.loc[data['user_type'] == 'Subscriber', 'duration'].mean() / 60
    tripaverage_customers = data.loc[data['user_type'] == 'Customer', 'duration'].mean() / 60
    print('Average trip duration in', city, '-',
          tripaverage_duration, 'minutes')
    print('Average trip duration for subscribers in', city, '-',
          tripaverage_subscribers, 'minutes')
    print('Average trip duration for customers in', city, '-',
          tripaverage_customers, 'minutes')
    print('\n')
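The per-type subsetting above can also be collapsed into a single `groupby` call, which removes the need for one line per user type. The following is only a sketch: the in-memory `sample_csv` is made-up data standing in for one city's file, and it assumes `duration` is in seconds and that blank `user_type` values should be reported under a hypothetical label `'Other'`.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for one city's summary CSV;
# 'duration' is assumed to be in seconds, as in the question.
sample_csv = io.StringIO(
    "duration,user_type\n"
    "600,Subscriber\n"
    "1200,Subscriber\n"
    "1800,Customer\n"
    "300,\n"
)

data = pd.read_csv(sample_csv)
# Blank user types are read as NaN; label them so groupby keeps them.
data["user_type"] = data["user_type"].fillna("Other")

# One pass gives the trip count and mean duration per user type.
summary = data.groupby("user_type")["duration"].agg(["count", "mean"])
summary["mean"] = summary["mean"] / 60  # seconds -> minutes
print(summary)
```

Inside the loop over cities, `sample_csv` would be replaced by `filename`; the rest should carry over unchanged if the real files use the same column names.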
Answer 1 (score: 0)
One option is to use list comprehensions like this:
# A DictReader can only be iterated once, so store the rows before making several passes.
rows = list(data_reader)
cnt_subscribers = sum(1 for row in rows if row['user_type'] == 'Subscriber')
t_subscribers = sum(float(row['duration']) for row in rows if row['user_type'] == 'Subscriber')
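Another way to drop the if/elif chain entirely, using only the standard library, is to key the running totals by `user_type` in a dictionary. This is a minimal sketch rather than a definitive implementation: the helper name `summarize`, the `'Other'` label for blank user types, and the sample data are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

def summarize(rows):
    """Return {user_type: (trip_count, average_minutes)} for an iterable of row dicts."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for row in rows:
        # Blank user types are grouped under the hypothetical label 'Other'.
        key = row['user_type'] or 'Other'
        counts[key] += 1
        totals[key] += float(row['duration'])
    # Durations are assumed to be in seconds; divide by 60 for minutes.
    return {key: (counts[key], totals[key] / 60 / counts[key]) for key in counts}

# Hypothetical sample data standing in for one city's summary CSV.
sample = io.StringIO(
    "duration,user_type\n"
    "600,Subscriber\n"
    "1200,Subscriber\n"
    "1800,Customer\n"
    "300,\n"
)
result = summarize(csv.DictReader(sample))
print(result)
```

Because the dictionary grows a key for each user type it encounters, a new type in the data needs no extra branch, and the file is read in a single pass.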