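I have two columns of data (sample data), and I want to count the total number of users for each day of the week.
For example, I would like my output to look like this (a dict or a list would do):
Monday: 25, Tuesday: 30, Wednesday: 45, Thursday: 50, Friday: 24, Saturday: 22, Sunday: 21
This is my attempt: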
def rider_ship (filename):
    with open('./data/Washington-2016-Summary.csv','r') as f_in:
        Sdict = []
        Cdict = []
        reader = csv.DictReader(f_in)
        for row in reader:
            if row['user_type']=="Subscriber":
                if row['day_of_week'] in Sdict:
                    Sdict[row['day_of_week']]+=1
                else:
                    Sdict [row['day_of_week']] = row['day_of_week']
            else:
                if row ['day_of_week'] in Cdict:
                    Cdict[row['day_of_week']] +=1
                else:
                    Cdict[row['day_of_week']] = row['day_of_week']
        return Sdict, Cdict
        print (Sdict)
        print (Cdict)
t= rider_ship ('./data/Washington-2016-Summary.csv')
print (t)
TypeError: list indices must be integers or slices, not str
Answer 0 (score: 0)
How about using pandas?
Let's first create a file-like object using the io library:
import io
s = u"""day_of_week,user_type
Monday,subscriber
Tuesday,customer
Tuesday,subscriber
Tuesday,subscriber"""
file = io.StringIO(s)
Now for the actual code:
import pandas as pd
df = pd.read_csv(file) # "path/to/file.csv"
Sdict = df[df["user_type"] == "subscriber"]["day_of_week"].value_counts().to_dict()
Cdict = df[df["user_type"] == "customer"]["day_of_week"].value_counts().to_dict()
Now we have:
Sdict = {'Tuesday':2,'Monday':1}
Cdict = {'Tuesday':1}
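If you would rather stay with the standard library: the TypeError comes from creating Sdict and Cdict as lists ([]), and a list cannot be indexed by a string such as 'Monday'. The else branches would also store the day name instead of a starting count of 1. Below is a minimal sketch of one way around both problems using collections.Counter. It makes a few assumptions that are not in the question: the function takes an open file object rather than a filename so the sample data above can be reused, the user_type comparison is case-insensitive, the WEEKDAYS list is a helper introduced here only to print days in calendar order, and every row that is not a subscriber is counted as a customer.

```python
import csv
import io
from collections import Counter

# Assumed helper: calendar order for the output, with full English day names.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def rider_ship(f_in):
    """Count rows per weekday, split into subscribers and all other users."""
    subscribers = Counter()
    customers = Counter()
    for row in csv.DictReader(f_in):
        # Case-insensitive match: the question uses "Subscriber",
        # the sample data uses "subscriber".
        if row["user_type"].lower() == "subscriber":
            subscribers[row["day_of_week"]] += 1
        else:
            customers[row["day_of_week"]] += 1
    # Build plain dicts in calendar order, skipping days that never occur.
    s_dict = {day: subscribers[day] for day in WEEKDAYS if day in subscribers}
    c_dict = {day: customers[day] for day in WEEKDAYS if day in customers}
    return s_dict, c_dict


sample = io.StringIO("""day_of_week,user_type
Monday,subscriber
Tuesday,customer
Tuesday,subscriber
Tuesday,subscriber""")

Sdict, Cdict = rider_ship(sample)
print(Sdict)
print(Cdict)
```

For a real file, pass the result of open(path, newline='') to the function. A Counter returns 0 for a missing key, so no "if key in dict" check is needed before incrementing.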