I am trying to parse a CSV file in Python and print the sum of order_total for each day. Here is a sample CSV file:
order_total created_datetime
24.99 2015-06-01 00:00:12
0 2015-06-01 00:03:15
164.45 2015-06-01 00:04:05
24.99 2015-06-01 00:08:01
0 2015-06-01 00:08:23
46.73 2015-06-01 00:08:51
0 2015-06-01 00:08:58
47.73 2015-06-02 00:00:25
101.74 2015-06-02 00:04:11
119.99 2015-06-02 00:04:35
38.59 2015-06-02 00:05:26
73.47 2015-06-02 00:06:50
34.24 2015-06-02 00:07:36
27.24 2015-06-03 00:01:40
82.2 2015-06-03 00:12:21
23.48 2015-06-03 00:12:35
My goal is to print sum(order_total) for each day. For example, the result should be:
2015-06-01 -> 261.16
2015-06-02 -> 415.76
2015-06-03 -> 132.92
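I wrote the code below. It does not implement the full logic yet; I am only printing some sample statements to check whether it parses and loops the way I need.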
    def sum_orders_test(self,start_date,end_date):
        initial_date = datetime.date(int(start_date.split('-')[0]),int(start_date.split('-')[1]),int(start_date.split('-')[2]))
        final_date = datetime.date(int(end_date.split('-')[0]),int(end_date.split('-')[1]),int(end_date.split('-')[2]))
        day = datetime.timedelta(days=1)
        with open("file1.csv", 'r') as data_file:
            next(data_file)
            reader = csv.reader(data_file, delimiter=',')
            order_total=0
            if initial_date <= final_date:
                for row in reader:
                    if str(initial_date) in row[1]:
                        print 'initial_date : ' + str(initial_date)
                        print 'Date : ' + row[1]
                        order_total = order_total + row[0]
                    else:
                        print 'Else'
                        print 'Date ' + str(row[1]) + 'Total ' +str(order_total)
                        order_total=0
                        initial_date = initial_date + day
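With my current logic, I run into this problem -
Calling it with sum_orders_test('2015-06-01','2015-06-03');
I know there is some silly logic error, but being new to programming and Python I cannot figure it out.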
Answer 0 (score: 0)
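A short solution using the pandas library:
    import pandas as pd
    # Columns are assumed to be separated by two or more whitespace characters,
    # so the single space inside the timestamp does not split it.
    df = pd.read_table('yourfile.csv', sep=r'\s{2,}', engine='python')
    # Group by the yyyy-mm-dd prefix and sum only the numeric column.
    sums = df.groupby(df.created_datetime.str[:10])[['order_total']].sum()
    print(sums)
Output:
                      order_total
    created_datetime
    2015-06-01             261.16
    2015-06-02             415.76
    2015-06-03             132.92
df.created_datetime.str[:10] - uses only the date part (yyyy-mm-dd, the first 10 characters) of the created_datetime column as the grouping value
.sum() - sums the values within each group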
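To make the grouping step concrete, here is a rough plain-Python equivalent of "group by the date prefix, then sum", written with itertools.groupby instead of pandas. It is only a sketch: the rows list is a shortened copy of the question's sample, and date_key and daily_totals are hypothetical helper names chosen for illustration.

```python
from itertools import groupby

# Shortened sample mirroring the question's data: (order_total, created_datetime).
rows = [
    (24.99, "2015-06-01 00:00:12"),
    (164.45, "2015-06-01 00:04:05"),
    (47.73, "2015-06-02 00:00:25"),
    (101.74, "2015-06-02 00:04:11"),
    (27.24, "2015-06-03 00:01:40"),
]


def date_key(row):
    # The first 10 characters of the timestamp are the yyyy-mm-dd date.
    return row[1][:10]


def daily_totals(rows):
    # itertools.groupby only merges adjacent items, so sort by the key first.
    ordered = sorted(rows, key=date_key)
    return {
        day: round(sum(total for total, _ in group), 2)
        for day, group in groupby(ordered, key=date_key)
    }


print(daily_totals(rows))
```

The rounding to two decimals is there only to hide binary floating-point noise in the printed totals; pandas does something similar when it displays a DataFrame.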
Answer 1 (score: 0)
Using a dictionary:
data = [
(24.99 ,'2015-06-01 00:00:12'),
(0 ,'2015-06-01 00:03:15'),
(164.45 ,'2015-06-01 00:04:05'),
(24.99 ,'2015-06-01 00:08:01'),
(0 ,'2015-06-01 00:08:23'),
(46.73 ,'2015-06-01 00:08:51'),
(0 ,'2015-06-01 00:08:58'),
(47.73 ,'2015-06-02 00:00:25'),
(101.74 ,'2015-06-02 00:04:11'),
(119.99 ,'2015-06-02 00:04:35'),
(38.59 ,'2015-06-02 00:05:26'),
(73.47 ,'2015-06-02 00:06:50'),
(34.24 ,'2015-06-02 00:07:36'),
(27.24 ,'2015-06-03 00:01:40'),
(82.2 ,'2015-06-03 00:12:21'),
(23.48 ,'2015-06-03 00:12:35')
]
def sumByDay(data):
    sums = {}
    # loop through each entry and add the order value to its corresponding day entry in the dictionary
    for x in data:
        day = x[1].split()[0]  # get the date portion from the string
        order = x[0]
        sums[day] = sums.get(day, 0) + order
    return sums

sums = sumByDay(data)
for key in sums:
    print(key, sums[key])
Output:
2015-06-01 261.16
2015-06-02 415.76
2015-06-03 132.92
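The same dictionary idea can be applied directly to the lines of the file and restricted to a date range, which is closer to the sum_orders_test(start_date, end_date) call in the question. The sketch below makes several assumptions: the columns are separated by whitespace (as in the sample shown, not by commas), the first line is a header, and Python 3.7+ is available for date.fromisoformat. The name sum_orders and the shortened sample are illustrative only. Decimal is used so that currency amounts add up exactly.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def sum_orders(lines, start_date, end_date):
    """Return {date: total} for rows dated within [start_date, end_date]."""
    first = date.fromisoformat(start_date)
    last = date.fromisoformat(end_date)
    totals = defaultdict(Decimal)  # a missing day starts at Decimal('0')
    rows = iter(lines)
    next(rows, None)  # skip the header line
    for line in rows:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        amount, day_text, _time = parts
        day = date.fromisoformat(day_text)
        if first <= day <= last:
            totals[day] += Decimal(amount)
    return dict(totals)


# Shortened sample in the same layout as the question's file.
sample = """order_total created_datetime
24.99 2015-06-01 00:00:12
0 2015-06-01 00:03:15
164.45 2015-06-01 00:04:05
47.73 2015-06-02 00:00:25
101.74 2015-06-02 00:04:11
27.24 2015-06-03 00:01:40
""".splitlines()

for day, total in sorted(sum_orders(sample, "2015-06-01", "2015-06-02").items()):
    print(day, "->", total)
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list, for example inside a with open("file1.csv") as data_file: block.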