I have the following data in a text file named data.txt:
03/05/2016 16:43 502
03/05/2016 16:43 502
03/05/2016 16:44 501
03/05/2016 16:44 504
03/05/2016 16:44 505
03/05/2016 16:44 506
04/05/2016 16:44 501
04/05/2016 16:45 501
04/05/2016 16:45 501
04/05/2016 16:45 52
04/05/2016 17:08 50
05/05/2016 17:08 502
05/05/2016 17:08 503
05/05/2016 17:08 504
05/05/2016 17:09 506
06/05/2016 17:09 507
06/05/2016 17:09 507
07/05/2016 17:09 508
07/05/2016 17:09 50
08/05/2016 17:10 5
08/05/2016 17:10 504
09/05/2016 17:10 504
09/05/2016 17:10 503
09/05/2016 17:10 503
10/05/2016 17:11 505
10/05/2016 17:11 505
I want to sum the values for each date so that I get this final result:
03/05/2016 3020
04/05/2016 1605
05/05/2016 2015
06/05/2016 5023
07/05/2016 1014
08/05/2016 558
09/05/2016 5023
10/05/2016 5022
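The second column is the sum of the values for each date.
The result should be stored in another text file, for example data1.txt.
I want to write this code in Python 2.7.
How can I achieve this?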
Answer 0 (score: 2)
You can use Counter to sum the values for a given date:
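from collections import Counter
# Build one single-entry Counter per line and add them all up, keyed by date.
with open('data.txt') as f:
    res = sum((Counter({d: int(c)}) for d, t, c in (line.split() for line in f)), Counter())
# Write one "date<TAB>total" line per date, sorted by the date string.
with open('data1.txt', 'wb') as f:
    f.writelines('{0}\t{1}\n'.format(*x) for x in sorted(res.items()))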
Output:
03/05/2016 3020
04/05/2016 1605
05/05/2016 2015
06/05/2016 1014
07/05/2016 558
08/05/2016 509
09/05/2016 1510
10/05/2016 1010
This solution requires no libraries beyond a standard Python installation.
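One caveat with sorting the results: the dates are sorted as plain strings, which happens to work for the sample because every row falls in the same month. A minimal sketch of sorting by the actual date instead, assuming the dates are in dd/mm/yyyy form (the question does not say which). The helper names `sum_by_date` and `chronological` and the sample rows are made up for illustration:

```python
from collections import Counter
from datetime import datetime

def sum_by_date(lines):
    """Sum the third field per date (first field) of whitespace-separated lines."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        date, _time, value = fields
        totals[date] += int(value)
    return totals

def chronological(totals):
    """Return (date, total) pairs ordered by calendar date (assumes dd/mm/yyyy)."""
    return sorted(totals.items(),
                  key=lambda kv: datetime.strptime(kv[0], '%d/%m/%Y'))

# Hypothetical rows that span two months.
sample = ['01/06/2016 09:00 10', '30/05/2016 09:00 1', '30/05/2016 09:05 2']
for date, total in chronological(sum_by_date(sample)):
    print('{0}\t{1}'.format(date, total))
```

A plain `sorted(totals.items())` would put 01/06/2016 before 30/05/2016 here, because it compares the strings character by character.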
Answer 1 (score: 1)
A pure Python solution:
import collections
# defaultdict(int) starts every new date at 0, so += works on first sight.
data=collections.defaultdict(int)
with open('data.txt', 'r') as f:
    for line in f:
        row=line.split()
        # row[0] is the date, row[2] is the value
        data[row[0]]+=int(row[2])
with open('data1.txt', 'w') as f:
    for key, value in sorted(data.items()):
        f.write(str(key)+" "+str(value)+"\n")
Output:
$ python a.py
$ cat data1.txt
03/05/2016 3020
04/05/2016 1605
05/05/2016 2015
06/05/2016 1014
07/05/2016 558
08/05/2016 509
09/05/2016 1510
10/05/2016 1010
$
Answer 2 (score: 1)
You can use the following:
from collections import OrderedDict
f = open('data.txt')
# OrderedDict keeps the dates in the order they first appear in the file.
res = OrderedDict()
for line in f:
    # split() with no argument handles any run of whitespace and drops the newline
    values = line.split()
    if len(values) == 3:
        date = values[0]
        val = values[2]
        if date in res:
            res[date] += int(val)
        else:
            res[date] = int(val)
f.close()
f = open('data1.txt', 'w')
for date in res.keys():
    f.write('{} {}\n'.format(date, res[date]))
f.close()
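Whether a line yields three or four fields depends on how it is split, so the length check and the index used in a loop like the one above have to match the split call. A small sketch, using a made-up line with two spaces between the date and the time (an assumption; the file may contain only single spaces):

```python
# Hypothetical line with two spaces after the date.
line = '03/05/2016  16:43 502\n'

# No argument: split on any run of whitespace, no empty fields, newline dropped.
print(line.split())     # ['03/05/2016', '16:43', '502']

# Explicit single space: every space is a separator, so an empty field appears
# and the trailing newline stays attached to the last field.
print(line.split(' '))  # ['03/05/2016', '', '16:43', '502\n']
```

With `split()` the value is always the third field regardless of spacing, which makes it the safer choice when the exact spacing of the file is not known.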
Answer 3 (score: 0)
import pandas as pd
from StringIO import StringIO
text = """03/05/2016 16:43 502
03/05/2016 16:43 502
03/05/2016 16:44 501
03/05/2016 16:44 504
03/05/2016 16:44 505
03/05/2016 16:44 506
04/05/2016 16:44 501
04/05/2016 16:45 501
04/05/2016 16:45 501
04/05/2016 16:45 52
04/05/2016 17:08 50
05/05/2016 17:08 502
05/05/2016 17:08 503
05/05/2016 17:08 504
05/05/2016 17:09 506
06/05/2016 17:09 507
06/05/2016 17:09 507
07/05/2016 17:09 508
07/05/2016 17:09 50
08/05/2016 17:10 5
08/05/2016 17:10 504
09/05/2016 17:10 504
09/05/2016 17:10 503
09/05/2016 17:10 503
10/05/2016 17:11 505
10/05/2016 17:11 505"""
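# Note: parse_dates reads 03/05/2016 month-first (2016-03-05), as the frame below shows;
# the per-date grouping is unaffected.
df = pd.read_csv(StringIO(text), delim_whitespace=True,
                 parse_dates=[0], names=['date', 'time', 'value'])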
date time value
0 2016-03-05 16:43 502
1 2016-03-05 16:43 502
2 2016-03-05 16:44 501
3 2016-03-05 16:44 504
4 2016-03-05 16:44 505
5 2016-03-05 16:44 506
6 2016-04-05 16:44 501
7 2016-04-05 16:45 501
8 2016-04-05 16:45 501
9 2016-04-05 16:45 52
10 2016-04-05 17:08 50
11 2016-05-05 17:08 502
12 2016-05-05 17:08 503
13 2016-05-05 17:08 504
14 2016-05-05 17:09 506
15 2016-06-05 17:09 507
16 2016-06-05 17:09 507
17 2016-07-05 17:09 508
18 2016-07-05 17:09 50
19 2016-08-05 17:10 5
20 2016-08-05 17:10 504
21 2016-09-05 17:10 504
22 2016-09-05 17:10 503
23 2016-09-05 17:10 503
24 2016-10-05 17:11 505
25 2016-10-05 17:11 505
df.groupby('date').sum()
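The pandas approach stops at the grouped sum and does not write data1.txt. A minimal sketch of the remaining step, under a few assumptions: it targets Python 3 and a recent pandas (on Python 2.7 the import would be `from StringIO import StringIO`), it uses `sep=r'\s+'` in place of `delim_whitespace`, it keeps the date column as a plain string to sidestep the day/month ambiguity, and it uses a four-row excerpt of the data rather than the real file:

```python
import pandas as pd
from io import StringIO

# Four rows taken from the sample data; replace StringIO(text) with 'data.txt'
# to read the real file.
text = """03/05/2016 16:43 502
03/05/2016 16:43 502
06/05/2016 17:09 507
06/05/2016 17:09 507"""

df = pd.read_csv(StringIO(text), sep=r'\s+', names=['date', 'time', 'value'])

# Sum only the numeric column; sort=False keeps the dates in file order.
totals = df.groupby('date', sort=False)['value'].sum()

# One "date total" line per date, no header row.
totals.to_csv('data1.txt', sep=' ', header=False)
```

Selecting `['value']` before summing avoids having the `time` strings take part in the aggregation.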