Arranging data in a text file with Python: grouping data by month

Time: 2016-06-10 18:33:30

Tags: python python-2.7 logic

I have a text file named data.txt that contains the following information:

03/05/2016  502
04/05/2016  502
05/05/2016  501
07/05/2016  504
09/05/2016  505
13/05/2016  506
23/05/2016  501
30/05/2016  501
02/06/2016  502
04/06/2016  502
06/06/2016  501
07/06/2016  504
08/06/2016  505
13/06/2016  506
25/06/2016  499
31/06/2016  501
04/07/2016  501

I want the output to look like this. The data should be stored in another file named reslt.txt (revised):

03/05/2016 - 30/05/2016  4022
02/06/2016 - 31/06/2016  4020
01/07/2016 - 01/07/2016  501

The third column in reslt.txt is the sum of the values in the second column of data.txt. I am using Python 2.7 and I don't know how to achieve this. Please help me.

Update 2

03/05/2016  502
04/05/2016  502.2
05/05/2016  501.9
07/05/2016  504.6
09/05/2016  505
13/05/2016  506.1
23/05/2016  501.3
30/05/2016  501.4
02/06/2016  502
04/06/2016  502
06/06/2016  501
07/06/2016  504
08/06/2016  505
13/06/2016  506
25/06/2016  499
31/06/2016  501
04/07/2016  501 

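2 answers:

Answer 0 (score: 0)

It looks like the output requirement has changed a bit! Still, this should give you enough of a start to iron out the rough edges yourself.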

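from decimal import Decimal

# Running total per month, keyed by a 'mm/yyyy' string
dataStore = {}

# Add one value to the running total for its month
def processLine(dateStr, val):
  if dateStr not in dataStore:
    dataStore[dateStr] = val
  else:
    dataStore[dateStr] += val

# Read the input file line by line, then write one summary line per month
def doStuff(inFile, outFile):
  with open(inFile, 'r') as fp:
    for line in fp:
      parts = line.split()
      if len(parts) != 2:
        continue  # skip blank or malformed lines
      dateStr, val = parts

      # parse the value as Decimal: int() fails on values such as 502.2
      val = Decimal(val)

      # process the date string to only keep the month and year
      dateStr = dateStr.split('/')
      dateStr = "/".join((dateStr[1], dateStr[2]))

      processLine(dateStr, val)

  # once you are done reading the file, generate the output
  writeBuf = []
  for key in dataStore:
    writeBuf.append((key, dataStore[key]))
  # sort by [year, month] so the months come out in chronological order
  writeBuf.sort(key=lambda tup: tup[0].split('/')[::-1])

  # text mode ('w'), since the lines written are plain strings
  with open(outFile, 'w') as fp:
    for tup in writeBuf:
      # NOTE: the range is hard-coded as day 01 to day 30 of each month
      line = '01/'+tup[0]+' - 30/'+tup[0] + '  ' + str(tup[1]) + '\n'
      fp.write(line)

if __name__ == '__main__':
  inFile = 'data.txt'
  outFile = 'result.txt'

  doStuff(inFile, outFile)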

You can easily extend this to include the day as well. Just modify the part where I process dateStr. The processLine method will change too.
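
A minimal sketch of what that extension might look like, not the answerer's own code. It keeps the first and last date seen in each month next to the running total. The function name `summarize_by_month` is hypothetical, and the sketch assumes every date is zero-padded dd/mm/yyyy, as in the sample data.

```python
from decimal import Decimal

def summarize_by_month(lines):
    """Return one 'first - last  total' string per month, in chronological order."""
    months = {}  # (year, month) -> [first_date, last_date, running_total]
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        date_str, value = parts
        day, month, year = date_str.split('/')
        key = (year, month)
        if key not in months:
            months[key] = [date_str, date_str, Decimal(value)]
        else:
            entry = months[key]
            # zero-padded day strings compare correctly as plain strings
            if day < entry[0][:2]:
                entry[0] = date_str
            if day > entry[1][:2]:
                entry[1] = date_str
            entry[2] += Decimal(value)
    # sorting the (year, month) keys gives chronological order
    return ['%s - %s  %s' % tuple(months[key]) for key in sorted(months)]
```

Passing the lines of the first data.txt sample (for example `summarize_by_month(open('data.txt'))`) gives `03/05/2016 - 30/05/2016  4022` for May. The July row starts at 04/07/2016, because that is the only July date in the file.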

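Stack Overflow is not meant for getting other people to do your entire assignment. Show your current progress, and feel free to ask for help with errors and improvements. Please keep that in mind the next time you ask for help here.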

Answer 1 (score: 0)

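import re
from collections import defaultdict
from decimal import Decimal

# Matches "dd/mm/yyyy  value"; captures "/mm/yyyy" and the (possibly decimal) value
ROW_RE = re.compile(r"^\d{2}(/\d{2}/\d{4})\s+(\d+(?:\.\d+)?)\s*$")

def sum_months(data_path):
    sumdict = defaultdict(Decimal)
    with open(data_path, 'r') as f:
        for row in f:
            match = ROW_RE.match(row)
            if match is None:
                continue  # skip blank or malformed lines
            month, value = match.groups()
            # Decimal instead of eval(): safe, and keeps values like 502.2 exact
            sumdict[month] += Decimal(value)
    return sumdict

def pad_strings_and_create_rows(sumdict):
    rows = []
    # sort the "/mm/yyyy" keys by (year, month) for chronological order
    for k in sorted(sumdict, key=lambda m: (m[4:], m[1:3])):
        # NOTE: the range is hard-coded as day 01 to day 30 of each month
        rows.append('01' + k + ' - ' + '30' + k + '  ' + str(sumdict[k]))
    return rows

def write_result_to_file(results_lst):
    # 'w' rather than 'a', so re-running does not append duplicate rows
    with open('reslt.txt', 'w') as f:
        for row in results_lst:
            f.write(row + '\n')

write_result_to_file(pad_strings_and_create_rows(sum_months('data.txt')))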
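
Both answers hard-code the range as day 01 to day 30 of each month. If the real calendar boundaries are wanted instead, the standard library's `calendar.monthrange` returns the number of days in a month. The sketch below is only an illustration: the helper name `month_range` is hypothetical, and it assumes the key is a 'mm/yyyy' string, with or without the leading slash that the second answer uses.

```python
import calendar

def month_range(month_key):
    """Turn a 'mm/yyyy' key into 'first day - last day' of that calendar month."""
    key = month_key.lstrip('/')  # accept '/05/2016' as well as '05/2016'
    month, year = key.split('/')
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(int(year), int(month))[1]
    return '01/%s - %02d/%s' % (key, last_day, key)
```

For example, `month_range('05/2016')` returns `'01/05/2016 - 31/05/2016'`, and a leap-year February such as `month_range('/02/2016')` ends on the 29th.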