Python list: split, sort by date, then join

Asked: 2015-04-08 08:59:05

Tags: python list sorting lambda

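OK, I have been at this for hours, so I admit defeat and beg for your mercy.

Goal: I have multiple files (bank statement downloads) that I want to merge, sort, and de-duplicate.

The downloads are in the following format: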

"08/04/2015","Balance","5,804.30","Current Balance for account 123S14"
"08/04/2015","Balance","5,804.30","Available Balance for account 123S14"
"02/03/2015","241.25","Transaction description","2,620.09"
"02/03/2015","-155.49","Transaction description","2,464.60"
"03/03/2015","82.00","Transaction description","2,546.60"
"03/03/2015","243.25","Transaction description","2,789.85"
"03/03/2015","-334.81","Transaction description","2,339.12"
"04/03/2015","-25.05","Transaction description","2,314.07"
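Apart from being completely ignorant of what I am doing, one of my main problems is that the numeric values contain commas. I have managed to write code that removes these "buried" commas, and then I strip the quotes so that I have a CSV ... line.

So I now have the data in this format: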

['02/03/2015', ' \t ', '241.25\t ', ' \t ', 'Transaction Details\n', '02/03/2015', ' \t ', ' \t ', '-155.49\t ', 'Transaction Details\n', '03/03/2015', ' \t ', '82.00\t ', ' \t ', 'Transaction Details\n', '03/03/2015', ' \t ', '243.25\t ', ' \t ', 'Transaction Details\n', '02/03/2015', ' \t ', '241.25\t ', ' \t ', 'Transaction Details\n']

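I think it is almost ready to be sorted on an element, but I believe it is now one long flat list rather than a list of lists.

I looked into sorting and found the lambda ... function, so I started with: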

new_file_data = sorted(new_file_data, key=lambda item: item[0])

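But element [0] was just the " at the beginning of the line (BOL).

I also noticed that I need to indicate that the date is not in a sortable format, which led me to this construct: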

sorted(new_file_data, key=lambda d: datetime.strptime(d, '%d/%m/%Y'))

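I loosely understand the 'map' construct, but not how to combine the two, how to refer only to element [0], or how to refer to it as a date.

So here I am, hoping someone can push me past this obstacle. I think I need to split the list better to begin with, so that each line is one element. At one point I did get a sorted result, but all the fields had been lumped together: the values (sorted), then the dates, then the words, and so on.

So I would appreciate some advice on my failed list manipulation and on how to construct the sort lambda.

Thanks to those who have the time and know-how to respond to such a beginner's query.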

2 answers:

Answer 0 (score: 2)

If I understand correctly, you want to read the contents of a CSV file and sort them by date.

Given the contents of data.csv:
"08/04/2015","Balance","5,804.30","Current Balance for account 123S14"
"08/04/2015","Balance","5,804.30","Available Balance for account 123S14"
"02/03/2015","241.25","Transaction description","2,620.09"
"02/03/2015","-155.49","Transaction description","2,464.60"
"03/03/2015","82.00","Transaction description","2,546.60"
"03/03/2015","243.25","Transaction description","2,789.85"
"03/03/2015","-334.81","Transaction description","2,339.12"
"04/03/2015","-25.05","Transaction description","2,314.07"

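I would use the csv module to read the data. It handles the quoted fields for you, so the commas inside values such as "5,804.30" do not split the field.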

import csv
with open('data.csv') as f:
    data = [row for row in csv.reader(f)]

which gives:

>>> data
[['08/04/2015', 'Balance', '5,804.30', 'Current Balance for account 123S14'],
 ['08/04/2015', 'Balance', '5,804.30', 'Available Balance for account 123S14'],
 ['02/03/2015', '241.25', 'Transaction description', '2,620.09'],
 ['02/03/2015', '-155.49', 'Transaction description', '2,464.60'],
 ['03/03/2015', '82.00', 'Transaction description', '2,546.60'],
 ['03/03/2015', '243.25', 'Transaction description', '2,789.85'],
 ['03/03/2015', '-334.81', 'Transaction description', '2,339.12'],
 ['04/03/2015', '-25.05', 'Transaction description', '2,314.07']]

Then you can use the datetime module to provide the sort key.

import datetime 
sorted_data = sorted(data, key=lambda row: datetime.datetime.strptime(row[0], "%d/%m/%Y"))

which gives:

>>> sorted_data
[['02/03/2015', '241.25', 'Transaction description', '2,620.09'],
 ['02/03/2015', '-155.49', 'Transaction description', '2,464.60'],
 ['03/03/2015', '82.00', 'Transaction description', '2,546.60'],
 ['03/03/2015', '243.25', 'Transaction description', '2,789.85'],
 ['03/03/2015', '-334.81', 'Transaction description', '2,339.12'],
 ['04/03/2015', '-25.05', 'Transaction description', '2,314.07'],
 ['08/04/2015', 'Balance', '5,804.30', 'Current Balance for account 123S14'],
 ['08/04/2015', 'Balance', '5,804.30', 'Available Balance for account 123S14']]
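The question also asks about merging several statements and removing duplicates. The following is only a sketch of one way to do that, not part of the original answer: the helper names load_rows and merge_sort_dedupe are made up for illustration, and io.StringIO objects stand in for real files. It assumes that two rows are duplicates only when every field matches exactly.

```python
import csv
import datetime
import io


def load_rows(files):
    """Read every CSV row from each open file object into one list."""
    rows = []
    for f in files:
        rows.extend(csv.reader(f))
    return rows


def merge_sort_dedupe(rows):
    """Drop exact duplicate rows, then sort the rest by the date in column 0."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are unhashable, tuples are not
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return sorted(
        unique,
        key=lambda row: datetime.datetime.strptime(row[0], "%d/%m/%Y"),
    )


# Two in-memory "statements" stand in for files opened with open(name, newline="")
statement_a = io.StringIO(
    '"03/03/2015","82.00","Transaction description","2,546.60"\n'
    '"02/03/2015","241.25","Transaction description","2,620.09"\n'
)
statement_b = io.StringIO(
    '"02/03/2015","241.25","Transaction description","2,620.09"\n'
    '"04/03/2015","-25.05","Transaction description","2,314.07"\n'
)

merged = merge_sort_dedupe(load_rows([statement_a, statement_b]))
for row in merged:
    print(row)
```

Because the running balance is part of each row, two genuinely separate transactions with the same date and amount would normally still differ and survive the de-duplication, but that depends on how the bank formats its export.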

Answer 1 (score: 1)

You can define your own sort function.

Combining these two questions gets you what you want (or something close to it):

Custom Python list sorting

Python date string to date object

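In the sort function, convert the dates from strings to date objects and compare them:

import datetime

def cmp_items(a, b):
    # Parse the date in column 0 of each row so that rows compare chronologically
    datetime_a = datetime.datetime.strptime(a[0], "%d/%m/%Y").date()
    datetime_b = datetime.datetime.strptime(b[0], "%d/%m/%Y").date()
    if datetime_a > datetime_b:
        return 1
    elif datetime_a == datetime_b:
        return 0
    else:
        return -1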

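Then you just need to sort the list with it. Note that list.sort() sorts in place and returns None, so do not assign its result, and that in Python 3 a comparison function has to be wrapped with functools.cmp_to_key:

import functools
new_file_data.sort(key=functools.cmp_to_key(cmp_items))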

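After that you may still have one issue: elements with the same date are not ordered among themselves. Python's sort is stable, so they simply keep the relative order they had in the input, which may not be the order you want. You can extend the comparison function to compare more fields to control this.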
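Comparing more than the date can also be done with a key function that returns a tuple, since tuples compare element by element. This is a minimal sketch rather than the answerer's code: sort_key is a hypothetical name, and it assumes every row is a transaction row whose second field is a numeric amount (the "Balance" summary rows from the download would need to be filtered out first).

```python
import datetime
from decimal import Decimal

rows = [
    ['03/03/2015', '243.25', 'Transaction description', '2,789.85'],
    ['02/03/2015', '241.25', 'Transaction description', '2,620.09'],
    ['03/03/2015', '-334.81', 'Transaction description', '2,339.12'],
    ['03/03/2015', '82.00', 'Transaction description', '2,546.60'],
]


def sort_key(row):
    """Order by date first, then by amount for rows that share a date."""
    date = datetime.datetime.strptime(row[0], "%d/%m/%Y").date()
    amount = Decimal(row[1].replace(",", ""))  # drop thousands separators
    return (date, amount)


ordered = sorted(rows, key=sort_key)
for row in ordered:
    print(row)
```

Whether the amount is the right tie-breaker is a judgment call; for a bank statement, keeping the original file order within a day may be more meaningful, and a stable sort on the date alone already does that.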

By the way, you have not actually removed the embedded commas; it looks like you have dropped the last field (the balance) entirely.
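If the goal is to keep the last field and still get usable numbers, one option is to let the csv module do the parsing and strip the thousands separators only when converting to a number. The sketch below assumes the amounts always use "," for thousands and "." for decimals, as in the sample data; to_decimal is a hypothetical helper name.

```python
import csv
import io
from decimal import Decimal

raw = '"02/03/2015","241.25","Transaction description","2,620.09"\n'

# csv.reader treats a comma inside double quotes as part of the field
row = next(csv.reader(io.StringIO(raw)))


def to_decimal(text):
    """Convert a string such as '2,620.09' to Decimal by dropping the commas."""
    return Decimal(text.replace(",", ""))


amount = to_decimal(row[1])
balance = to_decimal(row[3])
print(row)
print(amount, balance)
```

Decimal is used here instead of float so that monetary values are not subject to binary rounding.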