Sorting and grouping a nested list made up of class objects

Date: 2016-08-08 06:49:46

Tags: python list sorting object grouping

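I have hundreds of text files that need to be parsed by username and date. I am trying to put the useful data from those files into a list like this: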

    [
      ['1234245@gmail.com', '34209809', '1434546354', '2016-07-18 00:20:58'],
      ['abcd@gmail.com', '234534345', '09402380', '2016-07-18 00:20:03'],
      ['username@gmail.com', '345315531','1098098098', '2016-07-18 02:40:00'], 
      ['abcd@gmail.com', '345431353', '231200023', '2016-07-18 15:45:49'], 
      ['1234245@gmail.com', '23232424', '234809809', '2016-07-18 20:45:40']
    ]

However, I want to sort the entries by date-time and group them by username, so that the output looks like this:

    [
     ['1234245@gmail.com', '23232424', '234809809', '2016-07-18 20:45:40'],
     ['1234245@gmail.com', '34209809', '1434546354', '2016-07-18 00:20:58'],
     ['abcd@gmail.com', '345431353', '231200023', '2016-07-18 15:45:49'],
     ['abcd@gmail.com', '234534345', '09402380', '2016-07-18 00:20:03'],
     ['username@gmail.com', '345315531','1098098098', '2016-07-18 02:40:00']
    ]

Here is my code:

    import glob
    from operator import itemgetter
    from itertools import groupby
    def read_large_file(filename):
        matrix=[]
        global username
        username=[]
        for myfile in glob.glob(filename):
            infile = open(myfile, "r")
            for row in infile:
                row=row.strip()
                array=row.split(';') 
                username.append(array[9])
                matrix.append(cdr(array[9],array[17],array[18],array[8]))

        return matrix


    class cdr(object):               
        def __init__(self,username,total_seconds_since_start,download_bytes,date_time):
            self.username=username
            self.total_seconds_since_start=total_seconds_since_start
            self.download_bytes=download_bytes
            self.date_time=date_time


    def GroupByUsername(matrix):
        new_matrix=[]
        new_matrix=groupby(matrix, itemgetter(0))
        return new_matrix

    matrix=read_large_file(r'C:\Users\ceren\.spyder2/test/*')
    matrix_new=GroupByUsername(matrix)

I tried the solution from this question: Sorting and Grouping Nested Lists in Python, but I got these errors:

    'cdr' object does not support indexing
    'cdr' object is not iterable

1 Answer:

Answer 0 (score: 2)

You can use Python's built-in sort.

    sorted_list = sorted(data, key=lambda user_info: (user_info[0], user_info[3]))

The lambda passed as key tells Python how to order the list (ascending). For each entry in data, user_info is a list of 4 attributes, so user_info[0] is the email and user_info[3] is the date-time.
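
The list in the question actually holds `cdr` objects rather than plain lists, which is why index-based keys such as `itemgetter(0)` or `user_info[0]` raise "does not support indexing". A minimal sketch of the same idea using attribute access via `operator.attrgetter` follows. It assumes the date-time strings always use the `YYYY-MM-DD HH:MM:SS` format, so that comparing them as strings matches chronological order, and it puts each user's newest record first because that is what the expected output in the question appears to show. The names `records`, `by_date`, `by_user_then_date` and `grouped` are made up for illustration.

```python
from itertools import groupby
from operator import attrgetter


class cdr(object):
    def __init__(self, username, total_seconds_since_start, download_bytes, date_time):
        self.username = username
        self.total_seconds_since_start = total_seconds_since_start
        self.download_bytes = download_bytes
        self.date_time = date_time


# Sample records built from the data shown in the question.
records = [
    cdr('1234245@gmail.com', '34209809', '1434546354', '2016-07-18 00:20:58'),
    cdr('abcd@gmail.com', '234534345', '09402380', '2016-07-18 00:20:03'),
    cdr('username@gmail.com', '345315531', '1098098098', '2016-07-18 02:40:00'),
    cdr('abcd@gmail.com', '345431353', '231200023', '2016-07-18 15:45:49'),
    cdr('1234245@gmail.com', '23232424', '234809809', '2016-07-18 20:45:40'),
]

# sorted() is stable, so sort by the secondary key first (newest date first),
# then by the primary key (username, ascending). Ties on username keep the
# date order established by the first pass.
by_date = sorted(records, key=attrgetter('date_time'), reverse=True)
by_user_then_date = sorted(by_date, key=attrgetter('username'))

# groupby() only merges adjacent items, so it has to run on the sorted list.
grouped = {
    user: [record.date_time for record in group]
    for user, group in groupby(by_user_then_date, key=attrgetter('username'))
}
```

If ascending dates are acceptable, a single call such as `sorted(records, key=attrgetter('username', 'date_time'))` should be enough, since `attrgetter` with two names returns a tuple that works as a compound key.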