Merging a .csv file into one row per unique key (e.g., Person) in Python

Time: 2013-04-04 16:18:50

Tags: python csv

What I have

I have a .csv file that lists employees and their shifts on a given day, like this:

Initials,Last,First,ShiftStart,ShiftEnd
BAB,Smith,Bob,10:00a,1:00p
JCJ,Jones,Jill,11:00a,3:00p
JIH,Hernandez,Jose,1:00p,4:00p
BAB,Smith,Bob,1:00p,3:00p
JIH,Hernandez,Jose,5:00p,9:00p
JCJ,Jones,Jill,3:00p,3:30p
JCJ,Jones,Jill,3:30p,5:00p
DJM,Martin,Dominique,8:00a,11:00a

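Note that one person can work several shifts, that the start time of one shift may or may not equal the end time of another, and that each employee's initials act as a unique identifier (suitable for use as a key).

What I want

I want to consolidate this .csv file so that each employee has only one row. If a person has several shifts, check whether the end time of one shift equals the start time of another and, if so, combine those shifts; if not, add two new columns, 2ndShiftStart and 2ndShiftEnd, and put that data there.

The result should look like this: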

Initials,Last,First,ShiftStart,ShiftEnd,2ndShiftStart,2ndShiftEnd
BAB,Smith,Bob,10:00a,3:00p,,
JCJ,Jones,Jill,11:00a,5:00p,,
JIH,Hernandez,Jose,1:00p,4:00p,5:00p,9:00p
DJM,Martin,Dominique,8:00a,11:00a,,

For example, BAB works from 10:00 a.m. to 1:00 p.m. and then from 1:00 p.m. to 3:00 p.m., so the resulting .csv lists him as working from 10:00 a.m. to 3:00 p.m.

1 answer:

Answer 0 (score: 1)

#!/usr/bin/env python
import sys
##Initials,Last,First,ShiftStart,ShiftEnd
s='''BAB,Smith,Bob,10:00a,1:00p
JCJ,Jones,Jill,11:00a,3:00p
JIH,Hernandez,Jose,1:00p,4:00p
BAB,Smith,Bob,1:00p,3:00p
JIH,Hernandez,Jose,5:00p,9:00p
JCJ,Jones,Jill,3:00p,3:30p
JCJ,Jones,Jill,3:30p,5:00p
DJM,Martin,Dominique,8:00a,11:00a'''

db = {}
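# Group shifts by initials: db[Initials] = (Last, First, [(start, end), ...])
for line in s.split('\n'):
    Initials,Last,First,ShiftStart,ShiftEnd = line.split(',')
    if Initials in db:
        db[Initials][2].append((ShiftStart,ShiftEnd))
    else:
        db[Initials] = (Last,First,[(ShiftStart,ShiftEnd)])
# Write one line per person; items() works on both Python 2 and 3
for Initials,v in db.items():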
    Last,First,shifts = v
    sys.stdout.write(Initials + ',')
    sys.stdout.write(Last + ',' + First)
    for shift in shifts:
        ShiftStart,ShiftEnd = shift
        sys.stdout.write(',' + ShiftStart + ',' + ShiftEnd)
    sys.stdout.write('\n')
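
The snippet above reads from a hard-coded string, and on Python versions before 3.7 a plain dict does not guarantee that people come out in first-seen order. As a rough sketch only, the same grouping could be done with the standard csv module, which also handles the header row. The function name group_shifts is made up for illustration, and io.StringIO stands in for a real file here; neither comes from the original answer.

```python
import csv
import io
from collections import OrderedDict

def group_shifts(rows):
    """Group (start, end) pairs by Initials, keeping first-seen order."""
    db = OrderedDict()
    for row in rows:
        key = row['Initials']
        if key not in db:
            db[key] = (row['Last'], row['First'], [])
        db[key][2].append((row['ShiftStart'], row['ShiftEnd']))
    return db

# Assumption: io.StringIO replaces open('shifts.csv', newline='') for the demo.
sample = io.StringIO(
    "Initials,Last,First,ShiftStart,ShiftEnd\n"
    "BAB,Smith,Bob,10:00a,1:00p\n"
    "JIH,Hernandez,Jose,1:00p,4:00p\n"
    "BAB,Smith,Bob,1:00p,3:00p\n"
)
db = group_shifts(csv.DictReader(sample))

# Write one row per person, with all of that person's shifts appended.
out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
for initials, (last, first, shifts) in db.items():
    flat = [t for shift in shifts for t in shift]
    writer.writerow([initials, last, first] + flat)
print(out.getvalue(), end='')
```

Like the snippet above, this only groups the shifts per person; it does not merge shifts that run back to back.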

Or, you could write a very object-oriented program:

import sys
##Initials,Last,First,ShiftStart,ShiftEnd
s='''BAB,Smith,Bob,10:00a,1:00p
JCJ,Jones,Jill,11:00a,3:00p
JIH,Hernandez,Jose,1:00p,4:00p
BAB,Smith,Bob,1:00p,3:00p
JIH,Hernandez,Jose,5:00p,9:00p
JCJ,Jones,Jill,3:00p,3:30p
JCJ,Jones,Jill,3:30p,5:00p
DJM,Martin,Dominique,8:00a,11:00a'''

class Shift(object):
    def __init__(self,ShiftStart,ShiftEnd):
        self.ShiftStart,self.ShiftEnd = ShiftStart,ShiftEnd
    def __str__(self):
        # Use the instance attributes; the bare names would pick up module-level globals
        return '%s,%s' % (self.ShiftStart,self.ShiftEnd)

class Person(object):
    def __eq__(self, p):
        if self.Initials != p.Initials:
            return False
        if p.Last is not None and self.Last != p.Last:
            return False
        if p.First is not None and self.First != p.First:
            return False
        return True
    def __init__(self,Initials,Last,First):
        self.Initials,self.Last,self.First = Initials,Last,First
        self.Shifts = []
    def __str__(self):
        return '%s,%s,%s' % (self.Initials,self.Last,self.First)

def AddShift(people, person, shift):
    try:
        person = people[people.index(person)]
    except ValueError:
        people.append(person)
    person.Shifts.append(shift)

people = []
for line in s.split('\n'):
    Initials,Last,First,ShiftStart,ShiftEnd = line.split(',')
    AddShift(people, Person(Initials,Last,First), Shift(ShiftStart,ShiftEnd))

for person in people:
    print('%s,%s' % (person, ','.join(map(str,person.Shifts))))
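
Both programs above print every shift as it appears in the input; neither merges back-to-back shifts as the question asks. One possible way to add that step is sketched below. The helper names to_minutes, merge_shifts and to_row are hypothetical, and the sketch assumes times always look like 10:00a or 1:00p and that shifts never overlap.

```python
def to_minutes(t):
    """Convert a string such as '1:00p' to minutes since midnight."""
    hours, minutes = t[:-1].split(':')
    hours = int(hours) % 12          # treat 12:xx as 0:xx before adding the p.m. offset
    if t[-1] == 'p':
        hours += 12
    return hours * 60 + int(minutes)

def merge_shifts(shifts):
    """Sort shifts by start time and join any shift that begins
    exactly when the previous one ends."""
    merged = []
    for start, end in sorted(shifts, key=lambda s: to_minutes(s[0])):
        if merged and to_minutes(merged[-1][1]) == to_minutes(start):
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged

def to_row(initials, last, first, shifts, max_shifts=2):
    """Build one output line, padding with empty cells up to max_shifts shifts."""
    cells = [initials, last, first]
    merged = merge_shifts(shifts)
    for start, end in merged:
        cells += [start, end]
    cells += [''] * (2 * (max_shifts - len(merged)))
    return ','.join(cells)

print(to_row('BAB', 'Smith', 'Bob', [('10:00a', '1:00p'), ('1:00p', '3:00p')]))
print(to_row('JIH', 'Hernandez', 'Jose', [('1:00p', '4:00p'), ('5:00p', '9:00p')]))
```

Comparing times as minutes rather than as raw strings means that 01:00p and 1:00p would still be treated as the same moment. A person with more than two separate shifts would simply get extra columns, which the header in the question does not cover.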