I need to do this in Python.
Sample data (tab-delimited):
UID DATE ACTIONS
abc123 12/25/2016 break, pullover
abc123 12/25/2016 stop
abc123 10/15/2015 break, pullover, turn
def456 6/14/2015 turn, wash, skid
def456 11/24/2016 stop, wash, pullover, break
ghi789 2/12/2015 pullover, stop
Code (revised per @moogle's comment)
from collections import defaultdict
date = ['12/25/16','12/25/16','10/15/2015','6/14/2015','11/24/2016','2/12/2015']
uid = ['abc123','abc123', 'abc123','def456', 'def456', 'ghi789']
action = [['break', 'pullover'],['stop'],['break','pullover','turn'],['turn','wash','skid'],['stop','wash','pullover','break'],['pullover','stop']]
d = defaultdict(list)
# Loop variables are named differently from the lists so the lists are not overwritten.
for u, dt, acts in zip(uid, date, action):
    d[u].append((dt, acts))
print(dict(d))
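Desired output
The desired output is a nested dictionary of lists, where the parent key is the ID and the parent value is a nested dictionary whose key is the date and whose value is a list (of actions).
Current actual output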
{'ghi789': [('2/12/2015', ['pullover', 'stop'])], 'def456': [('6/14/2015', ['turn', 'wash', 'skid']), ('11/24/2016', ['stop', 'wash', 'pullover', 'break'])], 'abc123': [('12/25/16', ['break', 'pullover']), ('12/25/16', ['stop']), ('10/15/2015', ['break', 'pullover', 'turn'])]}
**desired output**
{'abc123':[{'12/25/2016':[['break', 'pullover'],['stop']]}, {'10/15/2015':[['break','pullover','turn']]}],'def456':[{'6/14/2015':[['turn','wash','skid'],['stop','wash','pullover','break']},'ghi789':{'2/12/2915':[['pullover','stop']]}]}
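I tried to get the desired output with the code above, which I adapted from HERE after looking at HERE. However, I keep getting errors. I think it has to do with the fact that I am trying to nest inside a list of lists, and I am not sure which direction to take to fix it.
Answer 0: (score: 3)
I think an object-based approach would be much better for this data.
You could do something like this: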
class Event:
    def __init__(self, ID, date, actions):
        self.ID=ID
        self.date=date
        self.actions=actions
    def __repr__(self):
        return 'ID: {} date: {} actions: {}'.format(self.ID, self.date, self.actions)
Then create a list of objects like so:
>>> objs=[Event(id_, d, actions) for id_, d, actions in zip(uid, date, action)]
>>> objs
[ID: abc123 date: 12/25/16 actions: ['break', 'pullover'], ID: abc123 date: 12/25/16 actions: ['stop'], ID: abc123 date: 10/15/2015 actions: ['break', 'pullover', 'turn'], ID: def456 date: 6/14/2015 actions: ['turn', 'wash', 'skid'], ID: def456 date: 11/24/2016 actions: ['stop', 'wash', 'pullover', 'break'], ID: ghi789 date: 2/12/2015 actions: ['pullover', 'stop']]
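The list of actions/events can then be sorted, analyzed, and saved as needed.
Sort by date (the dates are strings, so this is a lexicographic sort rather than a chronological one):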
>>> sorted(objs, key=lambda o: o.date)
[ID: abc123 date: 10/15/2015 actions: ['break', 'pullover', 'turn'], ID: def456 date: 11/24/2016 actions: ['stop', 'wash', 'pullover', 'break'], ID: abc123 date: 12/25/16 actions: ['break', 'pullover'], ID: abc123 date: 12/25/16 actions: ['stop'], ID: ghi789 date: 2/12/2015 actions: ['pullover', 'stop'], ID: def456 date: 6/14/2015 actions: ['turn', 'wash', 'skid']]
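Because the dates above are plain strings, the sort puts '10/15/2015' before '2/12/2015'. If chronological order is what you want, one option is to parse the strings first. The following is a minimal sketch, assuming the dates are always month/day/year with either a two-digit or four-digit year; `parse_date` is a hypothetical helper, not something from the original answer.

```python
from datetime import datetime

def parse_date(text):
    """Parse 'M/D/YYYY' or 'M/D/YY' into a datetime (hypothetical helper)."""
    # Try the four-digit year first; fall back to the two-digit form.
    for fmt in ('%m/%d/%Y', '%m/%d/%y'):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError('unrecognized date: {!r}'.format(text))

dates = ['12/25/16', '12/25/16', '10/15/2015', '6/14/2015', '11/24/2016', '2/12/2015']

# Sort the original strings, using the parsed datetime as the sort key.
chronological = sorted(dates, key=parse_date)
print(chronological)
```

With the objects from this answer, the same key would be used as `sorted(objs, key=lambda o: parse_date(o.date))`.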
Filter by action:
>>> [o for o in objs if 'stop' in o.actions]
[ID: abc123 date: 12/25/16 actions: ['stop'], ID: def456 date: 11/24/2016 actions: ['stop', 'wash', 'pullover', 'break'], ID: ghi789 date: 2/12/2015 actions: ['pullover', 'stop']]
Building a dict similar to the one you want (although that example is not a legal Python dict...) is then fairly straightforward:
di={o.ID:[] for o in objs}
for user in di:
    di[user].append({o.date:o.actions for o in objs if o.ID==user})
>>> di
{'ghi789': [{'2/12/2015': ['pullover', 'stop']}], 'def456': [{'6/14/2015': ['turn', 'wash', 'skid'], '11/24/2016': ['stop', 'wash', 'pullover', 'break']}], 'abc123': [{'10/15/2015': ['break', 'pullover', 'turn'], '12/25/16': ['stop']}]}
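Note that in the output above, 'abc123' keeps only ['stop'] for '12/25/16': a dict comprehension overwrites a repeated key, so the earlier ['break', 'pullover'] entry is lost. A minimal sketch of one way to keep every action list per date follows. It uses a `namedtuple` as a stand-in for the `Event` class so that it runs on its own, and `grouped` is an illustrative name.

```python
from collections import namedtuple

# Stand-in for the Event class, so this sketch is self-contained.
Event = namedtuple('Event', ['ID', 'date', 'actions'])

objs = [
    Event('abc123', '12/25/16', ['break', 'pullover']),
    Event('abc123', '12/25/16', ['stop']),
    Event('abc123', '10/15/2015', ['break', 'pullover', 'turn']),
]

grouped = {}
for o in objs:
    # setdefault creates the inner dict or list the first time a key is seen,
    # so repeated (ID, date) pairs accumulate instead of being overwritten.
    grouped.setdefault(o.ID, {}).setdefault(o.date, []).append(o.actions)

print(grouped)
```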
Answer 1: (score: 1)
If I add the missing commas to the action list, your code works for me...
I get this output:
{'ghi789': [('2/12/2015', ['pullover', 'stop'])], 'def456': [('6/14/2015', ['turn', 'wash', 'skid']), ('11/24/2016', ['stop', 'wash', 'pullover', 'break'])], 'abc123': [('12/25/16', ['break', 'pullover']), ('12/25/16', ['stop']), ('10/15/2015', ['break', 'pullover', 'turn'])]}
How about the following solution:
from collections import defaultdict
date = ['12/25/16','12/25/16','10/15/2015','6/14/2015','11/24/2016','2/12/2015']
uid = ['abc123','abc123', 'abc123','def456', 'def456', 'ghi789']
action = [['break', 'pullover'],['stop'],['break','pullover','turn'],['turn','wash','skid'],['stop','wash','pullover','break'],['pullover','stop']]
d = defaultdict(dict)
# Loop variables are named differently from the lists so the lists are not overwritten.
for u, dt, acts in zip(uid, date, action):
    # Create the list for this date on first use, then add the action list to it.
    d[u].setdefault(dt, []).append(acts)
print(dict(d))
Output:
{'ghi789': {'2/12/2015': [['pullover', 'stop']]}, 'def456': {'6/14/2015': [['turn', 'wash', 'skid']], '11/24/2016': [['stop', 'wash', 'pullover', 'break']]}, 'abc123': {'10/15/2015': [['break', 'pullover', 'turn']], '12/25/16': [['break', 'pullover'], ['stop']]}}
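Since the question starts from tab-delimited data rather than ready-made lists, the same grouping can be applied while reading the rows. This is a rough sketch under a few assumptions: Python 3, a header row named UID, DATE and ACTIONS as in the sample, and actions separated by commas. The `raw` string stands in for the real file, and `nested` and `result` are illustrative names.

```python
import csv
import io
from collections import defaultdict

# A few sample rows from the question, written with explicit tab characters.
raw = (
    "UID\tDATE\tACTIONS\n"
    "abc123\t12/25/2016\tbreak, pullover\n"
    "abc123\t12/25/2016\tstop\n"
    "def456\t6/14/2015\tturn, wash, skid\n"
)

# Outer key: UID. Inner key: date. Inner value: list of action lists.
nested = defaultdict(lambda: defaultdict(list))

reader = csv.DictReader(io.StringIO(raw), delimiter='\t')
for row in reader:
    # Split the comma-separated actions and trim surrounding spaces.
    actions = [a.strip() for a in row['ACTIONS'].split(',')]
    nested[row['UID']][row['DATE']].append(actions)

# Convert to plain dicts for display.
result = {uid: dict(by_date) for uid, by_date in nested.items()}
print(result)
```

To read from an actual file, replace `io.StringIO(raw)` with an open file object.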