I have a pandas DataFrame named past_trend that looks like this
     created  moans  thanks
0 2016-12-16      0       0
1 2016-12-17      0       0
2 2016-12-18      0       0
3 2016-12-19      0       2
4 2016-12-20      6       0
5 2016-12-21      0       0
6 2016-12-22      0       2
I am trying to convert it into a dictionary like this:
{"moans": [
    ["16 Dec", 0],
    ["17 Dec", 0],
    ["18 Dec", 0],
    ["19 Dec", 2],
    ["20 Dec", 0],
    ["21 Dec", 0],
    ["22 Dec", 2]
],
"thanks": [
    ["16 Dec", 0],
    ["17 Dec", 0],
    ["18 Dec", 0],
    ["19 Dec", 0],
    ["20 Dec", 6],
    ["21 Dec", 0],
    ["22 Dec", 0]
]}
The date format does not have to be exactly as shown above. The problem is that when I use the to_dict function, I get output that looks like this
{'created': {0: Timestamp('2016-12-16 00:00:00'),
             1: Timestamp('2016-12-17 00:00:00'),
             2: Timestamp('2016-12-18 00:00:00'),
             3: Timestamp('2016-12-19 00:00:00'),
             4: Timestamp('2016-12-20 00:00:00'),
             5: Timestamp('2016-12-21 00:00:00'),
             6: Timestamp('2016-12-22 00:00:00')},
 'moans': {0: 0, 1: 0, 2: 0, 3: 0, 4: 6, 5: 0, 6: 0},
 'thanks': {0: 0, 1: 0, 2: 0, 3: 2, 4: 0, 5: 0, 6: 2}}
So I put the group types (moans, thanks) into a list and tried to iterate over it. This is how far I have got.
#now create the result we want
result = {}
group_types = ['moans', 'thanks']
for group in group_types:
    result[group]={[past_trend['created'],past_trend[group]]}
result
But I get the error
TypeError: unhashable type: 'list'
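Answer 0: (score: 1)
Here is a work in progress. The entries in the output below are unordered because it was produced on Python 2, where dict ordering is arbitrary; on Python 3.7+ they follow the index order.
In [99]: {k: [[x, y] for x, y in v.items()]
          for k, v in df.set_index('created').to_dict().items()}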
Out[99]:
{'moans': [['2016-12-22', 0],
           ['2016-12-20', 6],
           ['2016-12-21', 0],
           ['2016-12-19', 0],
           ['2016-12-18', 0],
           ['2016-12-17', 0],
           ['2016-12-16', 0]],
 'thanks': [['2016-12-22', 2],
            ['2016-12-20', 0],
            ['2016-12-21', 0],
            ['2016-12-19', 2],
            ['2016-12-18', 0],
            ['2016-12-17', 0],
            ['2016-12-16', 0]]}
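For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea on Python 3. The frame construction is a reconstruction of the question's data for illustration only, and it assumes 'created' holds plain date strings (as the output above suggests).

```python
import pandas as pd

# Assumed reconstruction of the question's frame; 'created' is kept as strings.
df = pd.DataFrame({
    "created": ["2016-12-16", "2016-12-17", "2016-12-18", "2016-12-19",
                "2016-12-20", "2016-12-21", "2016-12-22"],
    "moans": [0, 0, 0, 0, 6, 0, 0],
    "thanks": [0, 0, 0, 2, 0, 0, 2],
})

# to_dict() returns {column: {index_label: value}}; each inner dict is
# flattened into a list of [date, value] pairs.
result = {k: [[x, y] for x, y in v.items()]
          for k, v in df.set_index("created").to_dict().items()}

print(result["moans"])
```

Since dicts preserve insertion order on Python 3.7+, the pairs here come out in the same order as the rows of the frame.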
Answer 1: (score: 1)
This should do it
{k: [[i.strftime('%d %b'), v] for i, v in s.items()]
 for k, s in df.set_index('created').items()}
{'moans': [['16 Dec', 0],
           ['17 Dec', 0],
           ['18 Dec', 0],
           ['19 Dec', 0],
           ['20 Dec', 6],
           ['21 Dec', 0],
           ['22 Dec', 0]],
 'thanks': [['16 Dec', 0],
            ['17 Dec', 0],
            ['18 Dec', 0],
            ['19 Dec', 2],
            ['20 Dec', 0],
            ['21 Dec', 0],
            ['22 Dec', 2]]}
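A runnable sketch of the comprehension above, assuming 'created' is a datetime column (strftime is only available on Timestamp objects, not on plain strings). The frame is rebuilt from the question's data for illustration, and the name result is just a placeholder.

```python
import pandas as pd

# Assumed reconstruction of the question's frame with a datetime 'created' column.
df = pd.DataFrame({
    "created": pd.to_datetime(["2016-12-16", "2016-12-17", "2016-12-18",
                               "2016-12-19", "2016-12-20", "2016-12-21",
                               "2016-12-22"]),
    "moans": [0, 0, 0, 0, 6, 0, 0],
    "thanks": [0, 0, 0, 2, 0, 0, 2],
})

# DataFrame.items() yields (column_name, Series); Series.items() yields
# (index_label, value), where each label is a Timestamp after set_index.
result = {col: [[ts.strftime("%d %b"), val] for ts, val in series.items()]
          for col, series in df.set_index("created").items()}

print(result["thanks"])
```

Note that %b gives the locale's abbreviated month name, so the exact strings may differ on a non-English locale.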
Answer 2: (score: 0)
Suppose you start with the DataFrame:
In [5]: df
Out[5]:
     created  moans  thanks
0 2016-12-16      0       0
1 2016-12-17      0       0
2 2016-12-18      0       0
3 2016-12-19      0       2
4 2016-12-20      6       0
5 2016-12-21      0       0
6 2016-12-22      0       2
The simplest approach is to set the index to 'created' and then use to_dict:
In [8]: d = df.set_index('created').to_dict()
In [9]: d
Out[9]:
{'moans': {Timestamp('2016-12-16 00:00:00'): 0,
           Timestamp('2016-12-17 00:00:00'): 0,
           Timestamp('2016-12-18 00:00:00'): 0,
           Timestamp('2016-12-19 00:00:00'): 0,
           Timestamp('2016-12-20 00:00:00'): 6,
           Timestamp('2016-12-21 00:00:00'): 0,
           Timestamp('2016-12-22 00:00:00'): 0},
 'thanks': {Timestamp('2016-12-16 00:00:00'): 0,
            Timestamp('2016-12-17 00:00:00'): 0,
            Timestamp('2016-12-18 00:00:00'): 0,
            Timestamp('2016-12-19 00:00:00'): 2,
            Timestamp('2016-12-20 00:00:00'): 0,
            Timestamp('2016-12-21 00:00:00'): 0,
            Timestamp('2016-12-22 00:00:00'): 2}}
If you don't want nested dictionaries, you can always turn each one into a sorted list of (key, value) tuples:
In [11]: d = {k:sorted(v.items()) for k,v in d.items()}
In [12]: d
Out[12]:
{'moans': [(Timestamp('2016-12-16 00:00:00'), 0),
           (Timestamp('2016-12-17 00:00:00'), 0),
           (Timestamp('2016-12-18 00:00:00'), 0),
           (Timestamp('2016-12-19 00:00:00'), 0),
           (Timestamp('2016-12-20 00:00:00'), 6),
           (Timestamp('2016-12-21 00:00:00'), 0),
           (Timestamp('2016-12-22 00:00:00'), 0)],
 'thanks': [(Timestamp('2016-12-16 00:00:00'), 0),
            (Timestamp('2016-12-17 00:00:00'), 0),
            (Timestamp('2016-12-18 00:00:00'), 0),
            (Timestamp('2016-12-19 00:00:00'), 2),
            (Timestamp('2016-12-20 00:00:00'), 0),
            (Timestamp('2016-12-21 00:00:00'), 0),
            (Timestamp('2016-12-22 00:00:00'), 2)]}
If you insist on strings instead of Timestamp objects (a bad call, IMO):
In [13]: {k:[(str(t),e) for t,e in v] for k,v in d.items()}
Out[13]:
{'moans': [('2016-12-16 00:00:00', 0),
           ('2016-12-17 00:00:00', 0),
           ('2016-12-18 00:00:00', 0),
           ('2016-12-19 00:00:00', 0),
           ('2016-12-20 00:00:00', 6),
           ('2016-12-21 00:00:00', 0),
           ('2016-12-22 00:00:00', 0)],
 'thanks': [('2016-12-16 00:00:00', 0),
            ('2016-12-17 00:00:00', 0),
            ('2016-12-18 00:00:00', 0),
            ('2016-12-19 00:00:00', 2),
            ('2016-12-20 00:00:00', 0),
            ('2016-12-21 00:00:00', 0),
            ('2016-12-22 00:00:00', 2)]}
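As for the TypeError in the question: {[a, b]} is a set literal whose single element is a list, and set elements must be hashable, which lists are not. Below is a minimal sketch of how the question's loop could be repaired, assuming 'created' is a datetime column. The frame is rebuilt from the question's data for illustration, and the dates variable is a hypothetical helper.

```python
import pandas as pd

# Assumed reconstruction of the question's frame with a datetime 'created' column.
past_trend = pd.DataFrame({
    "created": pd.to_datetime(["2016-12-16", "2016-12-17", "2016-12-18",
                               "2016-12-19", "2016-12-20", "2016-12-21",
                               "2016-12-22"]),
    "moans": [0, 0, 0, 0, 6, 0, 0],
    "thanks": [0, 0, 0, 2, 0, 0, 2],
})

# Format the dates once, then pair them with each value column.
dates = past_trend["created"].dt.strftime("%d %b").tolist()

result = {}
group_types = ["moans", "thanks"]
for group in group_types:
    # Build a list of [date, value] lists instead of a set containing a list.
    result[group] = [[d, v] for d, v in zip(dates, past_trend[group].tolist())]

print(result)
```

Calling tolist() on the value column yields plain Python ints, which is convenient if the result is later serialized to JSON.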