I have the following table and want to reshape it
Hi, I have the following table in a Pandas DataFrame:
q_string q_visits q_date
0 nucleus 1790 2012-10-02 00:00:00
1 neuron 364 2012-10-02 00:00:00
2 current 280 2012-10-02 00:00:00
3 molecular 259 2012-10-02 00:00:00
4 stem 201 2012-10-02 00:00:00
I want q_date as the column headers, q_string as the row labels, and q_visits in the cells where they intersect.
What is the best way to do this in Pandas / Python?
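Answer 0 (score: 5)
This is what pivot_table is for (the rows and cols keywords of older pandas versions are now called index and columns):
>>> df.pivot_table(values='q_visits', columns='q_date', index='q_string')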
q_date 2012-10-02 00:00:00
q_string
current 280
molecular 259
neuron 364
nucleus 1790
stem 201
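For anyone who wants to try this directly, here is a minimal self-contained sketch of the same call. It assumes the question's table is rebuilt by hand; the `data` list and the variable name `wide` are illustrative and not part of the original post.

```python
import pandas as pd

# Rebuild the table from the question (values copied from the post).
data = [
    ["nucleus", 1790, "2012-10-02 00:00:00"],
    ["neuron", 364, "2012-10-02 00:00:00"],
    ["current", 280, "2012-10-02 00:00:00"],
    ["molecular", 259, "2012-10-02 00:00:00"],
    ["stem", 201, "2012-10-02 00:00:00"],
]
df = pd.DataFrame(data, columns=["q_string", "q_visits", "q_date"])

# index= supplies the row labels, columns= the column headers,
# and values= the column whose entries fill the cells.
wide = df.pivot_table(values="q_visits", index="q_string", columns="q_date")
print(wide)
```

Keep in mind that pivot_table aggregates: if the same (q_string, q_date) pair appears more than once, the cell holds the mean by default, which can be changed through the aggfunc argument.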
Answer 1 (score: 0)
pivot_table works, but I used a simpler approach for readability.
import pandas as pd

# The first row uses a different date so that the result has two columns.
data = [['nucleus', 1790, '2012-10-01 00:00:00'],
        ['neuron', 364, '2012-10-02 00:00:00'],
        ['current', 280, '2012-10-02 00:00:00'],
        ['molecular', 259, '2012-10-02 00:00:00'],
        ['stem', 201, '2012-10-02 00:00:00']]
df = pd.DataFrame(data, columns=['q_string', 'q_visits', 'q_date'])
q_string q_visits q_date
0 nucleus 1790 2012-10-01 00:00:00
1 neuron 364 2012-10-02 00:00:00
2 current 280 2012-10-02 00:00:00
3 molecular 259 2012-10-02 00:00:00
4 stem 201 2012-10-02 00:00:00
Move q_string and q_date into the index:
df.set_index(['q_string', 'q_date'], inplace=True)
The index now looks like this:
MultiIndex(levels=[['current', 'molecular', 'neuron', 'nucleus', 'stem'],
['2012-10-01 00:00:00', '2012-10-02 00:00:00']],
labels=[[3, 2, 0, 1, 4], [0, 1, 1, 1, 1]],
names=['q_string', 'q_date'])
q_string and q_date are now both index levels of the data, so we simply call unstack() to move q_date into the columns.
df.unstack()
q_visits
q_date 2012-10-01 00:00:00 2012-10-02 00:00:00
q_string
current NaN 280.0
molecular NaN 259.0
neuron NaN 364.0
nucleus 1790.0 NaN
stem NaN 201.0
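The steps above can also be sketched as one chain without inplace=True. Selecting the q_visits column before unstacking gives plain date columns rather than the two-level header shown above. The names `wide` and `wide_filled` are illustrative and not from the original answer.

```python
import pandas as pd

data = [
    ["nucleus", 1790, "2012-10-01 00:00:00"],
    ["neuron", 364, "2012-10-02 00:00:00"],
    ["current", 280, "2012-10-02 00:00:00"],
    ["molecular", 259, "2012-10-02 00:00:00"],
    ["stem", 201, "2012-10-02 00:00:00"],
]
df = pd.DataFrame(data, columns=["q_string", "q_visits", "q_date"])

# Build the two-level index, keep only the value column (a Series),
# then unstack the innermost level (q_date) into columns.
visits = df.set_index(["q_string", "q_date"])["q_visits"]
wide = visits.unstack()
print(wide)

# Combinations that do not occur in the data become NaN;
# fill_value substitutes a default instead.
wide_filled = visits.unstack(fill_value=0)
print(wide_filled)
```

Unlike pivot_table, unstack does not aggregate, so duplicate (q_string, q_date) pairs raise a ValueError instead of being averaged.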