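I want to use pandas to transform my CSV file. Previously my data was spread across four CSV files rather than one, so this time I combined the four files and dropped some columns, then ran my code to reshape the result. However, I got an error.
My data: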
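For context, a combine-and-drop step of this kind might look roughly like the sketch below. It is only an illustration: the helper name `combine_csvs`, the in-memory sample files, and the dropped column `observer` are all assumptions, not the actual files or columns.

```python
import io

import pandas as pd


def combine_csvs(sources, drop_columns):
    """Read each CSV source, stack the rows, and drop unwanted columns.

    `sources` may be file paths or file-like objects (hypothetical inputs).
    """
    frames = [pd.read_csv(src) for src in sources]
    # ignore_index=True renumbers the rows 0..n-1 across all parts
    combined = pd.concat(frames, ignore_index=True)
    # errors="ignore" skips column names that are not present
    return combined.drop(columns=drop_columns, errors="ignore")


# Two tiny in-memory "files" standing in for the real CSVs (made-up data)
part_a = io.StringIO("count,birdName,date,observer\n15,bird1,1990-02-10,x\n")
part_b = io.StringIO("count,birdName,date,observer\n10,bird2,1990-02-10,y\n")

combined = combine_csvs([part_a, part_b], drop_columns=["observer"])
print(combined)
```

With real files, the list of sources would simply be the four file paths.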
count birdName date location time
15 bird1 1990-02-10 balabala 0900:1200
10 bird2 1990-02-10 balabala 0900:1200
20 bird3 1990-02-10 balabala 0900:1200
40 bird4 1990-02-28 balabala 1300:1500
10 bird5 1990-02-28 balabala 1300:1500
25 bird6 1990-02-28 balabala 1300:1500
45 bird7 1990-03-01 balabala 0900-1200
15 bird8 1990-03-01 balabala 0900-1200
30 bird9 1990-03-01 balabala 0900-1200
... ... ... ... ...
The format I want:
date time location birdName count birdName count birdName count
1990-02-10 0900:1200 balabala bird1 15 bird2 10 bird3 20
1990-02-28 1300:1500 balabala bird4 40 bird5 10 bird6 25
1990-03-01 0900-1200 balabala bird7 45 bird8 15 bird9 30
... ... ... ... ... ... ... ... ...
My code:
# -*- coding: utf-8 -*-
import pandas as pd
df = pd.read_csv('./birds.csv')
common = ['date','time']
grouped = df.groupby(common)
df['idx'] = grouped.cumcount()
df2 = df.set_index(['idx']+common+['location'])
df2 = df2.unstack('idx')
df2 = df2.swaplevel(0, 1, axis=1)
df2 = df2.sortlevel(axis=1)
df2.columns = df2.columns.droplevel(0)
df2 = df2.reset_index()
df2.to_csv('./birdsIwant.csv')
But I got an error:
Traceback (most recent call last):
File "D:\python27\allentest\birdsIwant .py", line 9, in <module>
df2 = df2.unstack('idx')
......
File "D:\python27\lib\site-packages\pandas\core\reshape.py", line 206, in get_new_values
new_values = np.empty(result_shape, dtype=dtype)
MemoryError
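One possible way to check whether the reshaped table is simply too large for memory is to estimate its shape before calling `unstack`. The sketch below is an assumption-laden illustration, not a confirmed diagnosis: `estimate_unstack_cells` is a hypothetical helper, and the sample frame just repeats the nine rows shown above.

```python
import pandas as pd


def estimate_unstack_cells(df, row_keys, group_keys):
    """Estimate (rows, columns) of the wide table built by unstacking
    a per-group counter, without actually building it."""
    # Same counter as df['idx']: 0, 1, 2, ... within each group
    idx = df.groupby(group_keys).cumcount()
    # One output row per distinct combination of the row keys
    n_rows = len(df[row_keys].drop_duplicates())
    # The largest group decides how many idx slots every row gets
    n_idx = int(idx.max()) + 1
    # Every column that is not a row key is repeated once per idx slot
    n_value_cols = len(df.columns) - len(row_keys)
    return n_rows, n_idx * n_value_cols


sample = pd.DataFrame({
    "count": [15, 10, 20, 40, 10, 25, 45, 15, 30],
    "birdName": ["bird%d" % i for i in range(1, 10)],
    "date": ["1990-02-10"] * 3 + ["1990-02-28"] * 3 + ["1990-03-01"] * 3,
    "location": ["balabala"] * 9,
    "time": ["0900:1200"] * 3 + ["1300:1500"] * 3 + ["0900-1200"] * 3,
})

shape = estimate_unstack_cells(
    sample,
    row_keys=["date", "time", "location"],
    group_keys=["date", "time"],
)
print(shape)
```

Because the width is set by the largest (date, time) group, a single group with many rows makes every output row that wide, and the remaining cells are padded with missing values. Running the same estimate on the full file would show whether that is what exhausts memory here.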
Link to my CSV file: https://drive.google.com/open?id=0B6SUWnrBmDwSQ0p3NlRLR0FXWjA&authuser=0
Can someone show me how to fix this? Thanks.