I have a DataFrame in pandas that looks like this:
Year Month Month_index Rainfall
1900 Jan 1 4.8
1901 Jan 1 5
.
.
.
1900 Feb 2 3.2
1901 Feb 2 4.3
.
.
To use a datetime index, I need to rearrange it into this order:
Year Month Month_index Rainfall
1900 Jan 1 4.8
1900 Feb 2 3.2
.
.
1901 Jan 1 5
1901 Feb 2 4.3
.
.
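All 12 months are present, of course; I only shortened the tables for brevity. I'm new to Python, so I don't know whether there is a command that does this. Thanks in advance!
Edit: here is the code I am using so far: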
import csv
#import pyexcel-io as pi
import numpy as np
import pandas as pd
import dateutil
#Read data from csv file into dataframe
df = pd.read_csv('/Users/Gingeraffe/Documents/University/3rd_year/Bureau_Research/Notebooks/Data/rainfall_SW_WA.csv')
months = df.columns[1:]
#Melt puts the months down one column and the data down another column.
#The problem is that the rows come out as 'Jan Jan Jan ... Feb Feb Feb ...' instead of 'Jan Feb Mar ...'
df = pd.melt(df, id_vars='Year', value_vars=months, var_name='Month')
df.insert(2,'Month_index',0)
M = {'Jan':1, 'Feb':2, 'Mar':3, 'Apr':4, 'May':5, 'June':6, 'July':7, 'Aug':8, 'Sep':9, 'Oct':10, 'Nov':11, 'Dec':12}
df.Month_index = df.Month.map(M)
Answer 0 (score: 2)
Create a datetime Series from your Year and Month columns, which we can then use for sorting:
# Concatenate year and month abbreviation (e.g. '1900Jan') and parse it
s = pd.to_datetime(
    df['Year'].astype(str) + df['Month'], format='%Y%b'
)
Finally, sort:
df.loc[s.sort_values().index]
Year Month Month_index Rainfall
0 1900 Jan 1 4.8
2 1900 Feb 2 3.2
1 1901 Jan 1 5.0
3 1901 Feb 2 4.3
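Since the stated goal is a datetime index, the parsed dates can also be attached as the index directly and sorted there. The following is a minimal sketch, not a definitive implementation: the sample values are copied from the question, the names `dates` and `indexed` are made up for illustration, and it assumes every month label is a three-letter English abbreviation that `%b` can parse (labels such as 'June' or 'July' would need to be shortened first).

```python
import pandas as pd

# Sample data in the same order as the melted frame in the question
# (all Januaries first, then all Februaries).
df = pd.DataFrame({
    'Year': [1900, 1901, 1900, 1901],
    'Month': ['Jan', 'Jan', 'Feb', 'Feb'],
    'Month_index': [1, 1, 2, 2],
    'Rainfall': [4.8, 5.0, 3.2, 4.3],
})

# Build one timestamp per row from the year and the month abbreviation.
# Assumption: three-letter English month abbreviations.
dates = pd.to_datetime(df['Year'].astype(str) + df['Month'], format='%Y%b')

# Attach the timestamps as the index, then sort chronologically.
indexed = df.set_index(dates).sort_index()
print(indexed)
```

With the index in place, rows can be selected by date, for example `indexed.loc['1900']` for all of 1900.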
Answer 1 (score: 0)
import pandas as pd
d = {'Year' : [1900,1901,1900,1901], 'Month' : ['Jan','Jan','Feb','Feb'] , 'Month_index' : [1,1,2,2], 'Rainfall' : [4.8,5,3.2,4.3]}
df = pd.DataFrame(data=d)
df = df.sort_values(['Year','Month_index'])
#df now contains the rows sorted by year, then by month
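Once the rows are sorted by `Year` and `Month_index`, a datetime index can be assembled from those two numeric columns without parsing month names at all. This is a sketch under a few assumptions: the day is set to 1 as an arbitrary placeholder (the data has no day), and the names `parts` and `result` are hypothetical. `pd.to_datetime` accepts a DataFrame whose columns are named `year`, `month` and `day`, which is why the columns are renamed first.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [1900, 1901, 1900, 1901],
    'Month': ['Jan', 'Jan', 'Feb', 'Feb'],
    'Month_index': [1, 1, 2, 2],
    'Rainfall': [4.8, 5.0, 3.2, 4.3],
})

# Sort by year, then month number, and renumber the rows 0..n-1.
df = df.sort_values(['Year', 'Month_index']).reset_index(drop=True)

# to_datetime can assemble timestamps from columns named year/month/day.
# Assumption: day=1 is a placeholder, since the data is monthly.
parts = (
    df[['Year', 'Month_index']]
    .rename(columns={'Year': 'year', 'Month_index': 'month'})
    .assign(day=1)
)
result = df.set_index(pd.to_datetime(parts))
print(result)
```

Because this relies only on the numeric month, it is unaffected by how the month labels are spelled in the CSV header.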