Here is some code that generates a sample DataFrame:
import pandas as pd

fruits = pd.DataFrame()
fruits['month'] = ['jan', 'feb', 'feb', 'march', 'jan', 'april', 'april', 'june', 'march', 'march', 'june', 'april']
ind_mnth = fruits['month'].values
fruits['fruit'] = ['apple', 'orange', 'pear', 'orange', 'apple', 'pear', 'cherry', 'pear', 'orange', 'cherry', 'apple', 'cherry']
ind_fruit = fruits['fruit'].values
fruits['price'] = [30, 20, 40, 25, 30, 45, 60, 45, 25, 55, 37, 60]
# build a two-level index (month, fruit) while keeping both columns
fruits_grp = fruits.set_index([ind_mnth, ind_fruit], drop=False)
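How can I sort the rows of this MultiIndex DataFrame so that, within each outer index value (month), the inner index (fruit) follows a custom order, and rows that share the same outer index stay grouped together?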
Answer 0 (score: 0)
One way is to create an ordered categorical series from the fruit column in the order you want, then call set_index with a MultiIndex.from_arrays built from one array per level, as you did, and finally call sort_index.
# custom order
ord_fruit = ['apple', 'pear', 'cherry', 'orange']
# create an ordered Categorical series for the fruits
f = pd.Categorical(fruits['fruit'], categories=ord_fruit, ordered=True)
# get month values; these could also get a custom order, same idea as above
m = fruits['month'].to_numpy()
# get the result
fruits_grp = fruits.set_index(pd.MultiIndex.from_arrays([m,f])).sort_index()
print(fruits_grp)
              month   fruit  price
april pear    april    pear     45  # pear before cherry
      cherry  april  cherry     60
      cherry  april  cherry     60
feb   pear      feb    pear     40
      orange    feb  orange     20
jan   apple     jan   apple     30
      apple     jan   apple     30
june  apple    june   apple     37
      pear     june    pear     45
march cherry  march  cherry     55  # cherry before orange
      orange  march  orange     25
      orange  march  orange     25
Note that sort_index sorts the other level (month) alphabetically. If you don't want that, you can define your own order for each level.
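Giving every level its own order could look something like the sketch below. It assumes calendar order is wanted for the months. The name ord_month and that particular ordering are illustrative assumptions, not part of the original answer; the sample data is rebuilt so the snippet runs on its own.

```python
import pandas as pd

# Same sample data as in the question, rebuilt so this snippet is self-contained.
fruits = pd.DataFrame({
    "month": ["jan", "feb", "feb", "march", "jan", "april",
              "april", "june", "march", "march", "june", "april"],
    "fruit": ["apple", "orange", "pear", "orange", "apple", "pear",
              "cherry", "pear", "orange", "cherry", "apple", "cherry"],
    "price": [30, 20, 40, 25, 30, 45, 60, 45, 25, 55, 37, 60],
})

# Assumed custom orders (ord_month is hypothetical; adjust both as needed).
ord_month = ["jan", "feb", "march", "april", "june"]
ord_fruit = ["apple", "pear", "cherry", "orange"]

# One ordered Categorical per index level.
m = pd.Categorical(fruits["month"], categories=ord_month, ordered=True)
f = pd.Categorical(fruits["fruit"], categories=ord_fruit, ordered=True)

# sort_index then follows the category order of each level
# instead of alphabetical order.
fruits_grp = fruits.set_index(pd.MultiIndex.from_arrays([m, f])).sort_index()
print(fruits_grp)
```

With both levels categorical, the months should come out as jan, feb, march, april, june rather than alphabetically, while the fruits inside each month keep the apple, pear, cherry, orange order.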