Question

I have a pandas Series:

import numpy as np
import pandas as pd
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)
s
Out[3]: 
first  second
bar    one      -1.111475
       two      -0.644368
baz    one       0.027621
       two       0.130411
foo    one      -0.942718
       two      -1.335731
qux    one       1.277417
       two      -0.242090
dtype: float64

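How can I sort this Series by value within each group?

For example, in the qux group the first row should be -0.242090 and the second row 1.277417. The bar group is already sorted correctly, because -1.111475 is less than -0.644368.

I need something like s.groupby(level=0).sort_values().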

Answer 1

Use sort_values:

np.random.seed(0)
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)
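
A minimal sketch of how sort_values can be applied here, assuming the Series is first flattened into a DataFrame so it can be sorted by the first index level and then by value. The column name value is a hypothetical label introduced only for this illustration, and this is one possible approach rather than necessarily the one this answer had in mind.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)

# Move both index levels into columns; 'value' is a hypothetical column name.
df = s.rename('value').reset_index()

# Sort by the first level, then by value inside each first-level group.
df = df.sort_values(['first', 'value'])

# Rebuild the MultiIndex and take the values back out as a Series.
result = df.set_index(['first', 'second'])['value']
result.name = None  # the original Series had no name

print(result)
```

The second level ends up in whatever order the values dictate, so the result is no longer lexsorted on the full index.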

Answer 2

You can use np.lexsort to sort by the first index level first and by the values second.

np.random.seed(0)
s = pd.Series(np.random.randn(8), index=index)

# np.lexsort treats the last key as the primary one: first level, then values
s = s.iloc[np.lexsort((s.values, s.index.get_level_values(0)))]

print(s)

# first  second
# bar    two       0.400157
#        one       1.764052
# baz    one       0.978738
#        two       2.240893
# foo    two      -0.977278
#        one       1.867558
# qux    two      -0.151357
#        one       0.950088
# dtype: float64
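
The key order passed to np.lexsort is easy to get backwards, so here is a small standalone sketch on plain NumPy arrays. The arrays groups and values are made-up illustration data, not taken from the question.

```python
import numpy as np

# Hypothetical data: an integer group label and a value per row.
groups = np.array([1, 0, 1, 0])
values = np.array([2.0, 3.0, 1.0, 0.5])

# The LAST key in the tuple is the primary sort key, so rows are ordered
# by groups first and by values within each group.
order = np.lexsort((values, groups))

print(order)          # positions that sort the rows
print(groups[order])  # group labels in sorted order
print(values[order])  # values ascending inside each group
```

Swapping the tuple to (groups, values) would instead sort by value globally and use the group label only to break ties.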
