In pandas' DataFrame.sort_values method, the kind parameter is applied only when sorting on a single column or label. Why is that, and which sorting algorithm is used when the kind parameter is not applied? Is it stable?
(For the documentation, see https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_values.html.)
Answer 0 (score: 4)
Here is the docstring from the source file for get_group_index_sorter(group_index, ngroups):
    algos.groupsort_indexer implements `counting sort` and it is at least
    O(ngroups), where
        ngroups = prod(shape)
        shape = map(len, keys)
    that is, linear in the number of combinations (cartesian product) of unique
    values of groupby keys. This can be huge when doing multi-key groupby.
    np.argsort(kind='mergesort') is O(count x log(count)) where count is the
    length of the data-frame;
    Both algorithms are `stable` sort and that is necessary for correctness of
    groupby operations. e.g. consider:
        df.groupby(key)[col].transform('first')
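To make the trade-off in the docstring concrete, here is a minimal sketch of a stable counting sort over integer group ids, compared with np.argsort(kind='mergesort'). The function counting_sort_indexer is a hypothetical pure-Python illustration written for this answer, not the pandas implementation; it only assumes that group_index holds integers in the range [0, ngroups).

```python
import numpy as np


def counting_sort_indexer(group_index, ngroups):
    """Hypothetical helper: stable counting sort over integer group ids.

    Allocates O(ngroups) bookkeeping, then makes one pass over the data.
    """
    group_index = np.asarray(group_index, dtype=np.intp)
    # How many rows fall into each group.
    counts = np.bincount(group_index, minlength=ngroups)
    # First output slot of each group (exclusive prefix sum of the counts).
    starts = np.zeros(ngroups, dtype=np.intp)
    starts[1:] = np.cumsum(counts)[:-1]
    indexer = np.empty(len(group_index), dtype=np.intp)
    # Visiting rows in their original order is what makes the sort stable.
    for row, group in enumerate(group_index):
        indexer[starts[group]] = row
        starts[group] += 1
    return indexer


group_index = np.array([2, 0, 1, 2, 0, 1, 0])
via_counting = counting_sort_indexer(group_index, ngroups=3)
via_mergesort = np.argsort(group_index, kind="mergesort")
print(via_counting)   # rows of group 0 first, then group 1, then group 2
print(via_mergesort)  # the same ordering, because both sorts are stable
```

The counting sort pays a cost proportional to ngroups before it touches the data, which is why it becomes unattractive when the cartesian product of key values is much larger than the number of rows.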
PS: here is the call chain:
pandas.core.frame.DataFrame.sort_values() -> \
pandas.core.sorting.lexsort_indexer() -> \
pandas.core.sorting.indexer_from_factorized() -> \
pandas.core.sorting.get_group_index_sorter()
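As a rough check of the stability claim through the public API only, the sketch below sorts on two columns and looks at the order of tied rows. The column names and data are made up for illustration, and the internal call chain above may differ between pandas versions, so treat this as an observation rather than a documented guarantee.

```python
import pandas as pd

# Rows that tie on both sort keys are labelled so their order can be tracked.
df = pd.DataFrame(
    {
        "a": [1, 0, 1, 0, 1],
        "b": [0, 0, 0, 0, 0],
        "tag": ["p", "q", "r", "s", "t"],
    }
)

# Multi-column sort: the kind argument is not consulted on this path.
result = df.sort_values(["a", "b"])
tags = result["tag"].tolist()
print(tags)  # tied rows should appear in their original relative order
```

If the sort is stable, the rows with a == 0 stay in the order q, s and the rows with a == 1 stay in the order p, r, t.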