Question

I created a series of bins using NumPy's `arange` function:

bins = np.arange(0, df['eCPM'].max(), 0.1)

The output looks like this:

[1.8, 1.9)          145940.67         52.569295   1.842306  
[1.9, 2)            150356.59         54.159954   1.932365  
[10.6, 10.7)        150980.84         54.384815  10.626436  
[13.3, 13.4)        152038.63         54.765842  13.373157  
[2, 2.1)            171494.11         61.773901   2.033192  
[2.1, 2.2)          178196.65         64.188223   2.141412  
[2.2, 2.3)          186259.13         67.092410   2.264005

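How can I move the [10.6, 10.7) and [13.3, 13.4) bins to where they belong, so that all bins appear in ascending order?

I assume the bins are being read as strings, which would explain the problem. I tried adding a dtype: bins = ..., 0.1, dtype=float), but with no luck.

[Edit]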

import numpy as np
import pandas
# skipfooter is the current name of the old skip_footer argument; it needs the Python engine
df = pandas.read_csv('path/to/file', skipfooter=1, engine='python')
bins = np.arange(0, df['eCPM'].max(), 0.1, dtype=float)
df['ecpm group'] = pandas.cut(df['eCPM'], bins, right=False, labels=None)
df = df[['ecpm group', 'Imps', 'Revenue']].groupby('ecpm group').sum()

Answer 1

You can sort the index in "human order" and then reindex:

import numpy as np
import pandas as pd
import re

def natural_keys(text):
    '''
    alist.sort(key=natural_keys) sorts in human order
    http://nedbatchelder.com/blog/200712/human_sorting.html
    (See Toothy's implementation in the comments)
    '''
    def atoi(text):
        return int(text) if text.isdigit() else text

    # Raw string so that \d is not treated as an invalid string escape
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# df = pd.read_csv('path/to/file', skipfooter=1, engine='python')
df = pd.DataFrame({'eCPM': np.random.randint(20, size=40)})
bins = np.arange(0, df['eCPM'].max()+1, 0.1, dtype=float)
df['ecpm group'] = pd.cut(df['eCPM'], bins, right=False, labels=None)
df = df.groupby('ecpm group').sum()
df = df.reindex(index=sorted(df.index, key=natural_keys))
print(df)

yields

            eCPM
[0, 0.1)       0
[1, 1.1)       5
[2, 2.1)       4
[4, 4.1)      12
[6, 6.1)      24
[7, 7.1)       7
[8, 8.1)      16
[9, 9.1)      45
[10, 10.1)    40
[11, 11.1)    11
[12, 12.1)    12
[13, 13.1)    13
[15, 15.1)    15
[16, 16.1)    64
[17, 17.1)    34
[18, 18.1)    18
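
To see why the original output was misordered, here is a minimal sketch that assumes the bin labels are plain strings, as the question suspects. It sorts the labels from the question three ways: a plain string sort, the `natural_keys` sort used above, and a sort on the numeric left edge of each bin. The `left_edge` helper is hypothetical and only illustrates an alternative key; it is not part of the original answer.

```python
import re

def natural_keys(text):
    """Split text into digit and non-digit runs so digit runs compare as ints."""
    return [int(c) if c.isdigit() else c for c in re.split(r'(\d+)', text)]

def left_edge(label):
    """Hypothetical helper: parse the left edge of a label such as '[1.8, 1.9)'."""
    return float(label[1:].split(',')[0])

# Labels copied from the question's output, held as plain strings.
labels = ['[1.8, 1.9)', '[1.9, 2)', '[10.6, 10.7)', '[13.3, 13.4)',
          '[2, 2.1)', '[2.1, 2.2)', '[2.2, 2.3)']

plain = sorted(labels)                      # character-by-character: '[10.6' < '[2'
natural = sorted(labels, key=natural_keys)  # digit runs compared as integers
by_edge = sorted(labels, key=left_edge)     # compared as floats

print(plain)
print(natural)
print(by_edge)
```

The plain sort reproduces the order shown in the question, because the character '1' sorts before '2' regardless of what follows. Both keyed sorts place [10.6, 10.7) and [13.3, 13.4) after [2.2, 2.3). Sorting on the parsed left edge may be the safer choice if labels can have a varying number of decimal places, since `natural_keys` compares the digits after the decimal point as whole integers.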

Python NumPy arange bins should be displayed in ascending order
