I have a CSV with the following data:
Customer Age
A 10
B 53
C 20
D 2
E 55
F 12
I am using the Pandas library to read the CSV. My question is how to group the Age values to get a new column with the following intervals:
Customer Age Age_Interval
A 10 [0-10]
B 53 [50-60]
C 20 [10-20]
D 2 [0-10]
E 55 [50-60]
F 12 [10-20]
How can I do this?
Thanks!
Answer 0 (score: 5)
I believe you need cut:
import numpy as np
import pandas as pd
df['Age_Interval'] = pd.cut(df['Age'], bins=np.arange(0,110,10))
print (df)
Customer Age Age_Interval
0 A 10 (0, 10]
1 B 53 (50, 60]
2 C 20 (10, 20]
3 D 2 (0, 10]
4 E 55 (50, 60]
5 F 12 (10, 20]
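For anyone who wants to run this end to end, here is a minimal sketch. It assumes the file is comma-separated with a header row; the csv_text string is only a stand-in for the real file, whose name is not given in the question.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the CSV file (assumed comma-separated with a header row).
# With a real file, pass its path to pd.read_csv instead of a StringIO.
csv_text = """Customer,Age
A,10
B,53
C,20
D,2
E,55
F,12
"""

df = pd.read_csv(io.StringIO(csv_text))

# Bin edges 0, 10, ..., 100. By default each bin is open on the left and
# closed on the right, so 10 lands in (0, 10] and 20 lands in (10, 20].
bins = np.arange(0, 110, 10)
df["Age_Interval"] = pd.cut(df["Age"], bins=bins)

print(df)
```

The result is a categorical column of Interval objects, which is why the output uses the (0, 10] notation rather than plain strings.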
Or pass labels to get plain strings instead of the interval notation:
b = np.arange(0,110,10)
l = [ "{0}-{1}".format(i, i + 10) for i in range(0, 100, 10)]
df['Age_Interval'] = pd.cut(df['Age'], bins=b, labels=l)
print (df)
Customer Age Age_Interval
0 A 10 0-10
1 B 53 50-60
2 C 20 10-20
3 D 2 0-10
4 E 55 50-60
5 F 12 10-20
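pd.cut needs exactly one label per bin, that is, one fewer label than there are bin edges. One way to keep the two in sync, sketched here as a variation rather than as part of the original answer, is to build the labels from the edges themselves. The names bins and labels are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Customer": list("ABCDEF"), "Age": [10, 53, 20, 2, 55, 12]})

bins = np.arange(0, 110, 10)

# Pair each edge with the next one, giving one label per bin.
labels = [f"{lo}-{hi}" for lo, hi in zip(bins[:-1], bins[1:])]

df["Age_Interval"] = pd.cut(df["Age"], bins=bins, labels=labels)
print(df)
```

If the bin width or range changes later, the labels follow automatically and the label count cannot drift out of step with the edges.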
EDIT: to also place an age of 0 in the first bin, add include_lowest=True:
print (df)
Customer Age
0 A 10
1 B 53
2 C 20
3 D 2
4 E 55
5 F 12
6 G 0
b = np.arange(0,110,10)
l = [ "{0}-{1}".format(i, i + 10) for i in range(0, 100, 10)]
df['Age_Interval'] = pd.cut(df['Age'], bins=b, labels=l, include_lowest=True)
print (df)
Customer Age Age_Interval
0 A 10 0-10
1 B 53 50-60
2 C 20 10-20
3 D 2 0-10
4 E 55 50-60
5 F 12 10-20
6 G 0 0-10
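To see why include_lowest matters, this small sketch compares the two calls on a few ages, including 0. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

ages = pd.Series([0, 2, 10], name="Age")
bins = np.arange(0, 110, 10)

# Default: the first bin is (0, 10], so 0 itself falls outside every bin
# and the result for that row is NaN.
without_lowest = pd.cut(ages, bins=bins)

# include_lowest=True also closes the first bin on the left, so 0 is kept.
with_lowest = pd.cut(ages, bins=bins, include_lowest=True)

print(pd.DataFrame({
    "Age": ages,
    "default": without_lowest,
    "include_lowest": with_lowest,
}))
```

Only the first bin changes; every other bin stays open on the left and closed on the right.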
Answer 1 (score: 0)
You can try this:
# Bins start at 0 so that ages below 10 get an interval instead of NaN.
df['Age_Interval'] = pd.cut(df.Age, range(0,110,10), include_lowest=True)