我有如下数据
ID U_30 U_90 U_180 Limit Type
1 -25 -10 10 100 A
2 -50 -10 -60 200 A
3 -25 0 10 50 B
4 25 10 40 80 A
5 -60 80 1000 100 A
6 -800 -10 10 600 B
7 -25 -10 10 400 B
8 -25 -10 10 300 A
从上面我想创建U_30_Score
,U_90_Score和U_180_Score,如下所示。
U_30_Score示例
if U_30 in between -100 to -90 U_30_Score = 30
if U_30 in between -90 to -80, U_30_Score = 60
if U_30 in between -80 to -70 U_30_Score = 90
if U_30 in between -70 to -60, U_30_Score = 120
if U_30 in between -60 to -50, U_30_Score = 150
if U_30 in between -50 to -40 U_30_Score = 180
if U_30 in between -40 to -30, U_30_Score = 210
if U_30 in between -30 to -20, U_30_Score = 240
if U_30 in between -20 to -10 U_30_Score = 270
if U_30 in between -10 to 0, U_30_Score = 300
if U_30 is less than -100, U_30_Score = 0
if U_30 in between 90 to 100 U_30_Score = 400
if U_30 in between 80 to 90, U_30_Score = 390
if U_30 in between 70 to 80 U_30_Score = 380
if U_30 in between 60 to 70, U_30_Score = 370
if U_30 in between 50 to 60, U_30_Score = 360
if U_30 in between 40 to 50 U_30_Score = 350
if U_30 in between 30 to 40, U_30_Score = 340
if U_30 in between 20 to 30, U_30_Score = 330
if U_30 in between 10 to 20 U_30_Score = 320
if U_30 in between 0 to 10, U_30_Score = 310
if U_30 is more than 100, U_30_Score = 410
必须分别对U_90和U_180应用上述条件才能创建U_90_Score和U_180_Score。
预期输出:
ID U_30 U_90 U_180 Limit Type U_30_Score U_90_Score U_180_Score
1 -25 -10 10 100 A 240 300 320
2 -50 -10 -60 200 A 180 300 150
3 -25 0 10 50 B 240 310 320
4 25 10 40 80 A 330 320 350
5 -60 80 1000 100 A 150 390 410
6 -800 -10 10 600 B 0 300 320
7 -25 -10 10 400 B 240 300 320
8 -25 -10 10 300 A 240 300 320
答案 0 :(得分:4)
使用np.r_
创建bins
和labels
,然后使用pd.cut
根据这些bins
对指定的列进行分类:
from functools import partial
f = partial(pd.cut, bins=np.r_[-np.inf, -100:110:10, np.inf],
labels=np.r_[0:330:30, 310:420:10], right=False)
c = df.filter(like='U_').columns
df[c + '_Score'] = df[c].apply(f)
结果:
ID U_30 U_90 U_180 Limit Type U_30_Score U_90_Score U_180_Score
0 1 -25 -10 10 100 A 240 300 320
1 2 -50 -10 -60 200 A 180 300 150
2 3 -25 0 10 50 B 240 310 320
3 4 25 10 40 80 A 330 320 350
4 5 -60 80 1000 100 A 150 390 410
5 6 -800 -10 10 600 B 0 300 320
6 7 -25 -10 10 400 B 240 300 320
7 8 -25 -10 10 300 A 240 300 320
答案 1 :(得分:2)
我建议np.digitize
会自动将您的数据分箱到一个分箱中,然后从那里乘以分数缩放比例。
func = np.vectorize(lambda x: x*30 if x<11 else x*10+200)
func(
np.digitize(
[-25,-50,-25,25,-60,-800,-25,-25], #input column here
np.concatenate( #define your variably sized bins
[
[-float('inf')],
np.arange(-100,110,10),
[float('inf')]
]
)
)-1
).clip(max=410)
输出:
array([240, 180, 240, 330, 150, 0, 240, 240], dtype=int64)
答案 2 :(得分:1)
if U_30 < 0:
U_30_Score = max(300 + 30*(U_30//10+1), 0)
else:
U_30_Score = min(310 + 10*(U_30//10), 410)
这可能有用,因为您的分数分布具有某种规则。 max和min函数使它不会超出范围。
编辑:输出(测试)
>>> def func(U_30):
... if U_30 < 0:
... U_30_Score = max(300 + 30*(U_30//10+1), 0)
... else:
... U_30_Score = min(310 + 10*(U_30//10), 410)
... print(f"U_30 : {U_30}, U_30_Score : {U_30_Score}")
...
>>> for num in range(-110, 110, 5):
... func(num)
...
U_30 : -110, U_30_Score : 0
U_30 : -105, U_30_Score : 0
U_30 : -100, U_30_Score : 30
U_30 : -95, U_30_Score : 30
U_30 : -90, U_30_Score : 60
U_30 : -85, U_30_Score : 60
U_30 : -80, U_30_Score : 90
U_30 : -75, U_30_Score : 90
U_30 : -70, U_30_Score : 120
U_30 : -65, U_30_Score : 120
U_30 : -60, U_30_Score : 150
U_30 : -55, U_30_Score : 150
U_30 : -50, U_30_Score : 180
U_30 : -45, U_30_Score : 180
U_30 : -40, U_30_Score : 210
U_30 : -35, U_30_Score : 210
U_30 : -30, U_30_Score : 240
U_30 : -25, U_30_Score : 240
U_30 : -20, U_30_Score : 270
U_30 : -15, U_30_Score : 270
U_30 : -10, U_30_Score : 300
U_30 : -5, U_30_Score : 300
U_30 : 0, U_30_Score : 310
U_30 : 5, U_30_Score : 310
U_30 : 10, U_30_Score : 320
U_30 : 15, U_30_Score : 320
U_30 : 20, U_30_Score : 330
U_30 : 25, U_30_Score : 330
U_30 : 30, U_30_Score : 340
U_30 : 35, U_30_Score : 340
U_30 : 40, U_30_Score : 350
U_30 : 45, U_30_Score : 350
U_30 : 50, U_30_Score : 360
U_30 : 55, U_30_Score : 360
U_30 : 60, U_30_Score : 370
U_30 : 65, U_30_Score : 370
U_30 : 70, U_30_Score : 380
U_30 : 75, U_30_Score : 380
U_30 : 80, U_30_Score : 390
U_30 : 85, U_30_Score : 390
U_30 : 90, U_30_Score : 400
U_30 : 95, U_30_Score : 400
U_30 : 100, U_30_Score : 410
U_30 : 105, U_30_Score : 410