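My dataset contains groups created from GDP, along with a normalised value. Within each group, the Normalised column ranges from 0 to 1. I would like to rescale these values in Python so that each group maps to its own range: for example, group A to 0 to 10, group B to 11 to 30, group C to 31 to 50, and group D to 51 to 100.
Here is the dataset: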
Year GDP Group Normalised
1970 84684 A 0.000000
1971 95806 A 0.029729
1972 106868 A 0.059298
1973 120720 A 0.096325
1974 139760 A 0.147219
1975 160477 A 0.202595
1976 182173 A 0.260589
1977 205919 A 0.324062
1978 222396 A 0.368106
1979 237848 A 0.409409
1980 264619 A 0.480968
1981 301452 A 0.579423
1982 336748 A 0.673770
1983 370430 A 0.763802
1984 409163 A 0.867336
1985 458794 A 1.000000
1986 515505 B 0.000000
1987 571608 B 0.130155
1988 606744 B 0.211669
1989 619600 B 0.241494
1990 639732 B 0.288199
1991 670016 B 0.358456
1992 697418 B 0.422027
1993 731043 B 0.500035
1994 769888 B 0.590153
1995 828636 B 0.726445
1996 876411 B 0.837280
1997 946551 B 1.000000
1998 1020061 C 0.000000
1999 1074489 C 0.057531
2000 1144839 C 0.131892
2001 1211783 C 0.202653
2002 1258692 C 0.252236
2003 1308153 C 0.304517
2004 1407892 C 0.409943
2005 1514364 C 0.522485
2006 1661699 C 0.678221
2007 1830997 C 0.857171
2008 1946700 C 0.979471
2009 1966122 C 1.000000
2010 2077604 D 0.000000
2011 2161617 D 0.116603
2012 2298445 D 0.306508
2013 2423242 D 0.479716
2014 2539596 D 0.641205
2015 2621032 D 0.754231
2016 2712752 D 0.881530
2017 2798110 D 1.000000
I would really appreciate your help. Thank you.
Answer 0 (score: -1)
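I would use np.select() to look up each group's lower bound and range width, then rescale with lower bound + Normalised * width, as shown below.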
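import numpy as np

# df is assumed to already hold the dataset from the question.
# a0: lower bound of each group's target range (A, B, C, D).
a0 = [0, 11, 31, 51]
# a1: width of each target range (upper - lower): 10-0, 30-11, 50-31, 100-51.
a1 = [10, 19, 19, 49]

# One boolean mask per group, in the same order as a0 and a1.
conditions = [
    (df['Group'] == 'A'),
    (df['Group'] == 'B'),
    (df['Group'] == 'C'),
    (df['Group'] == 'D')
]

# Pick the lower bound and width that match each row's group.
df['a0'] = np.select(conditions, a0)
df['a1'] = np.select(conditions, a1)

# Rescale: lower bound + normalised value * width.
df['Rescaled'] = df['Normalised'].multiply(df['a1']).add(df['a0'])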
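
The same rescaling can also be written with a lookup table instead of np.select(). The following is only a minimal sketch, not a definitive implementation: the `ranges` dictionary and the `lower`/`upper` names are assumptions introduced here for illustration, and the small DataFrame contains just the first and last row of each group from the question.

```python
import pandas as pd

# Hypothetical lookup table: group -> (lower, upper) target range,
# using the ranges stated in the question.
ranges = {"A": (0, 10), "B": (11, 30), "C": (31, 50), "D": (51, 100)}

# First and last row of each group from the question, for illustration only.
df = pd.DataFrame({
    "Year": [1970, 1985, 1986, 1997, 1998, 2009, 2010, 2017],
    "Group": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Normalised": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
})

# Look up each row's lower and upper bound from its group label.
lower = df["Group"].map(lambda g: ranges[g][0])
upper = df["Group"].map(lambda g: ranges[g][1])

# Linear rescale from [0, 1] to [lower, upper].
df["Rescaled"] = lower + df["Normalised"] * (upper - lower)

print(df)
```

With this layout, changing a group's range means editing one dictionary entry, and no helper columns are left behind in the DataFrame.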