I need to group rows by weight in pandas.
Name weight(kg)
Person1 4.44
Person2 37.3
Person3 36.38
Person4 39.52
Person5 81.57
Person6 43.55
Person7 91.11
Person8 5
Person9 36.48
Person10 38.25
I need to assign each row to a group based on a condition. My code is below, but every row ends up labelled "0 to 20%".
if 0 <= data_file['weight(kg)'].all() < 20:
    data_file['target'] = "0 to 20%"
if 20 < data_file['weight(kg)'].all() < 40:
    data_file['target'] = "20 to 40%"
if 40 < data_file['weight(kg)'].all() < 60:
    data_file['target'] = "40 to 60%"
if 60 < data_file['weight(kg)'].all() < 80:
    data_file['target'] = "60 to 80%"
if 80 < data_file['weight(kg)'].all() <= 100:
    data_file['target'] = "80 to 100%"
Expected output:
Name weight(kg) Target
Person1 4.44 0 to 20
Person2 37.3 20 to 40
Person3 36.38 20 to 40
Person4 39.52 20 to 40
Person5 81.57 80 to 100
Person6 43.55 40 to 60
Person7 91.11 80 to 100
Person8 5 0 to 20
Person9 36.48 20 to 40
Person10 38.25 20 to 40
Answer 0 (score: 6)
Use pd.cut:
df.assign(bins = pd.cut(df["weight(kg)"], [0,20,40,60,80,100],
labels=['0 to 20', '20 to 40', '40 to 60', '60 to 80', '80 to 100']))
Result:
Name weight(kg) bins
0 Person1 4.44 0 to 20
1 Person2 37.30 20 to 40
2 Person3 36.38 20 to 40
3 Person4 39.52 20 to 40
4 Person5 81.57 80 to 100
5 Person6 43.55 40 to 60
6 Person7 91.11 80 to 100
7 Person8 5.00 0 to 20
8 Person9 36.48 20 to 40
9 Person10 38.25 20 to 40
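For reference, here is a small self-contained sketch of two points: why the `if` chain in the question labels every row the same way, and how pd.cut treats values that sit exactly on a bin edge. The "Edge" row (weight 20.0) and the variable names are made up for illustration; which side a boundary value should fall on is an assumption you would need to decide for your own data.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Person1", "Person2", "Person5", "Edge"],  # "Edge" is a made-up row
    "weight(kg)": [4.44, 37.3, 81.57, 20.0],
})

# .all() reduces the whole column to a single boolean (True here, because
# every weight is non-zero). True compares like 1, so only the first
# condition of the original if chain can ever hold.
collapsed = df["weight(kg)"].all()
first_branch = bool(0 <= collapsed < 20)
second_branch = bool(20 < collapsed < 40)

edges = [0, 20, 40, 60, 80, 100]
labels = ["0 to 20", "20 to 40", "40 to 60", "60 to 80", "80 to 100"]

# Default (right=True): bins are (0, 20], (20, 40], ... so 20.0 lands in "0 to 20".
right_closed = pd.cut(df["weight(kg)"], edges, labels=labels)

# right=False: bins are [0, 20), [20, 40), ... so 20.0 lands in "20 to 40".
left_closed = pd.cut(df["weight(kg)"], edges, labels=labels, right=False)

print(df.assign(right_closed=right_closed, left_closed=left_closed))
```

Note that with the default bins a weight of exactly 0 falls outside (0, 20] and becomes NaN unless include_lowest=True is passed; with right=False the same happens to a weight of exactly 100.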
Answer 1 (score: 3)
This is quite simple: use apply in pandas together with a classification function:
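def classify(x):
    # Half-open bins [low, high) so that boundary values such as 20 or 40
    # are not skipped, which would otherwise leave the label undefined.
    if 0 <= x < 20:
        return "0 to 20%"
    elif 20 <= x < 40:
        return "20 to 40%"
    elif 40 <= x < 60:
        return "40 to 60%"
    elif 60 <= x < 80:
        return "60 to 80%"
    elif 80 <= x <= 100:
        return "80 to 100%"
    # Anything outside 0-100 (or NaN) gets no label.
    return None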
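Assuming your DataFrame has two columns, "Name" and "weight" (the question's data calls the second one "weight(kg)"), we would write:
df['Target'] = df['weight'].apply(classify)
I hope this helps.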
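A table-driven variant of the same apply idea, sketched with the standard-library bisect module instead of an if chain. The name classify_weight, the constants EDGES and LABELS, and the sample rows are hypothetical; treating each edge as the lower bound of its bin is an assumption about what should happen at exactly 20, 40, 60 and 80.

```python
from bisect import bisect_right

import pandas as pd

EDGES = [20, 40, 60, 80]  # interior bin edges (hypothetical constant)
LABELS = ["0 to 20%", "20 to 40%", "40 to 60%", "60 to 80%", "80 to 100%"]


def classify_weight(x):
    """Return the label of the bin containing x, or None if x is outside 0-100."""
    if not 0 <= x <= 100:  # also rejects NaN, since comparisons with NaN are False
        return None
    # bisect_right counts how many edges are <= x, which is the bin index.
    return LABELS[bisect_right(EDGES, x)]


# Made-up sample rows, including two boundary values (20 and 100).
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "weight": [4.44, 20, 39.52, 81.57, 100],
})
df["Target"] = df["weight"].apply(classify_weight)
print(df)
```

Keeping the edges and labels in two lists means adding or moving a bin only requires editing the data, not the control flow.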
Extra: if you want a progress bar, you can add the following lines:
from tqdm import tqdm
tqdm.pandas()
df['Target'] = df['weight'].progress_apply(lambda x: classify(x))
Answer 2 (score: 3)
You can use np.select:
import numpy as np

weight = df['weight(kg)']
# Half-open bins [low, high) so that boundary values such as 20 or 40 still match.
conditions = [
    (0 <= weight) & (weight < 20),
    (20 <= weight) & (weight < 40),
    (40 <= weight) & (weight < 60),
    (60 <= weight) & (weight < 80),
    (80 <= weight) & (weight <= 100),
]
results = [
    "0 to 20%",
    "20 to 40%",
    "40 to 60%",
    "60 to 80%",
    "80 to 100%",
]
# Rows matching no condition get a string default instead of np.select's integer 0,
# which would otherwise be mixed with the string labels.
df['Target'] = np.select(conditions, results, default="out of range")
Output:
Name weight(kg) Target
0 Person1 4.44 0 to 20%
1 Person2 37.30 20 to 40%
2 Person3 36.38 20 to 40%
3 Person4 39.52 20 to 40%
4 Person5 81.57 80 to 100%