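I have a DataFrame with a bmi column. Based on that column, I want to create another column that shows the BMI category corresponding to each row's bmi value. Here is my code: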
for i in range(df["bmi"].count()):
    if df["bmi"][i] < 18.5:
        df["bmi_category"] = "Under Weight"
    elif 25 > df["bmi"][i] >= 18.5:
        df["bmi_category"] = "Healthy Weight"
    elif 30 > df["bmi"][i] >= 25:
        df["bmi_category"] = "Overweight"
    elif df["bmi"][i] >= 30:
        df["bmi_category"] = "Obese"
But when I run this code, I get the following error.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3079 try:
-> 3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 228
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-220-e7569ff34eec> in <module>
1 for i in range(cardio["bmi"].count()):
----> 2 if cardio["bmi"][i] < 18.5:
3 cardio["bmi_category"] = "Under Weight"
4 elif 25 > cardio["bmi"][i] >= 18.5:
5 cardio["bmi_category"] = "Healthy Weight"
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
849
850 elif key_is_scalar:
--> 851 return self._get_value(key)
852
853 if is_hashable(key):
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
957
958 # Similar to Index.get_value, but we do not fall back to positional
--> 959 loc = self.index.get_loc(label)
960 return self.index._get_values_for_loc(self, loc, label)
961
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:
-> 3082 raise KeyError(key) from err
3083
3084 if tolerance is not None:
KeyError: 228
Can anyone tell me what I am doing wrong here, and how to fix it?
Answer 0 (score: 2)
The following maps each value in the bmi column to a value in the bmi_category column:
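def get_category(bmi):
    # Falsy values (None, 0) get no category; note that NaN is not caught by this check
    if not bmi:
        return None
    # The checks run in ascending order, so each one only needs an upper bound
    if bmi < 18.5:
        return "Under Weight"
    if bmi < 25:
        return "Healthy Weight"
    if bmi < 30:
        return "Overweight"
    return "Obese"

# Apply the function element-wise to build the new column
df['bmi_category'] = df['bmi'].apply(get_category)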
P.S. If you find yourself iterating over a DataFrame, there is almost always a function that does the job faster and more cleanly.
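As for the KeyError itself, the most likely cause is that df["bmi"][i] looks up the index label i rather than the i-th position, so it fails as soon as the index has a gap (for example after rows were dropped). Separately, df["bmi_category"] = "..." inside the loop assigns the same string to the whole column, not to one row. Here is a minimal sketch of both points, assuming a small made-up DataFrame whose index skips a label; np.select is used as one possible vectorized alternative, not the only one:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the index skips label 2, as happens after rows are dropped.
df = pd.DataFrame({"bmi": [17.0, 22.4, 27.1, 31.5]}, index=[0, 1, 3, 4])

# df["bmi"][2] would raise KeyError: 2 here, because [] on a Series looks up labels.
# Positional access works regardless of the index labels:
third_value = df["bmi"].iloc[2]

# Vectorized categorization: conditions are checked in order, first match wins.
conditions = [df["bmi"] < 18.5, df["bmi"] < 25, df["bmi"] < 30]
labels = ["Under Weight", "Healthy Weight", "Overweight"]
df["bmi_category"] = np.select(conditions, labels, default="Obese")
```

In this sketch, missing values would fall through to the default label, because comparisons with NaN are False. If the data can contain NaN, it should be handled explicitly first.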
Answer 1 (score: 1)
You can use pd.cut to do this efficiently.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(16,35,(50,1)), columns=["bmi"])
df['bmi_category'] = pd.cut(df['bmi'], [0, 18.5, 25, 30, np.inf], labels=["Under Weight", "Healthy Weight", "Overweight", "Obese"], right=False)
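To see how right=False treats values that fall exactly on a bin edge, here is a small deterministic sketch. The sample bmi values are made up for illustration; the bins and labels are the ones from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical values chosen to sit on or next to the bin edges.
df = pd.DataFrame({"bmi": [18.4, 18.5, 24.9, 25.0, 30.0]})

bins = [0, 18.5, 25, 30, np.inf]
labels = ["Under Weight", "Healthy Weight", "Overweight", "Obese"]

# right=False makes each bin closed on the left and open on the right,
# i.e. [0, 18.5), [18.5, 25), [25, 30), [30, inf).
df["bmi_category"] = pd.cut(df["bmi"], bins=bins, labels=labels, right=False)

# Convert the categorical column to plain strings for easy inspection.
categories = df["bmi_category"].astype(str).tolist()
```

With these bins, 18.5 lands in "Healthy Weight" and 25.0 in "Overweight", which matches the >= comparisons in the question's original loop.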