My DataFrame has a countries column, and I am trying to replace its values with "developed" and "developing". The DataFrame looks like this:
countries age gender
1 India 21 Male
2 China 22 Female
3 USA 23 Male
4 UK 25 Male
I have the following two lists:
developed = ['USA','UK']
developing = ['India', 'China']
I want to turn the DataFrame into the following:
countries age gender
1 developing 21 Male
2 developing 22 Female
3 developed 23 Male
4 developed 25 Male
I tried the following code, but it raised a SettingWithCopyWarning:
df[df['countries'].isin(developed)]['countries'] = 'developed'
I also tried the following loop, but it raised a SettingWithCopyWarning as well, and my Jupyter notebook hung:
for i, x in enumerate(df['countries']):
    if x in developed:
        df['countries'][i] = 'developed'
Is there another way to change the categories in this column?
Answer 0 (score: 2)
Use np.where:
import numpy as np
df['countries'] = np.where(df['countries'].isin(developed), 'developed', 'developing')
print(df)
    countries  age  gender
1  developing   21    Male
2  developing   22  Female
3   developed   23    Male
4   developed   25    Male
You can also use DataFrame.loc to assign through a boolean mask.
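A minimal sketch of the DataFrame.loc approach, assuming the same df, developed and developing as in the question. The exact code is an assumption for illustration and is not taken from the original answer. The mask names is_developed and is_developing are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "countries": ["India", "China", "USA", "UK"],
        "age": [21, 22, 23, 25],
        "gender": ["Male", "Female", "Male", "Male"],
    },
    index=[1, 2, 3, 4],
)
developed = ["USA", "UK"]
developing = ["India", "China"]

# Compute both masks up front, so the second mask is not affected
# by the first assignment.
is_developed = df["countries"].isin(developed)
is_developing = df["countries"].isin(developing)

# A single .loc call selects rows and column together, so pandas writes
# into df itself instead of into a temporary copy. This is what avoids
# the SettingWithCopyWarning from the chained df[...][...] form.
df.loc[is_developed, "countries"] = "developed"
df.loc[is_developing, "countries"] = "developing"

print(df)
```

Unlike the np.where version, which labels every country outside developed as 'developing', this leaves any country that appears in neither list unchanged.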
Answer 1 (score: 0)
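You can try the replace function, which does not trigger the warning.
# data_set is the original DataFrame; replace returns a new DataFrame each time
Updated_DataSet1 = data_set.replace("India", "Developing")
Updated_DataSet2 = Updated_DataSet1.replace("China", "Developing")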
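The chained replace calls can probably be collapsed into one by passing a dict. The sketch below assumes the df, developed and developing from the question rather than data_set. The mapping name is hypothetical, and the labels are lowercase to match the output the question asks for.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "countries": ["India", "China", "USA", "UK"],
        "age": [21, 22, 23, 25],
        "gender": ["Male", "Female", "Male", "Male"],
    },
    index=[1, 2, 3, 4],
)
developed = ["USA", "UK"]
developing = ["India", "China"]

# Build one lookup table: country name -> category.
mapping = {country: "developed" for country in developed}
mapping.update({country: "developing" for country in developing})

# Series.replace with a dict swaps every matching value in one pass and
# returns a new Series; values not in the dict are left unchanged.
df["countries"] = df["countries"].replace(mapping)

print(df)
```

Restricting replace to the countries column keeps it from touching matching strings in other columns, which a DataFrame-wide replace would also rewrite.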