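I am using replace with regex=True to convert string values to numeric values for analysis. I get no errors, but when I inspect the DataFrame afterwards the values are unchanged. I also tried
df['international plan'].replace(['no', 'yes'], [0, 1], inplace = True)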
df['voice mail plan'].replace(['yes', 'no'], [1,0], inplace = True)
df['churn'].replace(['False', 'True'], [0, 1], inplace = True)
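and ran into the same problem. Any help is greatly appreciated. A screenshot of my notebook is attached below, followed by the original code.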
{{1}}
Print screen from my Jupyter Notebook
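Answer 0 (score: 0)
Judging by your notebook screenshot, the column values are "yes", "no", "True." and "False." with whitespace around them, so .replace() never finds an exact match. Strip the whitespace and convert yes/no to 1/0, for example: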
df['international plan'] = df['international plan'].apply(lambda x: 1 if x.strip() == "yes" else 0)
df['voice mail plan'] = df['voice mail plan'].apply(lambda x: 1 if x.strip() == "yes" else 0)
df['churn'] = df['churn'].apply(lambda x: 1 if x.strip() == "True." else 0)
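To confirm that whitespace is really the cause before changing anything, it can help to look at the raw values with repr(). The snippet below is a minimal sketch on a made-up three-row column (the sample data is an assumption, not the asker's file); it shows that an exact-match replace leaves padded values untouched.

```python
import pandas as pd

# Hypothetical sample mimicking the padded values seen in the screenshot.
# dtype=object keeps the behaviour the same across pandas versions.
df = pd.DataFrame({'international plan': [' yes', ' no', ' yes']}, dtype=object)

# repr() makes leading/trailing whitespace visible
raw_values = [repr(v) for v in df['international plan'].unique()]
print(raw_values)

# Exact-match replace finds nothing, because ' yes' != 'yes'
unchanged = df['international plan'].replace(['no', 'yes'], [0, 1])
print(unchanged.tolist())

# After stripping, the values match the expected tokens
stripped = df['international plan'].str.strip()
print(stripped.unique().tolist())
```

If the repr() output shows quotes with a space inside them, stripping first (as in the lines above) is the likely fix.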
Answer 1 (score: 0)
The values contain some whitespace:
import numpy as np
import pandas as pd

np.random.seed(789)
df = pd.DataFrame({'international plan': np.random.choice([' yes',' no'], size=5),
                   'voice mail plan': np.random.choice([' yes',' no'], size=5),
                   'churn': np.random.choice([' False.',' True.'], size=5),
                   'area code': np.random.choice([415,408], size=5)})
print (df)
   area code    churn international plan voice mail plan
0        408    True.                 no             yes
1        415   False.                yes             yes
2        408    True.                yes              no
3        408   False.                yes             yes
4        408   False.                 no             yes
A solution that uses apply to loop over the columns in cols and replaces by dict, with str.strip and Series.replace:
cols = ['international plan','voice mail plan','churn']
d = {'no':0,'yes':1, 'True.':1, 'False.':0}
df[cols] = df[cols].apply(lambda x: x.str.strip().replace(d))
print (df)
   area code  churn  international plan  voice mail plan
0        408      1                   0                1
1        415      0                   1                1
2        408      1                   1                0
3        408      0                   1                1
4        408      0                   0                1
Alternatively, add the whitespace to the keys in the dict and use DataFrame.replace:
cols = ['international plan','voice mail plan','churn']
d = {' no':0,' yes':1, ' True.':1, ' False.':0}
df[cols] = df[cols].replace(d)
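Since the question mentions regex=True, here is a sketch of how that route could be made to work: anchor each pattern and allow optional whitespace on both sides, so the amount of padding no longer matters. The sample frame and the patterns dict are assumptions for illustration, not taken from the original data.

```python
import pandas as pd

# Hypothetical sample with the same padded values as above.
# dtype=object keeps the behaviour the same across pandas versions.
df = pd.DataFrame({
    'international plan': [' yes', ' no'],
    'voice mail plan': [' no', ' yes'],
    'churn': [' False.', ' True.'],
}, dtype=object)

cols = ['international plan', 'voice mail plan', 'churn']

# Anchored patterns that tolerate whitespace on either side of the token;
# the dot in 'True.' / 'False.' is escaped so it matches literally
patterns = {
    r'^\s*no\s*$': 0,
    r'^\s*yes\s*$': 1,
    r'^\s*False\.\s*$': 0,
    r'^\s*True\.\s*$': 1,
}
df[cols] = df[cols].replace(patterns, regex=True)
print(df)
```

When the replacement value is not a string, a cell whose text matches the pattern is replaced as a whole, which is why the anchors (^ and $) matter here: without them, 'no' would also match inside longer strings.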
If you want to replace each column separately:
df['international plan'] = df['international plan'].str.strip().replace(['no','yes'],[0, 1])
df['voice mail plan'] = df['voice mail plan'].str.strip().replace(['yes','no'],[1,0])
df['churn'] = df['churn'].str.strip().replace(['False.','True.'],[0, 1])
print (df)
   area code  churn  international plan  voice mail plan
0        408      1                   0                1
1        415      0                   1                1
2        408      1                   1                0
3        408      0                   1                1
4        408      0                   0                1
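One caveat with the approaches in this thread: replace leaves any unmatched value as a string, and the lambda version turns everything that is not "yes" into 0. If the data might contain unexpected values, a possible variation (not part of the answers above) is Series.map, which yields NaN for anything missing from the mapping, so the leftovers can be inspected. The sample Series and the ' maybe' value are made up for this sketch.

```python
import pandas as pd

# Hypothetical sample; ' maybe' stands in for an unexpected value
s = pd.Series([' yes', ' no', ' maybe'], dtype=object)

mapping = {'no': 0, 'yes': 1}

# map() returns NaN for any stripped value that is not a key in the mapping
mapped = s.str.strip().map(mapping)

# Collect the raw values that could not be converted
unmapped = s[mapped.isna()].tolist()
print(unmapped)
```

An empty unmapped list means every value was converted; otherwise the list shows exactly which raw strings still need a rule.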