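I have a dataset in which income is one of many variables. I want to add a column immediately to the right of the income variable that holds its z-score. I know there is a question here about how to do this for all columns except one or more, but I need to do it for a single column, without replacing the original values. This may be the long way around, but I extracted just the income column and applied the z-score to it. However, I can't figure out how to name the resulting column "Norm_Income" and then put it back into the main DataFrame next to income. Any help is greatly appreciated. Here is what I have (not much, I know):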
## HW Part 3: Standardizing Income Attribute with Z-Score Normalization
Income=pd.DataFrame(bank_df,columns=['income'])
from scipy.stats import zscore
Norm_Income=Income.apply(zscore)
Norm_Income
Edit: This is strange. This worked last night, but now I get an error. Here is my code:
## HW Part 3: Standardizing Income Attribute with Z-Score Normalization
Income=pd.DataFrame(bank_df,columns=['income'])
from scipy.stats import zscore
Income["Norm_Income"] = Income.apply(zscore)
bank_df=bank_df[["id","age","income","Norm_Income","children","gender","region","married","car","savings_acct","current_acct","mortgage","pep"]]
bank_df
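Here is the new error:
Answer 0 (score: 0)
You already have a Series, so putting it into the DataFrame is straightforward. Take a look at "Adding new column to existing DataFrame in Python pandas".
You just need: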
Income["Norm_Income"] = Income.apply(zscore)
in place of your third line.
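One thing to watch: the line above adds the new column to Income, not to bank_df, so selecting "Norm_Income" from bank_df afterwards fails with a KeyError. That may be what the edit in the question ran into, although the error text is not shown. A minimal sketch of assigning the z-scores onto bank_df directly follows; the small bank_df built here is a made-up stand-in, since the real data is not shown.

```python
import pandas as pd
from scipy.stats import zscore

# Hypothetical stand-in for bank_df; the real data is not shown in the question.
bank_df = pd.DataFrame({
    "id": ["ID1", "ID2", "ID3", "ID4"],
    "income": [10000.0, 20000.0, 30000.0, 40000.0],
    "children": [0, 1, 2, 3],
})

# Take the income column as a Series and write its z-scores onto bank_df itself,
# so that later selections from bank_df can find the new column.
bank_df["Norm_Income"] = zscore(bank_df["income"])

print(bank_df.columns.tolist())
```

The new column lands at the far right of bank_df; moving it next to income is a separate reordering step.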
Answer 1 (score: 0)
So, please disregard my comment on the answer. I figured out code that works for my problem.
## HW Part 3: Standardizing Income Attribute with Z-Score Normalization
Income=pd.DataFrame(bank_df,columns=['income'])
from scipy.stats import zscore
bank_df["norm_income"] = Income.apply(zscore)
bank_df["norm_income"]
bank_df=bank_df[["id","age","income","norm_income","children","gender","region","married","car","savings_acct","current_acct","mortgage","pep"]]
bank_df
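As a possible alternative to listing every column by hand, DataFrame.insert can place the new column directly to the right of income. This is only a sketch under assumptions: the small bank_df is a made-up stand-in for the real data, and the z-score is computed with plain pandas arithmetic using ddof=0, which matches the default of scipy.stats.zscore.

```python
import pandas as pd

# Hypothetical stand-in for bank_df; the real data is not shown in the question.
bank_df = pd.DataFrame({
    "id": ["ID1", "ID2", "ID3", "ID4"],
    "age": [48, 40, 51, 23],
    "income": [10000.0, 20000.0, 30000.0, 40000.0],
    "children": [0, 1, 2, 3],
})

income = bank_df["income"]
# Population z-score (ddof=0), the same convention scipy.stats.zscore uses by default.
norm_income = (income - income.mean()) / income.std(ddof=0)

# Find where "income" sits and insert the new column immediately after it.
position = bank_df.columns.get_loc("income") + 1
bank_df.insert(position, "norm_income", norm_income)

print(bank_df.columns.tolist())
```

Note that insert modifies bank_df in place and raises an error if a column named norm_income already exists, so rerunning the cell requires dropping that column first.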