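I am trying to create dummy variables for my categorical variables, but when I create them I get "ValueError: columns overlap but no suffix specified". Here is the code: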
dummy2 = pd.get_dummies(data['Teaching'], prefix='Teach')
dummy2.head()
dummy2.columns = ['Small/Rural','Teaching']
data = data.join(dummy2)
##################
dummy3 = pd.get_dummies(data['Gender'], prefix='Gender_')
dummy3.head()
dummy3.columns = ['Male','Female']
data = data.join(dummy3)
#####################
dummy4 = pd.get_dummies(data['PositionTitle'], prefix='pos_')
dummy4.head()
dummy4.columns = ['Acting Director','RegioReresentative']
data = data.join(dummy4)
#####################
dummy5 = pd.get_dummies(data['Compensation'], prefix='COMP')
dummy5.head()
dummy5.columns = ['23987','46978','89473','248904']
data = data.join(dummy5)
#####################
dummy6 = pd.get_dummies(data['TypeControl'], prefix='Type')
dummy6.head()
dummy6.columns = ['City/country','District','Investor','Non Profit']
data = data.join(dummy6)
Answer 0 (score: 0)
There is a good explanation of how to do this with pd.concat at https://towardsdatascience.com/the-dummys-guide-to-creating-dummy-variables-f21faddb1d40. Adapted to this example, it looks like this:
dummy2 = pd.get_dummies(data['Teaching'], prefix='Teach')
data = pd.concat([data, dummy2], axis=1)
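For context on why the original code fails: DataFrame.join raises this ValueError whenever the two frames share a column name and no lsuffix/rsuffix is given. In the question, renaming the dummy columns to ['Small/Rural','Teaching'] creates a column called 'Teaching' that collides with the original 'Teaching' column in data. Below is a minimal sketch of the problem and the fix. The small DataFrame and its values are made up for illustration, since the asker's real data is not shown.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `data`; the values are invented.
data = pd.DataFrame({
    "Teaching": ["Small/Rural", "Teaching", "Teaching"],
    "Gender": ["M", "F", "M"],
})

# The prefix keeps the new names distinct from the source column.
dummy2 = pd.get_dummies(data["Teaching"], prefix="Teach")

# Reproduce the error: renaming a dummy column to 'Teaching' collides
# with the existing 'Teaching' column, so join() raises ValueError.
renamed = dummy2.copy()
renamed.columns = ["Small/Rural", "Teaching"]
try:
    data.join(renamed)
    join_error = None
except ValueError as exc:
    join_error = str(exc)

# Keeping the prefixed names avoids the clash; join() and concat() both work.
joined = data.join(dummy2)
concatenated = pd.concat([data, dummy2], axis=1)

# Alternative: encode several columns in one call. The listed columns are
# replaced by their dummies, prefixed with the source column name by default.
encoded = pd.get_dummies(data, columns=["Teaching", "Gender"])
```

Note that pd.concat does not check for overlapping names, so if the dummy columns were still renamed to 'Teaching' it would silently produce two columns with the same label. Leaving the prefixed names in place (or choosing new names that do not already exist in data) is the safer choice.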