Python 2 equivalent of get_dummies with a pandas df

Time: 2018-07-10 01:59:37

Tags: python pandas jupyter-notebook

I'm trying to figure out why my code can't select specific column values as dummy columns, using the following sample data:

df

            shop   category  subcategory     season
date                
2013-09-04  abc    weddings  shoes           winter
2013-09-04  def    jewelry   watches         summer
2013-09-05  ghi    sports    sneakers        spring
2013-09-05  jkl    jewelry   necklaces       fall

Here is my basic code:

wedding_df = df[["weddings","winter","summer","spring","fall"]]

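I'm using Python 2 in a notebook, so this may well be a version issue that calls for get_dummies(), but some guidance would help. The idea is to create a dummy dataframe that uses binary values to indicate whether a row has the weddings category and which season it falls in.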

Here is an example of what I'm trying to achieve:

        weddings    winter  summer  spring  fall
71654   1.0         0.0     1.0     0.0     0.0
72168   1.0         0.0     1.0     0.0     0.0
72080   1.0         0.0     1.0     0.0     0.0

corr()

          weddings      fall    spring    summer    winter
weddings       NaN       NaN       NaN       NaN       NaN
fall           NaN  1.000000  0.054019 -0.331866 -0.012122
spring         NaN  0.054019  1.000000 -0.857205  0.072420
summer         NaN -0.331866 -0.857205  1.000000 -0.484578
winter         NaN -0.012122  0.072420 -0.484578  1.000000

1 Answer:

Answer 0 (score: 1)

You can try passing prefix and setting prefix_sep to an empty string; the dummy columns are then named after the raw values, so df[["weddings","winter","summer","spring","fall"]] works:

import pandas as pd

# An empty prefix and separator name each dummy column after its raw value
df = pd.get_dummies(df, prefix='', prefix_sep='')
df
            abc  def  ghi  jkl  jewelry  sports  weddings  necklaces  shoes  \
date                                                                          
2013-09-04    1    0    0    0        0       0         1          0      1   
2013-09-04    0    1    0    0        1       0         0          0      0   
2013-09-05    0    0    1    0        0       1         0          0      0   
2013-09-05    0    0    0    1        1       0         0          1      0   
            sneakers  watches  fall  spring  summer  winter  
date                                                         
2013-09-04         0        0     0       0       0       1  
2013-09-04         0        1     0       0       1       0  
2013-09-05         1        0     0       1       0       0  
2013-09-05         0        0     1       0       0       0  
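
For anyone who wants to reproduce this end to end, here is a minimal sketch. It rebuilds the sample frame from the question by hand (the construction code is my own assumption, since the question only shows the printed frame) and then applies the same get_dummies call. The `.astype(int)` is also my addition: newer pandas releases return boolean dummy columns by default, and the cast keeps the 0/1 display shown above.

```python
import pandas as pd

# Rebuild the sample data from the question; the dates become the index.
df = pd.DataFrame(
    {
        "shop": ["abc", "def", "ghi", "jkl"],
        "category": ["weddings", "jewelry", "sports", "jewelry"],
        "subcategory": ["shoes", "watches", "sneakers", "necklaces"],
        "season": ["winter", "summer", "spring", "fall"],
    },
    index=pd.to_datetime(
        ["2013-09-04", "2013-09-04", "2013-09-05", "2013-09-05"]
    ).rename("date"),
)

# "weddings", "winter", etc. are cell values at this point, not column
# labels, so df[["weddings", "winter", ...]] would raise a KeyError here.

# Empty prefix and separator: each dummy column is named after its value.
dummies = pd.get_dummies(df, prefix="", prefix_sep="").astype(int)

# Now the original selection works.
wedding_df = dummies[["weddings", "winter", "summer", "spring", "fall"]]
print(wedding_df)
```

Note that dropping the prefix is only safe when no value appears in more than one source column; otherwise the result would contain duplicate column names.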

Update

pd.get_dummies(df.loc[df['category']=='weddings',['category','season']],prefix = '', prefix_sep = '' )
Out[820]: 
            weddings  winter
date                        
2013-09-04         1       1
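
Because the update filters to wedding rows before encoding, only the seasons present in those rows get a column (just winter in this sample). If all four season columns are wanted, as in the target layout in the question, one possible variation is to encode first and filter afterwards. This is a sketch of that idea, not something the answer itself proposes; the frame construction and the `.astype(int)` cast are my assumptions.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame(
    {
        "shop": ["abc", "def", "ghi", "jkl"],
        "category": ["weddings", "jewelry", "sports", "jewelry"],
        "subcategory": ["shoes", "watches", "sneakers", "necklaces"],
        "season": ["winter", "summer", "spring", "fall"],
    },
    index=pd.to_datetime(
        ["2013-09-04", "2013-09-04", "2013-09-05", "2013-09-05"]
    ).rename("date"),
)

# Encode category and season on the full frame so every season seen
# anywhere in the data gets its own column.
encoded = pd.get_dummies(
    df[["category", "season"]], prefix="", prefix_sep=""
).astype(int)

# Then keep only the wedding rows and the columns of interest.
wanted = ["weddings", "winter", "summer", "spring", "fall"]
wedding_df = encoded.loc[encoded["weddings"] == 1, wanted]
print(wedding_df)
```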