Question

My data frame looks like this:

    X_categorical   X_continous                        y_variable
0   Gender          Flight Distance                    satisfaction
1   Customer        Inflight wifi service
2   Age             Departure/Arrival time convenient
3   Type of Travel  Ease of Online booking
4   Class           Gate location
5                   Food and drink
6                   Online boarding
7                   Seat comfort
8                   Inflight entertainment
9                   On-board service
10                  Leg room service
11                  Baggage handling
12                  Checkin service
13                  Inflight service
14                  Cleanliness
15                  Departure Delay in Minutes
16                  Arrival Delay in Minutes
17                  id

Now when I do this:

X_categorical=input_variables['X_categorical'].values
X_categorical = X_categorical.tolist()

I get:

['Gender',
 'Customer',
 'Age',
 'Type of Travel',
 'Class',
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan]

How do I remove the NaN entries?

Answer 1

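# Drop the NaN padding first, then convert the column to a list
# (input_variables is the data frame from the question)
X_categorical = input_variables['X_categorical'].dropna()
X_categorical.tolist()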

Output:

['Gender', 'Customer', 'Age', 'Type of Travel', 'Class']
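
The NaN values most likely appear because all columns of a data frame share one index, so shorter columns are padded with NaN up to the length of the longest one. Here is a minimal sketch of that behaviour and of the dropna() fix. The small frame built here is a hypothetical stand-in for input_variables (the real one is not shown in full), and the dictionary comprehension at the end is just one possible way to clean every column at once.

```python
import pandas as pd

# Hypothetical stand-in for input_variables: the columns have different
# lengths, so pandas pads the shorter ones with NaN.
input_variables = pd.DataFrame({
    "X_categorical": pd.Series(["Gender", "Customer", "Age", "Type of Travel", "Class"]),
    "X_continous": pd.Series(["Flight Distance", "Inflight wifi service", "Food and drink",
                              "Online boarding", "Seat comfort", "Cleanliness", "id"]),
    "y_variable": pd.Series(["satisfaction"]),
})

# .values returns the whole column, NaN padding included
with_nan = input_variables["X_categorical"].values.tolist()

# dropna() removes the padding before the list conversion
categorical = input_variables["X_categorical"].dropna().tolist()

# The same idea applied to every column: one clean list per column name
columns = {name: col.dropna().tolist() for name, col in input_variables.items()}

print(len(with_nan))   # length of the longest column
print(categorical)     # only the real entries
```

With this, columns["X_continous"] and columns["y_variable"] can be used directly without any further NaN filtering.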
