Why does pandas display "?" instead of NaN

Time: 2018-11-23 07:10:48

Tags: python pandas data-science

I am learning pandas, and when I display the DataFrame it shows ? instead of NaN. Why does this happen?

Code:

import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data"

df = pd.read_csv(url, header=None)
print(df.head())
headers = ["symboling", "normalized-losses", "make", "fuel-type", "aspiration",
"num-of-doors", "body-style", "drive-wheels", "engine-location",
"wheel-base", "length", "width", "height", "curb-weight",
"engine-type", "num-of-cylinders", "engine-size", "fuel-system",
"bore", "stroke", "compression-ratio", "hoursepower", "peak-rpm",
"city-mpg", "highway-mpg", "price"]

df.columns=headers

print(df.head(30))

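2 Answers:

Answer 0 (score: 2)

The data uses ? to denote missing values, so to convert them to NaN you can pass the na_values parameter. You can also supply the column names as a list through the names parameter of read_csv, which makes the separate assignment to df.columns unnecessary: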

import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data"

headers = ["symboling", "normalized-losses", "make", "fuel-type", "aspiration",
"num-of-doors", "body-style", "drive-wheels", "engine-location",
"wheel-base", "length", "width", "height", "curb-weight",
"engine-type", "num-of-cylinders", "engine-size", "fuel-system",
"bore", "stroke", "compression-ratio", "hoursepower", "peak-rpm",
"city-mpg", "highway-mpg", "price"]

df = pd.read_csv(url, header=None, names=headers, na_values='?')

print(df.head(10))

   symboling  normalized-losses         make fuel-type aspiration  \
0          3                NaN  alfa-romero       gas        std   
1          3                NaN  alfa-romero       gas        std   
2          1                NaN  alfa-romero       gas        std   
3          2              164.0         audi       gas        std   
4          2              164.0         audi       gas        std   
5          2                NaN         audi       gas        std   
6          1              158.0         audi       gas        std   
7          1                NaN         audi       gas        std   
8          1              158.0         audi       gas      turbo   
9          0                NaN         audi       gas      turbo   

  num-of-doors   body-style drive-wheels engine-location  wheel-base   ...     \
0          two  convertible          rwd           front        88.6   ...      
1          two  convertible          rwd           front        88.6   ...      
2          two    hatchback          rwd           front        94.5   ...      
3         four        sedan          fwd           front        99.8   ...      
4         four        sedan          4wd           front        99.4   ...      
5          two        sedan          fwd           front        99.8   ...      
6         four        sedan          fwd           front       105.8   ...      
7         four        wagon          fwd           front       105.8   ...      
8         four        sedan          fwd           front       105.8   ...      
9          two    hatchback          4wd           front        99.5   ...      

   engine-size  fuel-system  bore  stroke compression-ratio hoursepower  \
0          130         mpfi  3.47    2.68               9.0       111.0   
1          130         mpfi  3.47    2.68               9.0       111.0   
2          152         mpfi  2.68    3.47               9.0       154.0   
3          109         mpfi  3.19    3.40              10.0       102.0   
4          136         mpfi  3.19    3.40               8.0       115.0   
5          136         mpfi  3.19    3.40               8.5       110.0   
6          136         mpfi  3.19    3.40               8.5       110.0   
7          136         mpfi  3.19    3.40               8.5       110.0   
8          131         mpfi  3.13    3.40               8.3       140.0   
9          131         mpfi  3.13    3.40               7.0       160.0   

   peak-rpm city-mpg  highway-mpg    price  
0    5000.0       21           27  13495.0  
1    5000.0       21           27  16500.0  
2    5000.0       19           26  16500.0  
3    5500.0       24           30  13950.0  
4    5500.0       18           22  17450.0  
5    5500.0       19           25  15250.0  
6    5500.0       19           25  17710.0  
7    5500.0       19           25  18920.0  
8    5500.0       17           20  23875.0  
9    5500.0       16           22      NaN  

[10 rows x 26 columns]

The use of ? for missing values is documented in the dataset description:

https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.names

  
      
  Missing Attribute Values: (denoted by "?")
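
To see the effect of na_values without downloading anything, here is a minimal sketch on a tiny inline sample. The three rows and the shortened column list are made up for illustration and only mimic the style of imports-85.data; they are not taken from the real file.

```python
import io

import pandas as pd

# Hypothetical three-row sample in the same style as imports-85.data
sample = "3,?,alfa-romero\n2,164,audi\n1,?,audi\n"
cols = ["symboling", "normalized-losses", "make"]

# Without na_values, "?" is read as an ordinary string
raw = pd.read_csv(io.StringIO(sample), header=None, names=cols)

# With na_values="?", the same cells are parsed as missing values
parsed = pd.read_csv(io.StringIO(sample), header=None, names=cols, na_values="?")

print(raw["normalized-losses"].tolist())
print(parsed["normalized-losses"].isna().sum())
print(parsed["normalized-losses"].dtype)
```

Because the "?" cells become NaN during parsing, the column can be read as numeric right away instead of as text.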

Answer 1 (score: 0)

Another solution: if you want to replace ? with NaN after the data has already been read, you can do the following:

import numpy as np

df_new = df.replace({'?': np.nan})
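
One thing to keep in mind with this approach: a column that contained ? was read as text, so the remaining values are still strings after the replacement. A minimal sketch of a follow-up conversion with pd.to_numeric, again on a made-up inline sample rather than the real file:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical three-row sample in the same style as imports-85.data
sample = "3,?,alfa-romero\n2,164,audi\n1,?,audi\n"
cols = ["symboling", "normalized-losses", "make"]

# Read without na_values, so "?" stays in the data as a string
df = pd.read_csv(io.StringIO(sample), header=None, names=cols)

# Replace the placeholder after reading
df_new = df.replace({"?": np.nan})

# The surviving values are still strings such as "164"; convert them to numbers
df_new["normalized-losses"] = pd.to_numeric(df_new["normalized-losses"])

print(df_new["normalized-losses"].dtype)
print(df_new["normalized-losses"].mean())
```

Passing na_values to read_csv avoids this extra step, which is why it is usually the more convenient option when you control how the file is read.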