pandas: replace NaN based on type

Time: 2018-07-23 16:37:32

Tags: python pandas dataframe nan

With DataFrame.to_csv, I managed to write a CSV file with the nan values removed by using:

df = df.replace('None','')
df = df.replace('nan','')

My problem is that this approach replaces every nan value with quotes: ''

Is it possible to replace nan values based on the column type?

if the nan is in an int column: don't add quotes
if str: set to ''
if float: set to 0.0

I tried this code, but it failed:

df['myStringColumn'].replace('None', '')

Edit: here is my sample dataframe

      aTest    Vendor     name     price    qty
 0    y        NewVend             21.20    nan
 1    y        OldMakes            11.20    3
 2    nan      nan        sample   9.20     1
 3    n        nan        make     nan      0

This is my goal

'y','NewVend','',21.20,,     
'y','OldMakes','',11.20,3,
'','','sample',9.20,1,
'n','','make',0.0,0,

Here is the full script

import csv
import os
import sys
import pandas as pd

# `d` is defined elsewhere in the script (not shown here)
dtype_dic = {'price': float, 'qty': float}
df = pd.read_excel(os.path.join(sys.path[0], d.get('csv')), dtype=str)
for col, col_type in dtype_dic.items():
    df[col] = df[col].astype(col_type)
df = df.replace('None','')
df = df.replace('nan','')
df.to_csv('test.csv', index=False, header=False, quotechar='"', quoting=csv.QUOTE_NONNUMERIC)

2 answers:

Answer 0: (score: 2)

You can use select_dtypes to select the columns of the type you want. If the nan values are np.nan, you can then use fillna, which also works for None.

float_cols = df.select_dtypes(include=['float64']).columns
str_cols = df.select_dtypes(include=['object']).columns

df.loc[:, float_cols] = df.loc[:, float_cols].fillna(0)
df.loc[:, str_cols] = df.loc[:, str_cols].fillna('')

You get

    aTest   Vendor      name    price   qty
0   y       NewVend             21.2    0.0
1   y       OldMakes            11.2    3.0
2                       sample  9.2     1.0
3   n                   make    0.0     0.0
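
For reference, here is a self-contained sketch of the same idea applied to the sample frame from the question. It is an illustration, not the exact code above: it selects columns with include="number" / exclude="number" instead of naming 'float64' and 'object', on the assumption that every non-numeric column holds text. The frame is rebuilt by hand from the question's table.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; np.nan marks the missing cells.
df = pd.DataFrame({
    "aTest": ["y", "y", np.nan, "n"],
    "Vendor": ["NewVend", "OldMakes", np.nan, np.nan],
    "name": [np.nan, np.nan, "sample", "make"],
    "price": [21.20, 11.20, 9.20, np.nan],
    "qty": [np.nan, 3, 1, 0],
})

# Split the columns by kind (assumption: non-numeric columns hold text).
num_cols = df.select_dtypes(include="number").columns
text_cols = df.select_dtypes(exclude="number").columns

# Fill each group with a value that fits its type.
df[num_cols] = df[num_cols].fillna(0.0)
df[text_cols] = df[text_cols].fillna("")

print(df)
```

Note that df.replace('nan', '') only matches cells holding the literal string 'nan'; it does not touch a real missing value such as np.nan, which is what fillna targets.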

Answer 1: (score: 1)

Try this!

data = pd.read_csv("dataset/pokemon.csv")
data.head(7)


#   Name    Type 1  Type 2  HP  Attack  Defense Sp. Atk Sp. Def Speed   Generation  Legendary
0   1   Bulbasaur   Grass   Poison  45  49  49  65  65  45  1   False
1   2   Ivysaur Grass   Poison  60  62  63  80  80  60  1   False
2   3   Venusaur    Grass   Poison  80  82  83  100 100 80  1   False
3   4   Mega Venusaur   Grass   Poison  80  100 123 122 120 80  1   False
4   5   Charmander  Fire    NaN 39  52  43  60  50  65  1   False
5   6   Charmeleon  Fire    NaN 58  64  58  80  65  80  1   False
6   7   Charizard   Fire    Flying  78  84  78  109 85  100 1   False

To find the string-type columns:

types = list(data.iloc[0])  # values of the first row

str_types = []
for i, a in enumerate(types):
    if type(a) == str:
        str_types.append(i)

print(str_types)
[1, 2, 3] #columns with string values
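
One caveat with inspecting only the first row: a text column whose first value is NaN is missed, because NaN is a float. A possible alternative, sketched below, checks each column's dtype instead. The small frame reuses two rows and a subset of columns from the table above, reordered so the missing value comes first; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Two rows from the table above, with the NaN row placed first.
data = pd.DataFrame({
    "#": [5, 1],
    "Name": ["Charmander", "Bulbasaur"],
    "Type 1": ["Fire", "Grass"],
    "Type 2": [np.nan, "Poison"],
    "HP": [39, 45],
    "Legendary": [False, False],
})

# First-row inspection: "Type 2" is skipped because its first value is NaN.
first_row_positions = [
    i for i, value in enumerate(data.iloc[0]) if isinstance(value, str)
]

# Dtype inspection: keep every column that is not numeric
# (is_numeric_dtype also treats bool columns as numeric).
str_positions = [
    i for i, col in enumerate(data.columns) if not is_numeric_dtype(data[col])
]

print(first_row_positions)
print(str_positions)
```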

Replace NaN with " ":

for a in str_types:
    # assign the result back; inplace=True on an iloc selection may not update data
    data.iloc[:, a] = data.iloc[:, a].fillna(" ")

data


#   Name    Type 1  Type 2  HP  Attack  Defense Sp. Atk Sp. Def Speed   Generation  Legendary
0   1   Bulbasaur   Grass   Poison  45  49  49  65  65  45  1   False
1   2   Ivysaur Grass   Poison  60  62  63  80  80  60  1   False
2   3   Venusaur    Grass   Poison  80  82  83  100 100 80  1   False
3   4   Mega Venusaur   Grass   Poison  80  100 123 122 120 80  1   False
4   5   Charmander  Fire        39  52  43  60  50  65  1   False
5   6   Charmeleon  Fire        58  64  58  80  65  80  1   False
6   7   Charizard   Fire    Flying  78  84  78  109 85  100 1   False
7   8   Mega Charizard X    Fire    Dragon  78  130 111 130 85  100 1   False
8   9   Mega Charizard Y    Fire    Flying  78  104 78  159 115 100 1   False
9   10  Squirtle    Water       44  48  65  50  64  43  1   False
10  11  Wartortle   Water       59  63  80  65  80  58  1   False