Save a pandas DataFrame but keep NA values

Time: 2016-04-04 14:51:36

Tags: python csv pandas nan na

I have this code:

import pandas as pd
import numpy as np
import csv
df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 5 + [True] * 2})

I replaced the weights with NA values:

df['weight'] = np.nan

Finally, I saved it to a CSV file with to_csv.

But when I read the file back, I get "" instead of NA. I want NA written in place of NaN, so the output I want has NA in the weight column.

2 answers:

Answer 0 (score: 10)

If you want a string to represent NaN values, pass na_rep to to_csv:

In [8]:
df.to_csv(na_rep='NA')

Out[8]:
',adult,animal,size,weight\n0,False,cat,S,NA\n1,False,dog,S,NA\n2,False,cat,M,NA\n3,False,fish,M,NA\n4,False,dog,M,NA\n5,True,cat,L,NA\n6,True,cat,L,NA\n'
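To make the effect of na_rep concrete, here is a minimal sketch on a smaller, made-up frame (the two-row data and the variable names are just for illustration, not from the question). It compares the default output with the na_rep output and then reads the text back, assuming read_csv's default NA handling is left switched on:

```python
import io

import numpy as np
import pandas as pd

# Small illustrative frame; every weight is missing.
df = pd.DataFrame({"animal": ["cat", "dog"], "weight": [np.nan, np.nan]})

# Default behaviour: a missing value becomes an empty field.
default_text = df.to_csv(index=False)

# With na_rep, each missing value is written as the given string instead.
na_text = df.to_csv(index=False, na_rep="NA")

# With default settings, read_csv treats the string "NA" as missing again,
# so the round trip gives back NaN in the weight column.
round_trip = pd.read_csv(io.StringIO(na_text))

print(default_text)
print(na_text)
print(round_trip)
```

So the NA marker only exists in the file itself; once the file is loaded back with the defaults, the column holds NaN again.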

If you want NA wrapped in quotes, put the quote characters into the na_rep string itself:

In [3]:
df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
               'size': list('SSMMMLL'),
               'weight': [8, 10, 11, 1, 20, 12, 12],
               'adult' : [False] * 5 + [True] * 2})
df['weight'] = np.nan  # np.NaN was an alias and is gone in NumPy 2.0
df.to_csv(na_rep="'NA'")

Out[3]:
",adult,animal,size,weight\n0,False,cat,S,'NA'\n1,False,dog,S,'NA'\n2,False,cat,M,'NA'\n3,False,fish,M,'NA'\n4,False,dog,M,'NA'\n5,True,cat,L,'NA'\n6,True,cat,L,'NA'\n"

Edit

To get the desired output, use the following parameters:

In [27]:
df.to_csv(na_rep='NA', sep=';', index=False, quoting=3)  # 3 == csv.QUOTE_NONE

Out[27]:
'adult;animal;size;weight\nFalse;cat;S;NA\nFalse;dog;S;NA\nFalse;cat;M;NA\nFalse;fish;M;NA\nFalse;dog;M;NA\nTrue;cat;L;NA\nTrue;cat;L;NA\n'
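The bare quoting=3 is a bit cryptic. It is the integer value of csv.QUOTE_NONE from the standard library, so the named constant should give the same result and is easier to read. A rough sketch on a made-up three-column frame (data and variable names are illustrative only):

```python
import csv

import numpy as np
import pandas as pd

# Small illustrative frame; every weight is missing.
df = pd.DataFrame({"animal": ["cat", "dog"],
                   "size": ["S", "M"],
                   "weight": [np.nan, np.nan]})

# The csv module defines QUOTE_NONE as the integer 3.
print(csv.QUOTE_NONE)

# Same call written with the magic number and with the named constant.
by_number = df.to_csv(na_rep="NA", sep=";", index=False, quoting=3)
by_name = df.to_csv(na_rep="NA", sep=";", index=False, quoting=csv.QUOTE_NONE)

print(by_name)
print(by_number == by_name)
```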

Answer 1 (score: 4)

To get that specific output, you'll have to pass the quotes in explicitly.

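import csv
import pandas as pd

# Embed the double quotes in the string values themselves.
df = pd.DataFrame({'animal': '"cat" "dog" "cat" "fish" "dog" "cat" "cat"'.split(),
                   'size': '"S" "S" "M" "M" "M" "L" "L"'.split(),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 5 + [True] * 2})
df['weight'] = 'NA'  # the literal string, so it is written as-is
# QUOTE_NONE keeps the writer from adding quotes of its own.
df.to_csv("ejemplo.csv", sep=';', decimal=',', quoting=csv.QUOTE_NONE, index=False)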
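If the frame already exists and you'd rather not retype the values with quotes, the same idea can probably be applied after the fact. This is only a sketch under a few assumptions: the frame is a small made-up one, the list of text columns is chosen by hand, and the weights stay NaN with na_rep supplying the NA text:

```python
import csv

import numpy as np
import pandas as pd

# Small illustrative frame; every weight is missing.
df = pd.DataFrame({"animal": ["cat", "dog"],
                   "size": ["S", "M"],
                   "weight": [np.nan, np.nan]})

# Work on a copy so the original values stay unquoted in memory.
out = df.copy()

# Hand-picked text columns: wrap each value in literal double quotes.
for col in ["animal", "size"]:
    out[col] = '"' + out[col] + '"'

# QUOTE_NONE keeps the writer from adding quotes of its own,
# and na_rep writes the missing weights as a bare NA.
text = out.to_csv(sep=";", na_rep="NA", quoting=csv.QUOTE_NONE, index=False)
print(text)
```

The trade-off is that the quotes become part of the data in the copy, so this is only reasonable as a last step right before writing.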