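I have a dictionary populated with data from two files I imported, but some of the values come through as NaN. How do I remove the entries that contain NaN?
My code is: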
import matplotlib.pyplot as plt
from pandas import Timestamp
import numpy as np
from datetime import datetime
import pandas as pd
import collections
# Raw strings keep the backslashes in the Windows paths from being read as escape sequences
orangebook = pd.read_csv(r'C:\Users\WEGWEIS_JAKE\Desktop\Work Programs\Code Files\products2.txt',sep='~', parse_dates=['Approval_Date'])
specificdrugs=pd.read_csv(r'C:\Users\WEGWEIS_JAKE\Desktop\Work Programs\Code Files\Drugs.txt',sep=',')
"""This is a dictionary that collects data from the .txt file
This dictionary has a key,value pair for every generic name with its corresponding approval date """
drugdict={}
for d in specificdrugs['Generic Name']:
    drugdict.dropna()
    drugdict[d]=orangebook[orangebook.Ingredient==d.upper()]['Approval_Date'].min()
What should I add to or remove from this code to make sure the dictionary contains no key-value pairs whose value is NaN?
Answer 0 (score: 16)
from math import isnan
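If the NaNs are stored as keys:
# functional (filter the items so the result is still a dict, not just the keys)
clean_dict = dict(filter(lambda item: not isnan(item[0]), my_dict.items()))
# dict comprehension
clean_dict = {k: my_dict[k] for k in my_dict if not isnan(k)}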
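If the NaNs are stored as values:
# functional (filter the items so the result is still a dict, not just the keys)
clean_dict = dict(filter(lambda item: not isnan(item[1]), my_dict.items()))
# dict comprehension
clean_dict = {k: my_dict[k] for k in my_dict if not isnan(my_dict[k])}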
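One caveat with the snippets above: math.isnan raises TypeError for values that are not numbers (strings, None), which a dictionary built from file data may well contain. A minimal sketch that guards against that, assuming the NaNs are ordinary float NaNs; the helper name is_nan and the sample data are made up for illustration:

```python
from math import isnan


def is_nan(value):
    """Return True only for a float NaN; any other type counts as not-NaN."""
    return isinstance(value, float) and isnan(value)


# Hypothetical data mixing floats, a NaN, a string and None.
my_dict = {
    "aspirin": 1.5,
    "ibuprofen": float("nan"),
    "note": "n/a",
    "missing": None,
}

# Keep every pair whose value is not a float NaN.
clean_dict = {k: v for k, v in my_dict.items() if not is_nan(v)}
print(clean_dict)
```

If the values are pandas objects (for example NaT from a date column), pandas.isna is probably the more appropriate check than math.isnan.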
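Answer 1 (score: 2)
Use simplejson:
import simplejson
# ignore_nan=True serializes NaN as null, so those values come back as None
clean_dict = simplejson.loads(simplejson.dumps(my_dict, ignore_nan=True))
## or, depending on your needs: allow_nan=False raises ValueError if a NaN is present
clean_dict = simplejson.loads(simplejson.dumps(my_dict, allow_nan=False))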
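Python's built-in json module accepts the same allow_nan flag, so the second variant can be tried without installing anything. With allow_nan=False, serialization raises ValueError when it meets NaN or infinity, so this form detects bad values rather than removing them. A small sketch; the function name and sample data are hypothetical:

```python
import json


def has_non_finite(d):
    """Return True if d contains NaN or +/-infinity somewhere json can see it."""
    try:
        json.dumps(d, allow_nan=False)
    except ValueError:
        # Raised for NaN, inf and -inf when allow_nan is False.
        return True
    return False


my_dict = {"a": 1.0, "b": float("nan")}
print(has_non_finite(my_dict))
print(has_non_finite({"a": 1.0}))
```

This only works for data that json can serialize at all; a dictionary holding pandas Timestamps would need a different check.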
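Answer 2 (score: 1)
Rather than trying to remove the NaNs from the dictionary, you should investigate why the NaNs got there in the first place.
Working with NaN in a dictionary is tricky because NaN is not equal to itself.
See this for more information: NaNs as key in dictionaries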
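To make the "not equal to itself" point concrete, here is a short illustration using plain Python floats (the variable names are arbitrary):

```python
nan = float("nan")

# NaN never compares equal, not even to itself.
print(nan == nan)  # False

d = {nan: "first"}

# Looking up the very same object works, because dict lookup checks identity first.
print(nan in d)  # True

# A different NaN object is treated as a different key.
print(float("nan") in d)  # False

# So a second NaN key is added next to the first instead of replacing it.
d[float("nan")] = "second"
print(len(d))  # 2
```

This is why a comparison such as `value == float("nan")` can never find the NaN entries, and a function like math.isnan is needed instead.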
Answer 3 (score: 0)
I know this is old, but this worked for me and it is simple: drop the NaNs up front, right when reading the CSV:
orangebook = pd.read_csv(r'C:\Users\WEGWEIS_JAKE\Desktop\Work Programs\Code Files\products2.txt',sep='~', parse_dates=['Approval_Date']).dropna()
I also like to convert to a dictionary at the same time:
orangebook = pd.read_csv(r'C:\Users\WEGWEIS_JAKE\Desktop\Work Programs\Code Files\products2.txt',sep='~', parse_dates=['Approval_Date']).dropna().to_dict()
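Note that dropna() with no arguments drops every row that has a missing value in any column, and to_dict() on a DataFrame returns a nested mapping of column to {index: value} rather than one date per drug. A sketch of a more targeted variant, assuming the columns are named Ingredient and Approval_Date as in the question; the in-memory sample data and the drug names are invented for illustration:

```python
import io

import pandas as pd

# Hypothetical stand-in for products2.txt: '~'-separated, one row has no date.
raw = io.StringIO(
    "Ingredient~Approval_Date\n"
    "ASPIRIN~1990-01-05\n"
    "ASPIRIN~1985-03-01\n"
    "IBUPROFEN~\n"
)
orangebook = pd.read_csv(raw, sep="~", parse_dates=["Approval_Date"])

# Drop only rows whose approval date is missing, then take the earliest
# date per ingredient and turn the resulting Series into a plain dict.
earliest = (
    orangebook.dropna(subset=["Approval_Date"])
    .groupby("Ingredient")["Approval_Date"]
    .min()
    .to_dict()
)

# Hypothetical list of generic names, standing in for Drugs.txt.
generic_names = ["Aspirin", "Ibuprofen", "Codeine"]

# Only drugs with a known approval date end up in the dictionary.
drugdict = {
    name: earliest[name.upper()]
    for name in generic_names
    if name.upper() in earliest
}
print(drugdict)
```

Because drugs without a date never enter the dictionary, there is nothing to clean out afterwards.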