TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced

Asked: 2018-10-05 01:45:10

Tags: python pandas numpy numpy-ufunc

I am trying to convert a csv into a numpy array. In the numpy array I replace some elements with NaN, and then I want to find the indices of those NaN elements. The code is:

    import pandas as pd
    import matplotlib.pyplot as plyt
    import numpy as np

    filename = 'wether.csv'

    df = pd.read_csv(filename,header = None )

    list = df.values.tolist()
    labels = list[0]
    wether_list = list[1:]

    year = []
    month = []
    day = []
    max_temp = []

    for i in wether_list:
        year.append(i[1])
        month.append(i[2])
        day.append(i[3])
        max_temp.append(i[5])

    mid = len(max_temp) // 2
    temps = np.array(max_temp[mid:])
    temps[np.where(np.array(temps) == -99.9)] = np.nan
    plyt.plot(temps,marker = '.',color = 'black',linestyle = 'none')
    # plyt.show()

    print(np.where(np.isnan(temps))[0])
    # print(len(pd.isnull(np.array(temps))))

When I run this, I get a warning and an error. The warning is:

    wether.py:26: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
      temps[np.where(np.array(temps) == -99.9)] = np.nan

The error is:

    Traceback (most recent call last):
      File "wether.py", line 30, in <module>
        print(np.where(np.isnan(temps))[0])
    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

This is part of the dataset I am using:

83168,2014,9,7,0.00000,89.00000,78.00000, 83.50000
83168,2014,9,22,1.62000,90.00000,72.00000, 81.00000
83168,2014,9,23,0.50000,87.00000,74.00000, 80.50000
83168,2014,9,24,0.35000,82.00000,73.00000, 77.50000
83168,2014,9,25,0.60000,85.00000,75.00000, 80.00000
83168,2014,9,26,0.76000,89.00000,77.00000, 83.00000
83168,2014,9,27,0.00000,89.00000,79.00000, 84.00000
83168,2014,9,28,0.00000,90.00000,81.00000, 85.50000
83168,2014,9,29,0.00000,90.00000,79.00000, 84.50000
83168,2014,9,30,0.50000,89.00000,75.00000, 82.00000
83168,2014,10,1,0.02000,91.00000,75.00000, 83.00000
83168,2014,10,2,0.03000,93.00000,77.00000, 85.00000
83168,2014,10,3,1.40000,93.00000,75.00000, 84.00000
83168,2014,10,4,0.06000,89.00000,75.00000, 82.00000
83168,2014,10,5,0.22000,91.00000,68.00000, 79.50000
83168,2014,10,6,0.00000,84.00000,68.00000, 76.00000
83168,2014,10,7,0.17000,85.00000,73.00000, 79.00000
83168,2014,10,8,0.06000,84.00000,73.00000, 78.50000
83168,2014,10,9,0.00000,87.00000,73.00000, 80.00000
83168,2014,10,10,0.00000,88.00000,80.00000, 84.00000
83168,2014,10,11,0.00000,87.00000,80.00000, 83.50000
83168,2014,10,12,0.00000,88.00000,80.00000, 84.00000
83168,2014,10,13,0.00000,88.00000,81.00000, 84.50000
83168,2014,10,14,0.04000,88.00000,77.00000, 82.50000
83168,2014,10,15,0.00000,88.00000,77.00000, 82.50000
83168,2014,10,16,0.09000,89.00000,72.00000, 80.50000
83168,2014,10,17,0.00000,85.00000,67.00000, 76.00000
83168,2014,10,18,0.00000,84.00000,65.00000, 74.50000
83168,2014,10,19,0.00000,84.00000,65.00000, 74.50000
83168,2014,10,20,0.00000,85.00000,69.00000, 77.00000
83168,2014,10,21,0.77000,87.00000,76.00000, 81.50000
83168,2014,10,22,0.69000,81.00000,71.00000, 76.00000
83168,2014,10,23,0.31000,82.00000,72.00000, 77.00000
83168,2014,10,24,0.71000,79.00000,73.00000, 76.00000
83168,2014,10,25,0.00000,81.00000,68.00000, 74.50000
83168,2014,10,26,0.00000,82.00000,67.00000, 74.50000
83168,2014,10,27,0.00000,83.00000,64.00000, 73.50000
83168,2014,10,28,0.00000,83.00000,66.00000, 74.50000
83168,2014,10,29,0.03000,86.00000,76.00000, 81.00000
83168,2014,10,30,0.00000,85.00000,69.00000, 77.00000
83168,2014,10,31,0.00000,85.00000,69.00000, 77.00000
83168,2014,11,1,0.00000,86.00000,59.00000, 72.50000
83168,2014,11,2,0.00000,77.00000,52.00000, 64.50000
83168,2014,11,3,0.00000,70.00000,52.00000, 61.00000
83168,2014,11,4,0.00000,77.00000,59.00000, 68.00000
83168,2014,11,5,0.02000,79.00000,73.00000, 76.00000
83168,2014,11,6,0.02000,82.00000,75.00000, 78.50000
83168,2014,11,7,0.00000,83.00000,66.00000, 74.50000
83168,2014,11,8,0.00000,84.00000,65.00000, 74.50000
83168,2014,11,9,0.00000,84.00000,65.00000, 74.50000
83168,2014,11,10,1.20000,72.00000,65.00000, 68.50000
83168,2014,11,11,0.08000,77.00000,61.00000, 69.00000
83168,2014,11,12,0.00000,80.00000,61.00000, 70.50000
83168,2014,11,13,0.00000,83.00000,63.00000, 73.00000
83168,2014,11,14,0.00000,83.00000,65.00000, 74.00000
83168,2014,11,15,0.00000,82.00000,64.00000, 73.00000
83168,2014,11,16,0.00000,83.00000,64.00000, 73.50000
83168,2014,11,17,0.07000,84.00000,64.00000, 74.00000
83168,2014,11,18,0.00000,86.00000,71.00000, 78.50000
83168,2014,11,19,0.57000,78.00000,55.00000, 66.50000
83168,2014,11,20,0.05000,72.00000,56.00000, 64.00000
83168,2014,11,21,0.05000,77.00000,63.00000, 70.00000
83168,2014,11,22,0.22000,77.00000,69.00000, 73.00000
83168,2014,11,23,0.06000,79.00000,76.00000, 77.50000
83168,2014,11,24,0.02000,84.00000,78.00000, 81.00000
83168,2014,11,25,0.00000,86.00000,78.00000, 82.00000
83168,2014,11,26,0.07000,85.00000,77.00000, 81.00000
83168,2014,11,27,0.21000,82.00000,55.00000, 68.50000
83168,2014,11,28,0.00000,73.00000,53.00000, 63.00000
83168,2015,1,8,0.00000,80.00000,57.00000,
83168,2015,1,9,0.05000,72.00000,56.00000,
83168,2015,1,10,0.00000,72.00000,57.00000,
83168,2015,1,11,0.00000,80.00000,57.00000,
83168,2015,1,12,0.05000,80.00000,59.00000,
83168,2015,1,13,0.85000,81.00000,69.00000,
83168,2015,1,14,0.05000,81.00000,68.00000,
83168,2015,1,15,0.00000,81.00000,64.00000,
83168,2015,1,16,0.00000,78.00000,63.00000,
83168,2015,1,17,0.00000,73.00000,55.00000,
83168,2015,1,18,0.00000,76.00000,55.00000,
83168,2015,1,19,0.00000,78.00000,55.00000,
83168,2015,1,20,0.00000,75.00000,56.00000,
83168,2015,1,21,0.02000,73.00000,65.00000,
83168,2015,1,22,0.00000,80.00000,64.00000,
83168,2015,1,23,0.00000,80.00000,71.00000,
83168,2015,1,24,0.00000,79.00000,72.00000,
83168,2015,1,25,0.00000,79.00000,49.00000,
83168,2015,1,26,0.00000,79.00000,49.00000,
83168,2015,1,27,0.10000,75.00000,53.00000,
83168,2015,1,28,0.00000,68.00000,53.00000,
83168,2015,1,29,0.00000,69.00000,53.00000,
83168,2015,1,30,0.00000,72.00000,60.00000,
83168,2015,1,31,0.00000,76.00000,58.00000,
83168,2015,2,1,0.00000,76.00000,58.00000,
83168,2015,2,2,0.05000,77.00000,58.00000,
83168,2015,2,3,0.00000,84.00000,56.00000,
83168,2015,2,4,0.00000,76.00000,56.00000,

I cannot get past this error. How do I get rid of the warning about line 26, and how do I fix the error? Please help! Thanks in advance!

Update: When I try the same thing in a different way, reading the dataset from the file directly instead of converting it to a dataframe, I do not get the error. Why is that? The code is:

    weather_filename = 'wether.csv'
    weather_file = open(weather_filename)
    weather_data = weather_file.read()
    weather_file.close()

    # Break the weather records into lines
    lines = weather_data.split('\n')
    labels = lines[0]
    values = lines[1:]
    n_values = len(values)

    # Break the list of comma-separated value strings
    # into lists of values.
    year = []
    month = []
    day = []
    max_temp = []
    j_year = 1
    j_month = 2
    j_day = 3
    j_max_temp = 5

    for i_row in range(n_values):
        split_values = values[i_row].split(',')
        if len(split_values) >= j_max_temp:
            year.append(int(split_values[j_year]))
            month.append(int(split_values[j_month]))
            day.append(int(split_values[j_day]))
            max_temp.append(float(split_values[j_max_temp]))

    # Isolate the recent data.
    i_mid = len(max_temp) // 2
    temps = np.array(max_temp[i_mid:])
    year = year[i_mid:]
    month = month[i_mid:]
    day = day[i_mid:]
    temps[np.where(temps == -99.9)] = np.nan

    # Remove all the nans.
    # Trim both ends and fill nans in the middle.
    # Find the first non-nan.
    i_start = np.where(np.logical_not(np.isnan(temps)))[0][0]
    temps = temps[i_start:]
    year = year[i_start:]
    month = month[i_start:]
    day = day[i_start:]
    i_nans = np.where(np.isnan(temps))[0]
    print(i_nans)

What is wrong with the first version, and why does the second one not even raise a warning? Please help!

3 Answers:

Answer 0 (score: 0)

What is the dtype of temps? I can reproduce your warning and error with a string dtype:

    In [26]: temps = np.array([1,2,'string',0])
    In [27]: temps
    Out[27]: array(['1', '2', 'string', '0'], dtype='<U21')
    In [28]: temps==-99.9
    /usr/local/bin/ipython3:1: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
      #!/usr/bin/python3
    Out[28]: False
    In [29]: np.isnan(temps)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-29-2ff7754ed926> in <module>()
    ----> 1 np.isnan(temps)

    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

First, comparing strings against a number produces this FutureWarning.

Second, testing for nan raises the error.

Also note that, given this dtype, the nan assignment assigns a string value, not a float (np.nan is a float):

    In [30]: temps[-1] = np.nan
    In [31]: temps
    Out[31]: array(['1', '2', 'string', 'nan'], dtype='<U21')
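
A minimal sketch of one way to avoid this in the question's first script, assuming the values really are numeric strings (likely, since the file's header row is read as data when header=None is passed, which keeps the whole column as strings): cast max_temp to float before the -99.9 comparison, so both the comparison and np.isnan operate on a numeric array. The example values below are made up.

    import numpy as np

    # Hypothetical values mimicking what the first script ends up with:
    # numeric strings plus the -99.9 missing-data sentinel.
    max_temp = ['89.0', '90.0', '-99.9', '87.0']

    # Convert to a float array up front; this raises ValueError if any
    # entry is not a valid number, which also catches stray header text.
    temps = np.asarray(max_temp, dtype=float)

    temps[temps == -99.9] = np.nan       # elementwise comparison now works
    print(np.where(np.isnan(temps))[0])  # -> [2]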

Answer 1 (score: 0)

Note: this answer is only loosely related to the question, but it matches the title, because the same error is raised when working with the Decimal type.

I ran into the same error while working with Decimal values. For some reason, one column of the dataframe I was working with consisted of Decimal objects. For example, calling .unique() on that column returned:

    [Decimal('0'), Decimal('95'), Decimal('38'), Decimal('25'),
     Decimal('42'), Decimal('11'), Decimal('18'), Decimal('22'),
    .....Decimal('220'), Decimal('724')]

The traceback of the error showed that it failed inside a call to some numpy function. I managed to reproduce the error by taking the min and max values of the array above:

    from decimal import Decimal

    xmin, xmax = Decimal('0'), Decimal('724')
    np.isnan([xmin, xmax])

which raises the error:

    TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

In this case, the solution was to convert all of those values to int:

    df.astype({col:int for col in desired_columns_to_convert})
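
A small sketch of why that conversion fixes it, assuming float is acceptable instead of int (for example when the Decimal column can hold fractional values): once the Decimal objects are cast to a float array, np.isnan works again.

    from decimal import Decimal
    import numpy as np

    vals = [Decimal('0'), Decimal('724')]

    # np.isnan(np.array(vals)) would raise the TypeError (object dtype);
    # casting to float first avoids it.
    print(np.isnan(np.array(vals, dtype=float)))  # -> [False False]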

Answer 2 (score: 0)

isnan(ndarray) fails when the ndarray's dtype is 'object'.

isnan(ndarray.astype(np.float)) works, but strings cannot be coerced to float.
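
A minimal illustration of both points, using a made-up object array (plain float is used here instead of the deprecated np.float alias):

    import numpy as np

    obj = np.array([1.0, 2.0, np.nan], dtype=object)
    # np.isnan(obj)             # TypeError: ufunc 'isnan' not supported ...
    print(np.isnan(obj.astype(float)))   # -> [False False  True]

    bad = np.array(['1.0', 'oops'], dtype=object)
    # bad.astype(float)         # ValueError: could not convert string to float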