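I am loading a local CSV file containing data. I am trying to find the smallest float in a column that mixes NaNs and numbers. I tried the NumPy function np.nanmin, but it throws:
"TypeError: '<=' not supported between instances of 'str' and 'float'"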
import numpy as np
import pandas as pd

database = pd.read_csv('database.csv', quotechar='"', skipinitialspace=True, delimiter=',')
coun_weight = database[['Country of Operator/Owner', 'Launch Mass (Kilograms)']]
print(coun_weight)
lightest = np.nanmin(coun_weight['Launch Mass (Kilograms)'])
Any suggestions as to why nanmin might not be working? Thanks in advance!
Link to the full CSV file: http://www.sharecsv.com/s/5aea6381d1debf75723a45aacd40abf8/database.csv
Here is a sample of my coun_weight:
Country of Operator/Owner Launch Mass (Kilograms)
1390 China NaN
1391 China 1040
1392 China 1040
1393 China 2700
1394 China 2700
1395 China 1800
1396 China 2700
1397 China NaN
1398 China NaN
1399 China NaN
1400 China NaN
1401 India 92
1402 Russia 45
1403 South Africa 1
1404 China NaN
1405 China 4
1406 China 4
1407 China 12
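Answer 0 (score: 1)
Explicitly converting the column to float reveals the problem: the column contains the string "5,000+", which cannot be converted to float64.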
coun_weight['Launch Mass (Kilograms)'].astype('float64')
Result:
ValueError: invalid literal for float(): 5,000+
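The original TypeError can be reproduced without the CSV. The sketch below uses a small made-up Series (the values and the name mass are assumptions for illustration, not data from database.csv). Because the column holds strings next to NaN, it has object dtype, and np.nanmin ends up comparing a str with a float:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: numeric-looking strings, a NaN, and the stray "5,000+".
# dtype=object mirrors how read_csv stores a column it cannot parse as numbers.
mass = pd.Series(["1040", np.nan, "5,000+", "92"], dtype=object)

try:
    np.nanmin(mass)
    failed = False
except TypeError as exc:
    # The strings cannot be ordered against the float that stands in for NaN.
    failed = True
    print("TypeError:", exc)
```

So np.nanmin itself is not at fault; the column has to be numeric before any minimum can be taken.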
Answer 1 (score: 1)
I tested it, and all of the problematic values are:
coun_weight = pd.read_csv('database.csv')
print (coun_weight.loc[pd.to_numeric(coun_weight['Launch Mass (Kilograms)'], errors='coerce').isnull(), 'Launch Mass (Kilograms)'].dropna())
1091 5,000+
1092 5,000+
1093 5,000+
1094 5,000+
1096 5,000+
Name: Launch Mass (Kilograms), dtype: object
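The filter above can be unpacked into separate steps. This is a minimal sketch on a made-up Series (col, parsed and bad are illustrative names, and the values are assumptions): a value is "problematic" if it is present in the original column but turns into NaN when parsed with errors='coerce'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the 'Launch Mass (Kilograms)' column.
col = pd.Series(["1040", np.nan, "5,000+", "92"], dtype=object)

# Unparseable strings become NaN instead of raising an error.
parsed = pd.to_numeric(col, errors="coerce")

# Keep only entries that failed to parse but were not missing to begin with.
bad = col[parsed.isna() & col.notna()]
print(bad)
```

Here only the row holding "5,000+" is reported; the genuine NaN is excluded, which is what the dropna() call does in the one-liner.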
The solution is:
coun_weight['Launch Mass (Kilograms)'] = coun_weight['Launch Mass (Kilograms)'].replace('5,000+', 5000).astype(float)
print (coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])
1091 5000.0
1092 5000.0
1093 5000.0
1094 5000.0
1095 NaN
1096 5000.0
1097 6500.0
Name: Launch Mass (Kilograms), dtype: float64
Then, to find the minimum, use Series.min, which skips NaNs:
print (coun_weight['Launch Mass (Kilograms)'].min())
0.0
Check whether any 0 values are in the column:
a = coun_weight['Launch Mass (Kilograms)']
print (a[a == 0])
912 0.0
Name: Launch Mass (Kilograms), dtype: float64
Another possible solution is to replace these values with NaNs:
coun_weight['Launch Mass (Kilograms)'] = pd.to_numeric(coun_weight['Launch Mass (Kilograms)'], errors='coerce')
print (coun_weight['Launch Mass (Kilograms)'].iloc[1091:1098])
1091 NaN
1092 NaN
1093 NaN
1094 NaN
1095 NaN
1096 NaN
1097 6500.0
Name: Launch Mass (Kilograms), dtype: float64
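If other values of the same shape might appear (for example "1,040" or "6,000+"), a more general cleanup is to strip the thousands separator and the trailing plus sign before parsing. This is only a sketch on made-up data: the sample values are assumptions, and reading "5,000+" as 5000 is a judgment call, the same one made in the replace-based solution.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; every entry is either a string or NaN.
raw = pd.Series(["1,040", np.nan, "5,000+", "92", "0"], dtype=object)

# Remove commas and plus signs, then parse; anything still unparseable becomes NaN.
cleaned = pd.to_numeric(
    raw.str.replace(r"[,+]", "", regex=True),
    errors="coerce",
)

lightest = cleaned.min()                         # Series.min skips NaN
lightest_nonzero = cleaned[cleaned > 0].min()    # ignore 0 entries if they are placeholders
print(lightest, lightest_nonzero)
```

Once the column is float64, np.nanmin(cleaned) works as well and gives the same result as cleaned.min().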