Min/max range of a pandas DataFrame

Time: 2020-10-01 08:06:04

Tags: python pandas numpy dataframe

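Thanks in advance for your help! (Code below.) The data is here: Link

I am trying to add two more columns to my DataFrame that represent the range of the Topsoil column, just as mean['maxx20'] = maxx['20 cm'] and mean['minn20'] = minn['20 cm'] do for the 20 cm column.

I tried to do this by adding the following: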

mean['topsoilMax']=maxx['Topsoil']
mean['topsoilMin']=minn['Topsoil']

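Instead of adding the extra columns as I hoped, this raises KeyError: 'Topsoil', even though Topsoil is already a column in the DataFrame, just as 20 cm was when I added its range.

Why does this error occur, and what is the correct way to add these columns?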

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

#Importing data, creating a copy, and assigning it to a variable
raw_data = pd.read_csv('all-deep-soil-temperatures.csv', index_col=1, parse_dates=True)
df_all_stations = raw_data.copy()

#Selecting the rows for the station of the user's choice
selected_soil_station = 'Minot'
df_selected_station = df_all_stations[df_all_stations['Station'] == selected_soil_station]
#Forward-filling missing values (reassigning instead of filling a slice in place)
df_selected_station = df_selected_station.ffill()

# Indexes the data by day and creates a column that keeps track of the day
df_selected_station_D=df_selected_station.resample(rule='D').mean()
df_selected_station_D['Day'] = df_selected_station_D.index.dayofyear


#Assigning variable so that mean represents df_selected_station_D but indexed by day
mean=df_selected_station_D.groupby(by='Day').mean()
mean['Day']=mean.index

#This inserts a new column named 'Topsoil' at the end that represents the average between 5 cm, 10 cm, and 20 cm
mean['Topsoil']=mean[['5 cm', '10 cm','20 cm']].mean(axis=1)


#Creating the range in which the line graph will fill in 
maxx=df_selected_station_D.groupby(by='Day').max()
minn=df_selected_station_D.groupby(by='Day').min()

mean['maxx20']=maxx['20 cm']
mean['minn20']=minn['20 cm']


2 answers:

Answer 0: (score: 1)

If I understand your question correctly, this is how I would solve it:

topsoil = [-2.971686, -2.599278, -2.264897, -2.083117, -1.946969]

max_number = max(topsoil)
min_number = min(topsoil)
print(max_number)  # largest value in the topsoil list
print(min_number)  # smallest value in the topsoil list
print(max_number - min_number)  # range of the topsoil list (max minus min)

Solution
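Since the question works with a DataFrame column rather than a Python list, the same calculation can be sketched with a pandas Series. This is only an illustration: the Series below reuses the five sample values from this answer, not the asker's real data.

```python
import pandas as pd

# Same five sample values as above, held in a Series instead of a list.
topsoil = pd.Series(
    [-2.971686, -2.599278, -2.264897, -2.083117, -1.946969],
    name="Topsoil",
)

max_number = topsoil.max()  # largest value in the column
min_number = topsoil.min()  # smallest value in the column

print(max_number)
print(min_number)
print(max_number - min_number)  # range of the column (max minus min)
```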

Answer 1: (score: 1)

You probably need to add a 'Topsoil' column to the maxx and minn DataFrames as well:

maxx['Topsoil']=maxx[['5 cm', '10 cm','20 cm']].max(axis=1)
minn['Topsoil']=minn[['5 cm', '10 cm','20 cm']].min(axis=1)

After that, the assignment succeeds:

mean['topsoilMax']=maxx['Topsoil']
mean['topsoilMin']=minn['Topsoil']
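To see why the KeyError appears, here is a minimal self-contained sketch. It assumes synthetic random data in place of all-deep-soil-temperatures.csv (the `daily` frame and the `depths` list are made up for illustration). It shows that 'Topsoil' is created on mean only, so maxx and minn, which are built by separate groupby calls, never get it.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the daily station data (two non-leap years, made-up values).
rng = np.random.default_rng(0)
index = pd.date_range("2018-01-01", "2019-12-31", freq="D")
depths = ["5 cm", "10 cm", "20 cm"]
daily = pd.DataFrame(rng.normal(size=(len(index), 3)), index=index, columns=depths)
daily["Day"] = daily.index.dayofyear

# Three independent aggregations, as in the question.
mean = daily.groupby("Day").mean()
maxx = daily.groupby("Day").max()
minn = daily.groupby("Day").min()

# 'Topsoil' is added to `mean` only.
mean["Topsoil"] = mean[depths].mean(axis=1)
print("Topsoil" in mean.columns)  # True
print("Topsoil" in maxx.columns)  # False, so maxx['Topsoil'] raises KeyError

# Fix along the lines of the answer above: build the column on maxx and minn too.
maxx["Topsoil"] = maxx[depths].max(axis=1)
minn["Topsoil"] = minn[depths].min(axis=1)

# Both frames share the 'Day' index, so the assignment aligns row by row.
mean["topsoilMax"] = maxx["Topsoil"]
mean["topsoilMin"] = minn["Topsoil"]
```

Note that this range spans the extremes across all three depths. If the range of the averaged Topsoil value is what is wanted instead, one option is to compute the Topsoil column on the daily frame before calling groupby, so that mean, max, and min all include it.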