I have a df with the columns year, month, alltime, daytime, region, temp and precipitation, containing data for several years and regions. I would like to get a new df with year and region as columns, holding the mean temp and precipitation for specific months only (months 6, 7 and 8).

Sample data:
year month alltime daytime region temp precipitation
2000 1 True False saint louis 21.3105241935484 0.03
2000 1 False True saint louis 22.7246627565982 0.025
2000 1 False False saint louis 20.0136559139785 0.012
2000 2 True False saint louis 22.1646408045977 0.013
2000 2 False True saint louis 23.557868338558 0.07
2000 2 False False saint louis 20.8678927203065 0.012
2000 3 True False saint louis 22.9311155913978 0.031
2000 3 False True saint louis 24.9204398826979 0.016
2000 3 False False saint louis 21.011541218638 0.0121
2000 4 True False saint louis 22.5921805555556 0.019
2000 4 False True saint louis 24.3710303030303 0.054
2000 4 False False saint louis 20.8877777777778 0.043
2000 5 True False saint louis 21.4352016129032 0.032
2000 5 False True saint louis 22.8382404692082 0.023
But what I tried returned the mean over all 12 months, not just months 6, 7 and 8. The target layout is:
year region temp precipitation
2000 saint louis 22.123 321.23
2000 diff region 24.643 673.12
2001 saint louis 21.433 134.27
Answer 0 (score: 1)
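What you are missing is a subset of your data that contains only the months you need. You can create a new dataframe with the months you want to include and then use groupby.agg on it: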
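months = [6, 7, 8]  # integers, as in the sample data; use ['6', '7', '8'] if the column holds strings
# keep only the rows for the months of interest
subset = df[df['month'].isin(months)]
# aggregate the filtered frame (not the full df) per region and year
res = subset.groupby(['region', 'year']).agg({'temp': 'mean', 'precipitation': 'mean'}).reset_index()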
which gives you:
region year temp precipitation
0 saint louis 2000 22.259055 0.028007
1 saint louis 2001 22.838240 0.023000
FYI, I added some extra data to the sample, because the data you posted only has 1 year and 1 region.
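The two steps above (subset first, then group and average) can be sketched end to end as follows. This is only a minimal illustration: the column names follow the question, but the rows are made up, and the names summer_months, summer and summer_means are hypothetical. It assumes the month column holds integers.

```python
import pandas as pd

# Hypothetical miniature dataset: column names follow the question,
# the values are invented for illustration only.
df = pd.DataFrame({
    "year":   [2000, 2000, 2000, 2000, 2001, 2001],
    "month":  [5, 6, 7, 8, 6, 12],
    "region": ["saint louis"] * 6,
    "temp":   [10.0, 20.0, 22.0, 24.0, 30.0, 0.0],
    "precipitation": [0.5, 0.1, 0.2, 0.3, 0.4, 0.9],
})

summer_months = [6, 7, 8]

# Step 1: keep only the rows for the months of interest.
summer = df[df["month"].isin(summer_months)]

# Step 2: average per (year, region); as_index=False keeps the
# grouping keys as ordinary columns instead of an index.
summer_means = (
    summer.groupby(["year", "region"], as_index=False)[["temp", "precipitation"]]
    .mean()
)
print(summer_means)
```

The rows for months 5 and 12 never reach the groupby, so they cannot influence the means. Grouping the unfiltered df instead would average over every month present.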
Answer 1 (score: 1)
Use Boolean indexing to filter your dataframe by MONTH for the desired time period, then group by YEAR and REGION and aggregate the mean for your categories, e.g. TEMP and PRECIP:
import numpy as np
import pandas as pd

# Fake data generation
np.random.seed(1234)
n = 30
df = pd.DataFrame({
    "YEAR": np.random.choice([2000, 2001, 2003], n),
    "MONTH": np.random.randint(4, 10, n),
    "REGION": np.random.choice(["A", "B", "D"], n),
    "TEMP": 20 + 10 * np.random.random(n),
    "PRECIP": 200 + 100 * np.random.random(n),
    "OTHER": np.random.randint(1, 100, n),
})
weather = df.sort_values(["YEAR", "MONTH", "REGION"]).reset_index(drop=True)
# print(df)

# Keep months 6 to 8, then average TEMP and PRECIP per (YEAR, REGION)
summer = weather[(6 <= weather["MONTH"]) & (weather["MONTH"] <= 8)]
new_df = summer.groupby(["YEAR", "REGION"])[["TEMP", "PRECIP"]].mean()
print(new_df)
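The groupby above returns YEAR and REGION as a MultiIndex rather than as columns. If they are wanted as ordinary columns, as the question asks, one possible variation is sketched below. It swaps in Series.between for the two comparisons and uses named aggregation. The small weather frame and the names in_summer, summer_stats, mean_temp and mean_precip are made up for illustration.

```python
import pandas as pd

# Small deterministic stand-in for the random data above (values invented).
weather = pd.DataFrame({
    "YEAR":   [2000, 2000, 2000, 2001],
    "MONTH":  [6, 7, 9, 8],
    "REGION": ["A", "A", "A", "B"],
    "TEMP":   [20.0, 24.0, 99.0, 25.0],
    "PRECIP": [200.0, 300.0, 999.0, 250.0],
})

# Boolean mask for months 6 to 8; between() is inclusive on both ends by default.
in_summer = weather["MONTH"].between(6, 8)

summer_stats = (
    weather[in_summer]
    .groupby(["YEAR", "REGION"])
    # Named aggregation: output column name = (input column, function).
    .agg(mean_temp=("TEMP", "mean"), mean_precip=("PRECIP", "mean"))
    # Turn the (YEAR, REGION) index back into regular columns.
    .reset_index()
)
print(summer_stats)
```

The month 9 row is excluded by the mask, so its outlier values do not affect the means for 2000.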