Question

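I am working with a CSV that contains five years of traffic crash dates. I want to estimate the average number of crashes for each month. Here is my code: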

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.read_csv('other.csv')
df['Time'] = df['Crash_Date'].str[:-8] + ' ' + df['Crash_Time']
df['Time'] = pd.to_datetime(df['Time'])
df['Crash_Date'] = pd.to_datetime(df['Crash_Date'])
df = df[df.Crash_Date < '2018-01-01 00:00:00']
# Day_Number : Monday=0, Saturday=5, Sunday=6
df['Day_Number'] = df['Crash_Date'].dt.dayofweek
df = df[df.Sig_ID != 0]
# Function to estimate the average number of crashes for each month
def month_crash(x):
    t = 0
    for date in df['Crash_Date']:
        if date.month == x:
            t = t + 1
            y = t/5
    return y
# Create a DataFrame to save the result
month = []
newcrash = []
for i in range(1,13):
    month.append(i)
    newcrash.append(month_crash(i))

month_crash = pd.DataFrame(
    {'Month': month,
     'Crash': newcrash
    })

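This is my data: (image)

However, every time I run this code I get a "local variable 'y' referenced before assignment" error. I have tried this code on other crash datasets and it works fine, so I don't know where the problem is. Can anyone help me? Thank you very much!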
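One plausible reading of this error, offered as a sketch rather than a confirmed diagnosis: `y` is assigned only inside the `if` block, so if the filtered data contains no crash in some month, the loop never binds `y` and `return y` fails. That would also fit the observation that other datasets work, since they may have at least one crash in every month. The example below reproduces the behavior with a hypothetical list of dates (`crash_dates`, `YEARS`, and both function names are stand-ins, not taken from the original data) and shows a variant that computes the average after the loop.

```python
from datetime import date

# Hypothetical sample data: no crashes recorded in February.
crash_dates = [date(2016, 1, 5), date(2017, 1, 9), date(2016, 3, 2)]
YEARS = 5  # assumed number of years covered, as in the question


def month_crash_buggy(x):
    t = 0
    for d in crash_dates:
        if d.month == x:
            t = t + 1
            y = t / YEARS  # only runs when at least one date matches
    return y  # 'y' is unbound if no date matched month x


def month_crash_fixed(x):
    t = 0
    for d in crash_dates:
        if d.month == x:
            t += 1
    return t / YEARS  # always defined; 0.0 for a month with no crashes


try:
    month_crash_buggy(2)
except UnboundLocalError as exc:
    print("buggy version failed:", exc)

print(month_crash_fixed(1))  # 2 crashes over 5 years
print(month_crash_fixed(2))  # month with no crashes
```

If this is the cause, checking `df['Crash_Date'].dt.month.unique()` after the filters should show that at least one month is missing from the data.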

Local variable referenced before assignment, works fine on other datasets
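As an alternative to looping over every date once per month, the monthly averages could be computed with pandas directly. This is a minimal sketch under stated assumptions: the small `df` below is a made-up stand-in for the real CSV, `YEARS` is the assumed five-year span, and `monthly` is a hypothetical result name. Months with no rows are filled with 0 via `reindex`, which sidesteps the unassigned-variable problem.

```python
import pandas as pd

# Hypothetical stand-in for the 'Crash_Date' column of the real CSV.
df = pd.DataFrame({
    "Crash_Date": pd.to_datetime(["2016-01-05", "2017-01-09", "2016-03-02"])
})
YEARS = 5  # assumed span of the data, as in the question

# Count crashes per calendar month (1..12) across all years.
counts = df["Crash_Date"].dt.month.value_counts()

# reindex keeps all 12 months and fills months that never occur with 0.
avg = counts.reindex(range(1, 13), fill_value=0) / YEARS

monthly = pd.DataFrame({"Month": list(avg.index), "Crash": avg.to_numpy()})
print(monthly)
```

Using a separate name such as `monthly` for the result also avoids overwriting the `month_crash` function with a DataFrame.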

0 answers: